Abstract
Motivation: Molecular core structures and R-groups are essential concepts in drug development. Integration of these concepts with conventional graph pre-training approaches can promote deeper understanding in molecules. We propose MolPLA, a novel pre-training framework that employs masked graph contrastive learning in understanding the underlying decomposable parts in molecules that implicate their core structure and peripheral R-groups. Furthermore, we formulate an additional framework that grants MolPLA the ability to help chemists find replaceable R-groups in lead optimization scenarios. Results: Experimental results on molecular property prediction show that MolPLA exhibits predictability comparable to current state-of-the-art models. Qualitative analysis implicate that MolPLA is capable of distinguishing core and R-group sub-structures, identifying decomposable regions in molecules and contributing to lead optimization scenarios by rationally suggesting R-group replacements given various query core templates.
Original language | English |
---|---|
Pages (from-to) | i369-i380 |
Journal | Bioinformatics |
Volume | 40 |
DOIs | |
Publication status | Published - 2024 Jul 1 |
Bibliographical note
Publisher Copyright:© The Author(s) 2024. Published by Oxford University Press.
ASJC Scopus subject areas
- Statistics and Probability
- Biochemistry
- Molecular Biology
- Computer Science Applications
- Computational Theory and Mathematics
- Computational Mathematics