MolPLA: a molecular pretraining framework for learning cores, R-groups and their linker joints

Mogan Gim, Jueon Park, Soyon Park, Sanghoon Lee, Seungheun Baek, Junhyun Lee, Ngoc Quang Nguyen, Jaewoo Kang

    Research output: Contribution to journal › Article › peer-review

    Abstract

    Motivation: Molecular core structures and R-groups are essential concepts in drug development. Integrating these concepts with conventional graph pre-training approaches can promote a deeper understanding of molecules. We propose MolPLA, a novel pre-training framework that employs masked graph contrastive learning to understand the underlying decomposable parts of molecules that correspond to their core structure and peripheral R-groups. Furthermore, we formulate an additional framework that enables MolPLA to help chemists find replaceable R-groups in lead optimization scenarios. Results: Experimental results on molecular property prediction show that MolPLA achieves predictive performance comparable to current state-of-the-art models. Qualitative analysis indicates that MolPLA is capable of distinguishing core and R-group substructures, identifying decomposable regions in molecules, and contributing to lead optimization scenarios by rationally suggesting R-group replacements given various query core templates.

    Original language: English
    Pages (from-to): i369-i380
    Journal: Bioinformatics
    Volume: 40
    Publication status: Published - 2024 Jul 1

    Bibliographical note

    Publisher Copyright:
    © The Author(s) 2024. Published by Oxford University Press.

    ASJC Scopus subject areas

    • Statistics and Probability
    • Biochemistry
    • Molecular Biology
    • Computer Science Applications
    • Computational Theory and Mathematics
    • Computational Mathematics
