Abstract
Pre-trained chemical language models (CLMs) excel at molecular property prediction, using string-based molecular descriptors such as SMILES to learn universal representations. However, such string-based descriptors carry only limited, implicit structural information, even though molecular structure is closely tied to molecular properties. In this work, we introduce Moleco, a novel contrastive learning framework that enhances the understanding of molecular structures within CLMs. Based on the similarity of fingerprint vectors among different molecules, we train CLMs to distinguish structurally similar molecules from dissimilar ones in a contrastive manner. Experimental results demonstrate that Moleco significantly improves the molecular property prediction performance of CLMs, outperforming state-of-the-art models.
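The idea of contrasting molecules by fingerprint similarity can be illustrated with a minimal sketch. The abstract does not specify the fingerprint type, the similarity measure, the threshold, or the loss, so everything below is an assumption for illustration only: fingerprints are modelled as sets of on-bit indices, similarity is Tanimoto, a hypothetical threshold of 0.5 separates similar from dissimilar molecules, and the loss is a generic InfoNCE-style objective over toy embedding vectors. It is not the authors' implementation.

```python
import math


def tanimoto(fp_a, fp_b):
    """Tanimoto (Jaccard) similarity of two fingerprints given as sets of on-bit indices."""
    union = fp_a | fp_b
    if not union:
        return 0.0
    return len(fp_a & fp_b) / len(union)


def split_by_similarity(anchor_fp, candidate_fps, threshold=0.5):
    """Split candidate indices into similar and dissimilar groups.

    The threshold is a hypothetical choice, not a value from the paper.
    """
    similar, dissimilar = [], []
    for i, fp in enumerate(candidate_fps):
        if tanimoto(anchor_fp, fp) >= threshold:
            similar.append(i)
        else:
            dissimilar.append(i)
    return similar, dissimilar


def cosine(u, v):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def contrastive_loss(anchor, positives, negatives, temperature=0.1):
    """Generic InfoNCE-style loss (an assumed stand-in for the paper's objective).

    The loss is small when the anchor embedding is closer to the positive
    embeddings than to the negative ones.
    """
    if not positives:
        raise ValueError("at least one positive embedding is required")
    pos = [math.exp(cosine(anchor, p) / temperature) for p in positives]
    neg = [math.exp(cosine(anchor, n) / temperature) for n in negatives]
    denom = sum(pos) + sum(neg)
    return -sum(math.log(p / denom) for p in pos) / len(pos)


if __name__ == "__main__":
    # Toy fingerprints: the first candidate shares most bits with the anchor.
    anchor_fp = {1, 2, 3, 4}
    candidate_fps = [{1, 2, 3, 5}, {7, 8, 9}]
    similar, dissimilar = split_by_similarity(anchor_fp, candidate_fps)

    # Toy embeddings standing in for CLM outputs.
    anchor_emb = [1.0, 0.0]
    candidate_embs = [[0.9, 0.1], [0.0, 1.0]]
    loss = contrastive_loss(
        anchor_emb,
        [candidate_embs[i] for i in similar],
        [candidate_embs[i] for i in dissimilar],
    )
    print(similar, dissimilar, loss)
```

In a real pipeline the fingerprints would come from a cheminformatics toolkit and the embeddings from the CLM encoder, with the loss backpropagated through the model; the pure-Python version here only shows how fingerprint similarity can define the positive and negative sets.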
| Original language | English |
|---|---|
| Title of host publication | EMNLP 2024 - 2024 Conference on Empirical Methods in Natural Language Processing, Proceedings of the Industry Track |
| Editors | Franck Dernoncourt, Daniel Preotiuc-Pietro, Anastasia Shimorina |
| Publisher | Association for Computational Linguistics (ACL) |
| Pages | 408-420 |
| Number of pages | 13 |
| ISBN (Electronic) | 9798891761667 |
| Publication status | Published - 2024 |
| Event | 2024 Conference on Empirical Methods in Natural Language Processing, EMNLP 2024 - Hybrid, Miami, United States; Duration: 2024 Nov 12 → 2024 Nov 16 |
Publication series
| Name | EMNLP 2024 - 2024 Conference on Empirical Methods in Natural Language Processing, Proceedings of the Industry Track |
|---|---|
Conference
| Conference | 2024 Conference on Empirical Methods in Natural Language Processing, EMNLP 2024 |
|---|---|
| Country/Territory | United States |
| City | Hybrid, Miami |
| Period | 2024 Nov 12 → 2024 Nov 16 |
Bibliographical note
Publisher Copyright: © 2024 Association for Computational Linguistics.
ASJC Scopus subject areas
- Computational Theory and Mathematics
- Computer Science Applications
- Information Systems
- Linguistics and Language
Fingerprint
Dive into the research topics of 'Moleco: Molecular Contrastive Learning with Chemical Language Models for Molecular Property Prediction'. Together they form a unique fingerprint.