MolChord: Structure-Sequence Alignment for Protein-Guided Drug Design

Abstract

Structure-based drug design (SBDD), which maps target proteins to candidate molecular ligands, is a fundamental task in drug discovery. Effectively aligning protein structural representations with molecular representations, and ensuring that generated drugs match their intended pharmacological properties, remain critical challenges. To address them, we propose MolChord, which integrates two key techniques: (1) to align protein and molecule structures with their textual descriptions and sequential representations (e.g., FASTA for proteins and SMILES for molecules), we use NatureLM, an autoregressive model unifying text, small molecules, and proteins, as the molecule generator, alongside a diffusion-based structure encoder; and (2) to guide molecules toward desired properties, we curate a property-aware dataset by integrating preference data and refine the alignment process using Direct Preference Optimization (DPO). Experimental results on CrossDocked2020 demonstrate that our approach achieves state-of-the-art performance on key evaluation metrics, highlighting its potential as a practical tool for SBDD.
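For readers unfamiliar with the two sequence formats the abstract names, the following is a minimal sketch of what they look like in practice. The protein record is made up for illustration, aspirin is used only as a familiar example ligand, and `parse_fasta` is a hypothetical helper; none of these come from the paper.

```python
def parse_fasta(text):
    """Parse FASTA text into a {header: sequence} dict.

    A header line starts with '>'; the lines that follow it are
    concatenated into a single amino-acid sequence.
    """
    records = {}
    header = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            header = line[1:]
            records[header] = ""
        elif header is not None:
            records[header] += line
    return records


# Hypothetical protein record (illustrative only, not from the paper).
EXAMPLE_FASTA = """>example_protein hypothetical sequence
MKTAYIAKQR
QISFVKSHFS
"""

# SMILES string for aspirin, used here only as a familiar example ligand.
ASPIRIN_SMILES = "CC(=O)OC1=CC=CC=C1C(=O)O"

proteins = parse_fasta(EXAMPLE_FASTA)
for name, sequence in proteins.items():
    print(name, len(sequence), "residues")
print("ligand:", ASPIRIN_SMILES)
```

Both formats are plain strings, which is what allows a single autoregressive model to treat proteins and molecules as token sequences alongside text.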
Authors (7): Wei Zhang, Zekun Guo, Yingce Xia, Peiran Jin, Shufang Xie, Tao Qin, +1 more
Submitted: October 31, 2025
arXiv Category: cs.AI

Key Contributions

MolChord pairs NatureLM, used as the molecule generator, with a diffusion-based structure encoder to align protein and molecule structures with their sequential representations (FASTA for proteins, SMILES for molecules). It then applies DPO on a curated, property-aware preference dataset to steer generated molecules toward desired pharmacological properties. Together, these two steps target the two challenges the paper identifies for SBDD: structure-sequence alignment and property alignment.
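To make the DPO step concrete, here is a minimal sketch of the standard DPO objective for a single preference pair. It is not the paper's implementation: the summary does not say how the preference pairs are built or which hyperparameters are used, so the `beta` value and the log-probabilities below are assumptions chosen for illustration.

```python
import math


def dpo_loss(policy_logp_chosen, policy_logp_rejected,
             ref_logp_chosen, ref_logp_rejected, beta=0.1):
    """Standard DPO loss for one (chosen, rejected) pair.

    Each argument is the total log-probability that the policy or the
    frozen reference model assigns to a sequence (for example, a SMILES
    string). beta=0.1 is an assumed value, not one taken from the paper.
    """
    # Implicit reward margin: how much more the policy favors the chosen
    # sequence over the rejected one, relative to the reference model.
    margin = beta * ((policy_logp_chosen - ref_logp_chosen)
                     - (policy_logp_rejected - ref_logp_rejected))
    # -log(sigmoid(margin)) equals log(1 + exp(-margin)); the two branches
    # evaluate it without overflowing for large |margin|.
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


# Hypothetical log-probabilities for a preferred and a dispreferred molecule.
loss_aligned = dpo_loss(-1.0, -5.0, -2.0, -2.0)     # policy favors the chosen one
loss_misaligned = dpo_loss(-5.0, -1.0, -2.0, -2.0)  # policy favors the rejected one
print(loss_aligned < loss_misaligned)
```

The loss falls as the policy raises the likelihood of the preferred molecule relative to the reference model, which is how preference data can push generation toward desired properties without training a separate reward model.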

Business Value

Could accelerate drug discovery by generating novel ligand candidates with desired properties directly from a target protein's structure, potentially reducing R&D costs and time-to-market for new therapeutics.