📄 Abstract
Tandem mass spectrometry enables the identification of unknown compounds in
crucial fields such as metabolomics, natural product discovery and
environmental analysis. However, current methods rely on database matching from
previously observed molecules, or on multi-step pipelines that require
intermediate fragment or fingerprint prediction. This makes finding the correct
molecule highly challenging, particularly for compounds absent from reference
databases. We introduce a framework that, by leveraging test-time tuning,
enhances the learning of a pre-trained transformer model to address this gap,
enabling end-to-end de novo molecular structure generation directly from the
tandem mass spectra and molecular formulae, bypassing manual annotations and
intermediate steps. We surpass the de facto state-of-the-art approach, DiffMS, on
two popular benchmarks, NPLIB1 and MassSpecGym, by 100% and 20%, respectively.
Test-time tuning on experimental spectra allows the model to dynamically adapt
to novel spectra, and the relative performance gain over conventional
fine-tuning is 62% on MassSpecGym. When predictions deviate from the ground
truth, the generated molecular candidates remain structurally accurate,
providing valuable guidance for human interpretation and more reliable
identification.
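
As a reading aid for the percentages above, the short sketch below shows how a relative gain is computed. It assumes the 100% and 20% figures are relative improvements over the baseline score (the abstract says so explicitly only for the 62% figure). The function name `relative_gain` and the numeric scores are hypothetical placeholders chosen for the arithmetic, not results from the paper.

```python
def relative_gain(new_score: float, baseline_score: float) -> float:
    """Relative improvement of new_score over baseline_score, as a fraction."""
    if baseline_score == 0:
        raise ValueError("baseline_score must be non-zero")
    return (new_score - baseline_score) / baseline_score


# Hypothetical scores, for illustration only (not taken from the paper).
doubled = relative_gain(10.0, 5.0)   # a 100% relative gain means the score doubled
modest = relative_gain(6.0, 5.0)     # a 20% relative gain

print(f"{doubled:.0%}, {modest:.0%}")
```

Under this reading, a 100% gain means the metric doubled, which is a large relative change even when the absolute score is small.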
Authors (4)
Laura Mismetti
Marvin Alberts
Andreas Krause
Mara Graziani
Submitted
October 27, 2025
Key Contributions
This paper introduces a framework for end-to-end de novo molecular structure generation from tandem mass (MS/MS) spectra and molecular formulae, using test-time tuning (TTT) of a pre-trained transformer model. The approach bypasses manual annotations and intermediate fragment or fingerprint prediction, and it surpasses DiffMS, the de facto state of the art, on the NPLIB1 and MassSpecGym benchmarks.
Business Value
Accelerates drug discovery and natural product identification by enabling faster and more accurate elucidation of unknown molecular structures from spectral data.