Redirecting to original paper in 30 seconds...
Click below to go immediately or wait for automatic redirect
📄 Abstract
Abstract: Large language models have demonstrated remarkable reasoning capabilities
across diverse natural language tasks. However, comparable breakthroughs in
scientific discovery are more limited, because understanding complex physical
phenomena demands multifaceted representations far beyond language alone. A
compelling example is the design of functional materials such as MOFs-critical
for a range of impactful applications like carbon capture and hydrogen storage.
Navigating their vast and intricate design space in language-based
representations interpretable by LLMs is challenging due to the numerous
possible three-dimensional atomic arrangements and strict reticular rules of
coordination geometry and topology. Despite promising early results in
LLM-assisted discovery for simpler materials systems, MOF design remains
heavily reliant on tacit human expertise rarely codified in textual information
alone. To overcome this barrier, we introduce L2M3OF, the first multimodal LLM
for MOFs. L2M3OF integrates crystal representation learning with language
understanding to process structural, textual, and knowledge modalities jointly.
L2M3OF employs a pre-trained crystal encoder with a lightweight projection
layer to compress structural information into a token space, enabling efficient
alignment with language instructions. To facilitate training and evaluation, we
curate a structure-property-knowledge database of crystalline materials and
benchmark L2M3OF against state-of-the-art closed-source LLMs such as GPT-5,
Gemini-2.5-Pro and DeepSeek-R1. Experiments show that L2M3OF outperforms
leading text-based closed-source LLMs in property prediction and knowledge
generation tasks, despite using far fewer parameters. These results highlight
the importance of multimodal approaches for porous material understanding and
establish L2M3OF as a foundation for next-generation AI systems in materials
discovery.
Authors (7)
Jiyu Cui
Fang Wu
Haokai Zhao
Minggao Feng
Xenophon Evangelopoulos
Andrew I. Cooper
+1 more
Submitted
October 23, 2025
Key Contributions
Introduces L2M3OF, the first multimodal LLM specifically designed for Metal-Organic Frameworks (MOFs). This model aims to overcome the limitations of language-only representations in scientific discovery by integrating diverse data modalities to navigate the complex design space of MOFs, which are critical for applications like carbon capture and hydrogen storage.
Business Value
Accelerates the discovery and design of novel MOFs, which are crucial for developing advanced materials for energy storage, carbon capture, and catalysis, leading to significant advancements in sustainability and industrial processes.