Abstract
Generative models have become significant assets in the exploration and identification of new materials, enabling the rapid proposal of candidate crystal structures that satisfy target properties. Despite the increasing adoption of diverse architectures, a rigorous comparative evaluation of their performance on materials datasets is lacking. In this work, we present a systematic benchmark of three representative generative models: AtomGPT (a transformer-based model), the Crystal Diffusion Variational Autoencoder (CDVAE), and FlowMM (a Riemannian flow matching model). These models were trained to reconstruct crystal structures from subsets of two publicly available superconductivity datasets: JARVIS Supercon 3D and DS A/B from the Alexandria database. Performance was assessed using the Kullback-Leibler (KL) divergence between predicted and reference distributions of lattice parameters, as well as the mean absolute error (MAE) of individual lattice constants. On both the KL divergence and MAE scores, CDVAE performs most favorably, followed by AtomGPT and then FlowMM. All benchmarking code and model configurations will be made publicly available at https://github.com/atomgptlab/atombench_inverse.
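The two metrics named in the abstract are standard and easy to reproduce. Below is a minimal sketch, not taken from the AtomBench repository, of how a histogram-based KL divergence between reference and generated lattice-parameter distributions and an MAE over paired lattice constants might be computed; the function names, bin count, and synthetic data are illustrative assumptions, and the paper's actual binning and pairing conventions may differ.

```python
import numpy as np
from scipy.stats import entropy

def kl_divergence(reference, generated, bins=50):
    """Histogram-based D_KL(P_ref || P_gen) for one lattice parameter (e.g., a, b, c)."""
    lo = min(reference.min(), generated.min())
    hi = max(reference.max(), generated.max())
    p, _ = np.histogram(reference, bins=bins, range=(lo, hi), density=True)
    q, _ = np.histogram(generated, bins=bins, range=(lo, hi), density=True)
    eps = 1e-10  # avoid log(0) and division by zero in empty bins
    return entropy(p + eps, q + eps)  # scipy normalizes and sums p * log(p / q)

def mean_absolute_error(reference, generated):
    """MAE between paired reference and predicted lattice constants."""
    return np.abs(np.asarray(reference) - np.asarray(generated)).mean()

# Example with synthetic lattice constants in angstroms (illustrative only)
rng = np.random.default_rng(0)
a_ref = rng.normal(5.0, 0.5, size=1000)
a_gen = rng.normal(5.2, 0.6, size=1000)
print("KL divergence:", kl_divergence(a_ref, a_gen))
print("MAE:", mean_absolute_error(a_ref, a_gen))
```

In this sketch the KL divergence compares the overall shape of the two distributions (it needs no one-to-one pairing of structures), while the MAE requires each generated structure to be matched to its reference counterpart, which is why the two scores can rank models differently.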
Authors (3)
Charles Rhys Campbell
Aldo H. Romero
Kamal Choudhary
Submitted
October 17, 2025
Key Contributions
This paper presents AtomBench, a systematic benchmark for evaluating generative atomic structure models, including GPT, Diffusion, and Flow architectures. It addresses the lack of rigorous comparative evaluation by assessing performance on superconductivity datasets using KL divergence and MAE, providing insights into the strengths and weaknesses of different generative approaches for materials discovery.
Business Value
Accelerates the discovery of new materials with desired properties, potentially leading to breakthroughs in areas like superconductivity, energy storage, and catalysis, by providing a standardized way to evaluate and select the best generative models.