📄 Abstract
The advent of single-cell Assay for Transposase-Accessible Chromatin using
sequencing (scATAC-seq) offers an innovative perspective for deciphering
regulatory mechanisms by assembling a vast repository of single-cell chromatin
accessibility data. While foundation models have achieved significant success
in single-cell transcriptomics, there is currently no foundation model for
scATAC-seq that supports zero-shot high-quality cell identification and
comprehensive multi-omics analysis simultaneously. Key challenges lie in the
high dimensionality and sparsity of scATAC-seq data, as well as the lack of a
standardized schema for representing open chromatin regions (OCRs). Here, we
present ChromFound, a foundation model tailored for scATAC-seq. ChromFound
utilizes a hybrid architecture and genome-aware tokenization to effectively
capture genome-wide long contexts and regulatory signals from dynamic chromatin
landscapes. Pretrained on 1.97 million cells from 30 tissues and 6 disease
conditions, ChromFound demonstrates broad applicability across 6 diverse tasks.
Notably, it achieves robust zero-shot performance in generating universal cell
representations and exhibits excellent transferability in cell type annotation
and cross-omics prediction. By uncovering enhancer-gene links undetected by
existing computational methods, ChromFound offers a promising framework for
understanding disease risk variants in the noncoding genome.
Authors (12)
Yifeng Jiao
Yuchen Liu
Yu Zhang
Xin Guo
Yushuai Wu
Chen Jiang
+6 more
Key Contributions
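ChromFound is a foundation model for single-cell ATAC-seq data, a modality for which, according to the authors, no existing foundation model supports both high-quality zero-shot cell identification and comprehensive multi-omics analysis. It combines a hybrid architecture with genome-aware tokenization to capture genome-wide long contexts and regulatory signals. Pretrained on 1.97 million cells from 30 tissues and 6 disease conditions, it is evaluated on 6 tasks, including zero-shot generation of universal cell representations, cell type annotation, and cross-omics prediction.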
Business Value
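Could accelerate biological research and drug discovery by improving the analysis of single-cell chromatin accessibility data, for example by linking enhancers to genes and helping interpret disease risk variants in the noncoding genome, which may in turn point to new therapeutic targets and more personalized treatments.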