📄 Abstract
Trustworthy language models should provide both correct and verifiable
answers. However, citations generated directly by standalone LLMs are often
unreliable. As a result, current systems insert citations by querying an
external retriever at inference time, introducing latency, infrastructure
dependence, and vulnerability to retrieval noise. We explore whether LLMs can
be made to reliably attribute to the documents seen during continual
pretraining without test-time retrieval, by revising the training process. To
study this, we construct CitePretrainBench, a benchmark that mixes real-world
corpora (Wikipedia, Common Crawl, arXiv) with novel documents and probes both
short-form (single-fact) and long-form (multi-fact) citation tasks. Our
approach follows a two-stage process: (1) continual pretraining to index
factual knowledge by binding it to persistent document identifiers; and (2)
instruction tuning to elicit citation behavior. We introduce Active Indexing
for the first stage, which creates generalizable, source-anchored bindings by
augmenting training with synthetic data that (i) restates each fact in diverse,
compositional forms and (ii) enforces bidirectional training (source-to-fact and
fact-to-source). This equips the model to both generate content from a cited
source and attribute its own answers, improving robustness to paraphrase and
composition. Experiments with Qwen-2.5-7B and Qwen-2.5-3B show that Active Indexing
consistently outperforms a Passive Indexing baseline, which simply appends an
identifier to each document, achieving citation precision gains of up to 30.2%
across all tasks and models. Our ablation studies reveal that performance
continues to improve as we scale the amount of augmented data, showing a clear
upward trend even at 16x the original token count. Finally, we show that
internal citations complement external ones by making the model more robust to
retrieval noise.
Authors (5)
Yukun Huang
Sanxing Chen
Jian Pei
Manzil Zaheer
Bhuwan Dhingra
Key Contributions
This paper proposes a retrieval-free method for knowledge attribution in LLMs, enabling them to reliably cite sources seen during continual pretraining without requiring test-time retrieval. It introduces a two-stage process: continual pretraining with 'Active Indexing' to bind factual knowledge to document identifiers, followed by instruction tuning to elicit citation behavior, aiming to produce correct and verifiable answers.
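The bidirectional binding at the heart of Active Indexing can be sketched as follows. This is an illustrative toy, not the paper's implementation: the paraphrase function is a trivial stand-in for an LLM rewriter, and the identifier format and prompt templates are hypothetical.

```python
def paraphrase(fact: str) -> str:
    """Stand-in for an LLM-based rewriter that restates a fact in a new form."""
    return f"Put differently, {fact[0].lower()}{fact[1:]}"


def make_bidirectional_examples(doc_id: str, fact: str) -> list[str]:
    """Bind a fact to its persistent document identifier in both directions."""
    restated = paraphrase(fact)
    # Source-to-fact: the model learns to generate content given a cited source.
    forward = f"Document [{doc_id}] states: {restated}"
    # Fact-to-source: the model learns to attribute a claim back to its source.
    backward = f"{restated} (source: [{doc_id}])"
    return [forward, backward]


examples = make_bidirectional_examples(
    "doc-0042", "The Eiffel Tower is 330 metres tall."
)
for ex in examples:
    print(ex)
```

In training, each fact would be restated many times in diverse, compositional forms, so the identifier binding generalizes beyond one surface wording rather than memorizing a single string.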
Business Value
Increases the trustworthiness and reliability of LLM-generated content, crucial for applications requiring factual accuracy and verifiability, such as research, journalism, and legal services.