Redirecting to original paper in 30 seconds...
Click below to go immediately or wait for automatic redirect
📄 Abstract
Abstract: Mathematical reasoning is central to artificial intelligence, with
applications in education, code generation, and research-level mathematical
discovery. Mathematical competitions highlight two problem types: theorem
proving, requiring rigorous proofs, and answer construction, requiring creative
generation and formal verification of mathematical objects. Existing research
reveals that LLMs can tackle difficult answer-construction tasks but are prone
to errors from hallucinations and unverifiable steps, while symbolic methods
guarantee rigor but falter in creative answer construction. This raises a key
understudied question: how to solve answer-construction problems while
preserving both LLM creativity and mathematical rigor? To address this problem,
we introduce the Enumerate-Conjecture-Prove (ECP) framework, a modular
neuro-symbolic method integrating LLM-based enumeration and pattern-driven
conjecturing with formal theorem proving in Lean, and ConstructiveBench, a
dataset of 3,640 formal answer-construction problems from math competitions.
ECP is model agnostic and shows consistent improvements over pure LLM
baselines: on the subset of PutnamBench for answer construction, ECP formally
solves 6 out of 337 answer-construction problems end to end (up from 4 without
ECP) using GPT-5 mini and DeepSeek-Prover-V2-7B. On ConstructiveBench, ECP
achieves 33.1% end-to-end state-of-the-art accuracy (up from 32.5%),
demonstrating its potential to advance formal mathematical reasoning by
combining LLM conjecturing with formal verification. Our code and dataset are
publicly available at GitHub (https://github.com/sunjia72/ECP) and Hugging Face
(https://huggingface.co/datasets/sunjia72/ConstructiveBench).
Authors (5)
Jialiang Sun
Yuzhi Tang
Ao Li
Chris J. Maddison
Kuldeep S. Meel
Key Contributions
Introduces the Enumerate-Conjecture-Prove (ECP) framework, a novel neuro-symbolic method that combines LLM creativity for enumeration and conjecturing with the rigor of formal theorem proving in Lean. This framework addresses the challenge of solving answer-construction problems in mathematics while preserving both LLM capabilities and formal guarantees, supported by the new ConstructiveBench dataset.
Business Value
Enhances AI's capability in formal reasoning and creative problem-solving, potentially accelerating mathematical discovery, improving educational tools, and advancing AI for scientific research.