Our new X account is live! Follow @wizwand_team for updates
WorkDL logo mark

Lean Finder: Semantic Search for Mathlib That Understands User Intents

About

We present Lean Finder, a semantic search engine for Lean and mathlib that understands and aligns with the intents of mathematicians. Progress in formal theorem proving is often hindered by the difficulty of locating relevant theorems and the steep learning curve of the Lean 4 language, making advancement slow and labor-intensive. Existing Lean search engines, though helpful, rely primarily on informalizations (natural language translation of the formal statements), while largely overlooking the mismatch with real-world user queries. In contrast, we propose a user-centered semantic search tailored to the needs of mathematicians. Our approach begins by analyzing and clustering the semantics of public Lean discussions, then fine-tuning text embeddings on synthesized queries that emulate user intents. We further align Lean Finder with mathematicians' preferences using diverse feedback signals, encoding it with a rich awareness of their goals from multiple perspectives. Evaluations on real-world queries, informalized statements, and proof states demonstrate that our Lean Finder achieves over $30\%$ relative improvement compared to previous search engines and GPT-4o. In addition, Lean Finder is compatible with LLM-based theorem provers, bridging retrieval with formal reasoning. Lean Finder is available at: https://leanfinder.github.io

Jialin Lu, Kye Emond, Kaiyu Yang, Swarat Chaudhuri, Weiran Sun, Wuyang Chen• 2025

Related benchmarks

TaskDatasetResultRank
Theorem RetrievalAugmented Proof State
R@10.246
5
Theorem RetrievalRaw Proof State
Recall@18.3
5
Theorem RetrievalLean Informalized Statement
R@164.2
4
Theorem RetrievalLean Synthetic User Query (Sec. 3.1)
Recall@154.4
4
Theorem RetrievalLean Augmented Statement
R@182.7
4
Formal Mathematics RetrievalUser Study Real-world Queries
R@1139
3
RetrievalLean Retrieval Informalized Statement modality (test)
R@164.2
3
RetrievalLean Retrieval Synthetic User Query modality (test)
R@154.4
3
RetrievalLean Retrieval Augmented Statement modality (test)
R@10.827
3
RetrievalAugmented Proof State (test)
R@124.6
2
Showing 10 of 14 rows

Other info

Follow for update