Auto-formalization on FormalMath-Lite
[Chart: Pass@8 by date. Highlighted point: LongCat-Flash-Prover, 99.8 Pass@8, Mar 22, 2026.]
Updated 26d ago
Evaluation Results
Method                              Date     Pass@8
LongCat-Flash-Prover (w/ TIR=true)  2026.03  99.8
LongCat-Flash-Prover                2026.03  98.6
Goedel-V2-Formalizer-8B             2026.03  98.1
Goedel-V2-Formalizer-32B            2026.03  98.1
Kimi-K2.5                           2026.03  97.9
Claude-Opus-4.5                     2026.03  97.9
Gemini-3 Pro                        2026.03  97.4
ATF-8B-Distilled                    2026.03  95.5
DeepSeek-V3.2                       2026.03  95.2
ATF-32B                             2026.03  94.3
StepFun-Formalizer-32B              2026.03  90.9
StepFun-Formalizer-7B               2026.03  88
Kimina-Autoformalizer-7B            2026.03  86.8