
Fermi

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
| --- | --- | --- | --- |
| Fermi Problem Solving | Fermi | Pass@1 Accuracy: 43.51 | 24 |
| Multi-hop Open-domain Question Answering | Fermi | Accuracy: 65.1 | 6 |
| Question Answering | FERMI (test) | Accuracy: 40.1 | 4 |