
ThmQA

Benchmarks

Task Name | Dataset Name | SOTA Result | Trend
--- | --- | --- | ---
math reasoning | ThmQA | Multi@5 Accuracy: 34.3 | 16
Theorem Question Answering | ThmQA | Pass@1: 26.12 | 15