
JEEBench

Benchmarks

Task Name             Dataset Name  SOTA Result           Trend
Math reasoning        JEEBench      Accuracy: 74.4        82
Downstream Accuracy   JEEBench      Accuracy: 44.8        12
Scientific Reasoning  JEEBench      Mean Accuracy: 52.28  2