General Reasoning Suite

Benchmarks

Task Name          Dataset Name                                                        SOTA Result   Trend
General Reasoning  General Reasoning Suite Average                                     Pass@1: 78.3  63
General Reasoning  General Reasoning Suite (MMLU Pro, Super GPQA, GPQA Diamond, BBEH)  MMLU Pro: 84  35