
Capability Benchmarks

Benchmarks

Task Name                      Dataset Name                       SOTA Result                Trend
General Capability             8 capability benchmarks Aggregate  Average Capability: 67.14  26
General Capability Evaluation  Capability Benchmarks              Score: 74.32               10