
Aggregate Model Evaluation

Benchmarks

Task Name | Dataset Name | SOTA Result | Trend
General Model Performance Evaluation | Aggregate Model Evaluation (16 benchmarks) | Average Score: 45.8 | 2