
Aggregated LLM Evaluation on 8 Standard Benchmarks Aggregate

Average Accuracy: 73.7

Full model

[Chart: Average Accuracy by date; y-axis ticks at 57.892, 61.996, 66.1, 70.204; x-axis label May 1, 2026]
Updated 1mo ago

Evaluation Results

Date      Average Accuracy
2026.05   73.7
2026.05   71
2026.05   66.5
2026.05   60.5
2026.05   58.5