
WildBench

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Instruction Following | WildBench (test) | Info Seek: 58.6 | 27 |
| General Instruction Following | WildBench | Score: 92.6 | 19 |
| General chat | WildBench 2025 (test) | WB-Elo: 1,062.4 | 12 |
| Subjective Evaluation | WildBench | Score: 0.8604 | 5 |
| Open-ended text generation | WildBench | Score: -1.7 | 4 |
| General Language Model Evaluation | WildBench | WildBench Score: 26.95 | 2 |