
Reasoning Tasks

Benchmarks

Task Name | Dataset Name | SOTA Result | Trend
Reasoning | Reasoning Tasks Average | Average Score: 68.6 | 32
Zero-shot Evaluation | Reasoning tasks | Reasoning Accuracy: 70.7 | 7