
FLenQA

Benchmarks

Task Name    Dataset Name          SOTA Result      Trend
Reasoning    FLenQA 1000 tokens    Accuracy 78.5    15
Reasoning    FLenQA 500 tokens     Accuracy 74      15
Reasoning    FLenQA 250 tokens     Accuracy 80      15
Reasoning    FLenQA 3000 tokens    Accuracy 39.3    9
Reasoning    FLenQA 2000 tokens    Accuracy 52.5    9