
Riddle

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
| --- | --- | --- | --- |
| Question Answering | Riddle (test) | Accuracy 67.84 | 45 |
| Multiple Choice Question Answering | Riddle | Accuracy 76.62 | 24 |
| Multiple Choice Question Answering | Riddle (test) | Accuracy 67.84 | 21 |
| Logic Reasoning | Riddle 1.0 (test) | F1 Score 69 | 7 |
| Commonsense Reasoning | Riddle | Accuracy 71.3 | 4 |