Machine Learning on Timely-Eval
[Chart: Leaf Classification Accuracy. Top result: 0.939, TimelyLM-8B; data point dated Jan 23, 2026. Metrics shown: Leaf Classification Accuracy, Space-ship Accuracy, Pizza Accuracy, Insult Detection Accuracy.]
Updated 4d ago
Evaluation Results
| Method | Variant / size | Date | Leaf Classification Accuracy | Space-ship Accuracy | Pizza Accuracy | Insult Detection Accuracy |
|---|---|---|---|---|---|---|
| TimelyLM-8B | size=8B | 2026.01 | 0.939 | 0.744 | 0.761 | 0.84 |
| DeepSeek-V3.2 | | 2026.01 | 0.72 | 0.681 | 0.686 | 0.78 |
| Gemini2.5-pro | variant=pro | 2026.01 | 0.714 | 0.654 | 0.746 | 0.751 |
| GPT-5.1(medium) | variant=medium | 2026.01 | 0.675 | 0.635 | 0.58 | 0.79 |
| Qwen3-32B | size=32B | 2026.01 | 0.518 | 0.501 | 0.59 | 0.572 |
| Qwen3-14B | size=14B | 2026.01 | 0.464 | 0.559 | 0.607 | 0.706 |
| Qwen3-8B | size=8B | 2026.01 | 0.32 | 0.178 | 0.286 | 0.519 |