Decision Making on Decision Making (ID)
[Chart: Accuracy and Success Rate. Best Accuracy: 47.9 (TIMEOMNI-1, Sep 29, 2025)]
Updated 4d ago
Evaluation Results
| Method | Base LLM | Date | Accuracy | Success Rate |
| --- | --- | --- | --- | --- |
| TIMEOMNI-1 | Qwen2.5-Instr... | 2025.09 | 47.9 | 100 |
| Mistral-Small-3.1-24B-Ins | | 2025.09 | 44.7 | 100 |
| GPT-4.1-Nano | | 2025.09 | 28.9 | 99.5 |
| Time-R1 | Qwen2.5-Instr... | 2025.09 | 27.8 | 95.7 |
| GPT-4.1-2025-04-14 | | 2025.09 | 25.5 | 100 |
| Qwen2.5-Instruct-7B | | 2025.09 | 25.5 | 100 |
| Mistral-7B-v0.3 | | 2025.09 | 24.3 | 94.2 |
| Time-MQA | Qwen2.5-7B | 2025.09 | 23.8 | 58 |
| Llama-3.1-70B-Instruct | | 2025.09 | 20.3 | 96.8 |
| Time-MQA | Llama3-8B | 2025.09 | 12 | 13.3 |
| Llama-3.1-8B-Instruct | | 2025.09 | 7.4 | 28.7 |
| ChatTS | | 2025.09 | 5.8 | 27.1 |
| Time-MQA | Mistral-7B-v0.3 | 2025.09 | 5.4 | 36.1 |