Causality Discovery OOD Evaluation
[Chart: Success Rate by date. Highlighted entry: GPT-4.1-2025-04-14, Success Rate 100, Sep 29, 2025. Available metrics: Success Rate, ACC. Updated 4d ago.]
Evaluation Results
| Method | Base LLM | Date | Success Rate | ACC |
|---|---|---|---|---|
| GPT-4.1-2025-04-14 | | 2025.09 | 100 | 35.9 |
| Mistral-Small-3.1-24B-Ins | | 2025.09 | 100 | 25.8 |
| Qwen2.5-Instruct-7B | | 2025.09 | 100 | 26.3 |
| Llama-3.1-70B-Instruct | | 2025.09 | 99.9 | 28.9 |
| TIMEOMNI-1 | Qwen2.5-Instr... | 2025.09 | 99.8 | 64 |
| GPT-4.1-Nano | | 2025.09 | 98.4 | 28 |
| Mistral-7B-v0.3 | | 2025.09 | 82.6 | 26.9 |
| Time-MQA | Mistral-7B-v0.3 | 2025.09 | 52.2 | 4 |
| Time-R1 | Qwen2.5-Instr... | 2025.09 | 48.9 | 31.4 |
| Time-MQA | Llama3-8B | 2025.09 | 37.2 | 31.2 |
| Time-MQA | Qwen2.5-7B | 2025.09 | 32 | 30.5 |
| ChatTS | | 2025.09 | 26.7 | 18.6 |
| Llama-3.1-8B-Instruct | | 2025.09 | 1.9 | - |