Correlation Reasoning on Spatio-Temporal Synthetic Dataset 1.0 (test)
[Chart: Accuracy over time. Best result: STReasoner-8B (Ours), 87.12 Accuracy, Jan 6, 2026]
Evaluation Results
Method                             Category          Date     Accuracy
STReasoner-8B (Ours)               Spatio-Tempor...  2026.01  87.12
Qwen3-VL-8B-Instruct - SFT+S-GRPO  Open-Source M...  2026.01  83.92
Qwen3-8B - SFT+S-GRPO              Open-Source M...  2026.01  81.34
Qwen3-VL-8B-Instruct - SFT         Open-Source M...  2026.01  80.78
Qwen3-8B - SFT                     Open-Source M...  2026.01  79.84
Claude-4.5-Sonnet                  Proprietary M...  2026.01  77.87
Claude-4.5-Sonnet                  Proprietary M...  2026.01  76.04
GPT-5.2                            Proprietary M...  2026.01  65.08
GPT-5.2                            Proprietary M...  2026.01  58.79
Qwen3-VL-8B-Instruct               Open-Source M...  2026.01  53.52
Time-R1-7B                         Time Series R...  2026.01  48.62
ChatTS-8B                          Time Series L...  2026.01  41.08
Time-MQA-7B                        Time Series L...  2026.01  17.9
Qwen3-8B                           Open-Source M...  2026.01  5.53