Edge Sorting on fundamental dynamic graph tasks Level 0
[Chart: Accuracy by date. Best result: 100 (o3), Dec 24, 2025.]
Evaluation Results
Method                          Date      Accuracy
o3                              2025.12   100
Deepseek-R1                     2025.12   100
o3                              2025.12   96
Deepseek-R1                     2025.12   94
DeepSeek-R1-Distill-Qwen-14B    2025.12   94
DeepSeek-R1-Distill-Qwen-32B    2025.12   93
QwQ-32B                         2025.12   93
DeepSeek-R1-Distill-Qwen-32B    2025.12   92
QwQ-32B                         2025.12   87
DeepSeek-R1-Distill-Qwen-14B    2025.12   80
GPT-4o-mini                     2025.12   71
Qwen2.5-32B-Instruct            2025.12   46
DeepSeek-R1-Distill-Qwen-7B     2025.12   39
Qwen2.5-32B-Instruct            2025.12   38
DeepSeek-R1-Distill-Qwen-7B     2025.12   34
openPangu-7B                    2025.12   32
openPangu-7B                    2025.12   21
GPT-4o-mini                     2025.12   17
Random Baseline                 2025.12   1
Random Baseline                 2025.12   1