State Transition Graph on STG High/Huge (OOD)
Top result: Qwen2.5-1.5B, Accuracy 53.4 (Nov 27, 2025)
Evaluation Results
| Method | Method details | Date | Accuracy |
| --- | --- | --- | --- |
| Qwen2.5-1.5B | Architecture=AR | 2025.11 | 53.4 |
| Llama-3.1-8B | Architecture=AR | 2025.11 | 49.6 |
| C2DLM | Architecture=DLM | 2025.11 | 32.4 |
| Llama-3.2-1B | Architecture=AR | 2025.11 | 27.9 |
| LLaDA-8B-Instruct | Architecture=DLM, Trai... | 2025.11 | 25.6 |