State Transition Graph on STG High/Huge (IID)
[Chart: Accuracy over time. Best result: 57.8 (Llama-3.1-8B), Nov 27, 2025.]
Evaluation Results
Method              Tags                        Date      Accuracy
Llama-3.1-8B        Architecture=AR             2025.11   57.8
Qwen2.5-1.5B        Architecture=AR             2025.11   51.5
Llama-3.2-1B        Architecture=AR             2025.11   37.4
C2DLM               Architecture=DLM            2025.11   35.2
LLaDA-8B-Instruct   Architecture=DLM, Trai...   2025.11   24.8