Knowledge on C-EVAL
[Chart: Score over time, Jan 27, 2026 to Feb 9, 2026. Highlighted result: Qwen3-30B-A3B-Inst-2507, Score 88.12. Series: Score, True Positive Fraction.]
Updated 4d ago
Evaluation Results
| Method | Settings | Date | Score | True Positive Fraction |
|---|---|---|---|---|
| Qwen3-30B-A3B-Inst-2507 | | 2026.02 | 88.12 | 100 |
| Ling-flash-2.0 | | 2026.02 | 87.54 | 100 |
| LLaDA2.1-flash | Inference Mode=S Mode | 2026.02 | 86.93 | 271 |
| LLaDA2.1-flash | Inference Mode=Q Mode | 2026.02 | 86.71 | 175 |
| LLaDA2.0-flash | | 2026.02 | 85.21 | 190 |
| Ling-mini-2.0 | | 2026.02 | 82.17 | - |
| LLaDA2.0-mini | | 2026.02 | 81.8 | 1.78 |
| Qwen3-8B | no think=true | 2026.02 | 80.6 | - |
| LLaDA2.1-mini | mode=Q Mode | 2026.02 | 78.59 | 1.91 |
| LLaDA2.1-mini | mode=S Mode | 2026.02 | 78.4 | 3.39 |
| KEEL | Training=SFT, Evaluati... | 2026.01 | 68.4 | - |
| Pre-LN | Training=SFT, Evaluati... | 2026.01 | 66.5 | - |