Contextual Understanding and Reasoning on OpenHuEval
[Chart: HuWildBench WBScore. Highlighted point: Qwen3-4B, 63.03, Jan 3, 2026.]
Updated 4d ago
Evaluation Results
| Method | Source | Date | HuWildBench WBScore | HuSimpleQA Acc | HuProverbRea Acc OE | HuProverbRea Acc 2CQ | HuMatchingFIB B Acc | HuMatchingFIB Q Acc | HuStandardFIB B Acc | HuStandardFIB Q Acc | Overall Performance (OpenHuEval) |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Qwen3-4B | Implementation patchin... | 2026.01 | 63.03 | 7.3 | 62.47 | 74.98 | 39.59 | 5.94 | 13.2 | 1.08 | 33.44 |
| Racka-4B | Implementation patchin... | 2026.01 | 57.17 | 10.05 | 61.94 | 77.53 | 38.93 | 4.68 | 18.98 | 2.15 | 33.93 |
| Qwen3-4B-Base | Implementation patchin... | 2026.01 | 52.59 | 5.91 | 41.15 | 0 | 42.3 | 5.58 | 0 | 0 | 18.44 |
| PULI-LlumiX-Llama-3.1 8B | Implementation patchin... | 2026.01 | 17.77 | 20.03 | 75.86 | 77.36 | 33.54 | 3.96 | 29.16 | 2.15 | 32.47 |
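The page does not say how the Overall Performance (OpenHuEval) column is derived. A quick check on the four rows above suggests it is close to the unweighted mean of the eight per-task scores. The sketch below tests that reading; the names `REPORTED`, `mean_score`, and `overall_gaps` are hypothetical, and the equal-weight average is an assumption, not a documented definition.

```python
# Per-task scores copied from the Evaluation Results rows, in column order:
# HuWildBench WBScore, HuSimpleQA Acc, HuProverbRea Acc OE, HuProverbRea Acc 2CQ,
# HuMatchingFIB B Acc, HuMatchingFIB Q Acc, HuStandardFIB B Acc, HuStandardFIB Q Acc.
# The second tuple element is the reported Overall Performance value.
REPORTED = {
    "Qwen3-4B": ([63.03, 7.3, 62.47, 74.98, 39.59, 5.94, 13.2, 1.08], 33.44),
    "Racka-4B": ([57.17, 10.05, 61.94, 77.53, 38.93, 4.68, 18.98, 2.15], 33.93),
    "Qwen3-4B-Base": ([52.59, 5.91, 41.15, 0, 42.3, 5.58, 0, 0], 18.44),
    "PULI-LlumiX-Llama-3.1 8B": ([17.77, 20.03, 75.86, 77.36, 33.54, 3.96, 29.16, 2.15], 32.47),
}


def mean_score(scores):
    """Unweighted mean of the per-task scores (assumed aggregation)."""
    return sum(scores) / len(scores)


def overall_gaps(reported):
    """Absolute gap between the assumed mean and the reported overall, per method."""
    return {
        name: abs(mean_score(scores) - overall)
        for name, (scores, overall) in reported.items()
    }


if __name__ == "__main__":
    for name, gap in overall_gaps(REPORTED).items():
        print(f"{name}: mean differs from reported overall by {gap:.4f}")
```

For all four methods the gap stays below 0.01, which fits an equal-weight average computed from the displayed two-decimal scores. The leftover difference could come from truncation or from averaging unrounded task scores, and the page does not say which.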