
Contextual Understanding and Reasoning on OpenHuEval

63.03 HuWildBench WBScore

Qwen3-4B

Jan 3, 2026
Updated 4d ago

Evaluation Results

Method  Links
2026.01
63.03  7.3  62.47  74.98  39.59  5.94  13.2  1.08  33.44
2026.01
57.17  10.05  61.94  77.53  38.93  4.68  18.98  2.15  33.93
2026.01
52.59  5.91  41.15  0  42.3  5.58  0  0  18.44
2026.01
17.77  20.03  75.86  77.36  33.54  3.96  29.16  2.15  32.47