Contextual Understanding and Reasoning on OpenHuEval

HuWildBench WBScore: 63.03

Qwen3-4B

Updated 1mo ago

Evaluation Results

Method	Links	HuWildBench WBScore
Qwen3-4B 2026.01		63.03	7.3	62.47	74.98	39.59	5.94	13.2	1.08	33.44
Racka-4B 2026.01		57.17	10.05	61.94	77.53	38.93	4.68	18.98	2.15	33.93
Qwen3-4B-Base 2026.01		52.59	5.91	41.15	0	42.3	5.58	0	0	18.44
PULI-LlumiX-Llama-3.1 8B 2026.01		17.77	20.03	75.86	77.36	33.54	3.96	29.16	2.15	32.47