
Reward Prediction on Human preference annotation (test)

30 Effort

Qwen2.5-7B-Instr.

Updated 5mo ago

Evaluation Results

Method	Links
Qwen2.5-7B-Instr. 2026.01		30	26	28	28
GPT-4.1 2026.01		44	22	30	32
gpt-oss-20b 2026.01		44	32	35	37
GPT-4.1 2026.01		52	25	31	36
GPT-5 2026.01		56	54	49	53
Gemini 2.5 Flash 2026.01		57	25	29	37
Gemini 2.5 Flash 2026.01		61	53	45	53
IntelliReward 2026.01		70	76	70	72