Human Expert Alignment on FeedEval

83.2 Specificity Accuracy

Gemma3-Inst.

Updated 5mo ago

Evaluation Results

Method	Links
Gemma3-Inst. 2026.01		83.2	89.3	85.3	89.5	82.2	70.2
Llama3-3B-Inst. 2026.01		82	88	86.4	91.2	83.5	70.9
Phi-3-Mini 2026.01		81.1	86	87.1	92	82	70
Qwen2-3B-Inst. 2026.01		80.7	87	75.5	82.4	83.3	70.3
Gemini-2.5-Pro 2026.01		75.7	84.5	55.6	62.2	61.3	41.7
GPT-5.1 2026.01		72.9	83.3	58.4	69.7	64	44.5