
Multi-trait Automated Essay Scoring on ASAP Prompt 7 (test)

Ideas Score: 69.5

Human Rater 1 - Human Rater 2

Updated 4mo ago

Evaluation Results

Method	Links	Ideas Score
Human Rater 1 - Human Rater 2 2026.02		69.5	57.6	56.7	54.4	62
Llama 4 (multi-agent prompting framework) 2026.02		69.3	64.7	61.5	59.7	63.8
Llama 4 (No Examples) 2026.02		60.1	49.8	48.8	44.5	50.8
Llama 4 (Reduced Rubric) 2026.02		57.5	54.1	63.9	49	56.1
Llama 2 2026.02		9.1	2.3	32.7	15.4	15.1
GPT 3.5 2026.02		4.5	6.8	9.7	7.9	7.3