Multi-trait Automated Essay Scoring on ASAP Prompt 7 (test)
[Chart: Ideas Score by method. Highlighted point: Human Rater 1 - Human Rater 2, 69.5 (Feb 2, 2026). Selectable metrics: Ideas Score, Organisation Score, Conventions Score, Style Score, Overall Score.]
Updated 4d ago
Evaluation Results
| Method | Details | Date | Ideas Score | Organisation Score | Conventions Score | Style Score | Overall Score |
|---|---|---|---|---|---|---|---|
| Human Rater 1 - Human Rater 2 | type=Human baseline | 2026.02 | 69.5 | 57.6 | 56.7 | 54.4 | 62 |
| Llama 4 (multi-agent prompting framework) | Prompting Strategy=Few... | 2026.02 | 69.3 | 64.7 | 61.5 | 59.7 | 63.8 |
| Llama 4 (No Examples) | Prompting Strategy=Zer... | 2026.02 | 60.1 | 49.8 | 48.8 | 44.5 | 50.8 |
| Llama 4 (Reduced Rubric) | Prompting Strategy=Few... | 2026.02 | 57.5 | 54.1 | 63.9 | 49 | 56.1 |
| Llama 2 | Prompting Strategy=1 shot | 2026.02 | 9.1 | 2.3 | 32.7 | 15.4 | 15.1 |
| GPT 3.5 | Prompting Strategy=1 shot | 2026.02 | 4.5 | 6.8 | 9.7 | 7.9 | 7.3 |
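To read the scores listed above against the human baseline, one option is to subtract the Human Rater 1 - Human Rater 2 row from each method's row. The sketch below does only that arithmetic. The names `SCORES`, `gaps_to_baseline`, and `traits_above_baseline` are illustrative and do not come from the benchmark page. The page does not state the metric's scale, so the gaps are simply in the same units as the listed scores.

```python
# Scores copied from the evaluation results above.
# Column order: Ideas, Organisation, Conventions, Style, Overall.
TRAITS = ["Ideas", "Organisation", "Conventions", "Style", "Overall"]

SCORES = {
    "Human Rater 1 - Human Rater 2": [69.5, 57.6, 56.7, 54.4, 62.0],
    "Llama 4 (multi-agent prompting framework)": [69.3, 64.7, 61.5, 59.7, 63.8],
    "Llama 4 (No Examples)": [60.1, 49.8, 48.8, 44.5, 50.8],
    "Llama 4 (Reduced Rubric)": [57.5, 54.1, 63.9, 49.0, 56.1],
    "Llama 2": [9.1, 2.3, 32.7, 15.4, 15.1],
    "GPT 3.5": [4.5, 6.8, 9.7, 7.9, 7.3],
}

BASELINE = "Human Rater 1 - Human Rater 2"


def gaps_to_baseline(scores, baseline=BASELINE):
    """Return each method's per-trait score minus the baseline's score."""
    base = scores[baseline]
    return {
        method: [round(value - ref, 1) for value, ref in zip(values, base)]
        for method, values in scores.items()
        if method != baseline
    }


def traits_above_baseline(scores, baseline=BASELINE):
    """Return, per method, the traits on which it scores above the baseline."""
    gaps = gaps_to_baseline(scores, baseline)
    return {
        method: [trait for trait, gap in zip(TRAITS, trait_gaps) if gap > 0]
        for method, trait_gaps in gaps.items()
    }


if __name__ == "__main__":
    for method, trait_gaps in gaps_to_baseline(SCORES).items():
        print(method, dict(zip(TRAITS, trait_gaps)))
```

Calling `traits_above_baseline(SCORES)` lists, for each method, the traits where its listed score exceeds the human-human figure. It says nothing about statistical significance, which the page does not report.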