Human Preference Evaluation on Minecraft Human Preference Study (model-vs-model trials)
[Chart: Wins by date. Plotted point: MD4, 290 wins, Apr 22, 2026. Available metrics: Wins, Appearances, Preference Rate, Preference Rate (95% CI Lower Bound).]
Updated 1mo ago
Evaluation Results
| Method | Links | Wins | Appearances | Preference Rate (%) | Preference Rate, 95% CI Lower Bound (%) |
|---|---|---|---|---|---|
| MD4 (version=p2) | 2026.04 | 290 | 505 | 57.4 | 53.1 |
| DDPM (version=p2) | 2026.04 | 282 | 530 | 53.2 | 49.0 |
| MD4 (version=p4) | 2026.04 | 255 | 517 | 49.3 | 45.0 |
| Real | 2026.04 | 213 | 528 | 40.3 | 36.2 |
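The Preference Rate column matches Wins divided by Appearances, expressed as a percentage. The page does not say how the 95% CI lower bound is computed, so the sketch below assumes a one-proportion normal-approximation (Wald) interval with z = 1.96; the function name `preference_stats` and the row labels are hypothetical and used only for illustration.

```python
import math


def preference_stats(wins, appearances, z=1.96):
    """Return (preference rate %, lower bound %) for one leaderboard row.

    Assumption: the lower bound is the lower end of a normal-approximation
    (Wald) interval. The source page does not state its method.
    """
    if appearances <= 0:
        raise ValueError("appearances must be positive")
    rate = wins / appearances
    # Standard error of a binomial proportion.
    std_err = math.sqrt(rate * (1.0 - rate) / appearances)
    lower = max(0.0, rate - z * std_err)
    return 100.0 * rate, 100.0 * lower


# Wins and Appearances taken from the results table.
rows = {
    "MD4 (version=p2)": (290, 505),
    "DDPM (version=p2)": (282, 530),
    "MD4 (version=p4)": (255, 517),
    "Real": (213, 528),
}

for name, (wins, appearances) in rows.items():
    rate, lower = preference_stats(wins, appearances)
    print(f"{name}: rate {rate:.1f}%, lower bound {lower:.1f}%")
```

Under this assumption the computed values agree with the listed ones to one decimal place. Other interval constructions, such as the Wilson score interval, give nearly identical lower bounds at these sample sizes, so the agreement does not confirm which method the leaderboard actually uses.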