SOTA Human Preference Alignment benchmarks and papers with code | Wizwand
Human Preference Alignment
Benchmarks
| Dataset Name | SOTA Method | Metric | Value | Results | Last Updated |
| --- | --- | --- | --- | --- | --- |
| PKU-SafeRLHF | EXO-BT | BLEU | 0.324 | 31 | 2mo ago |
| MT-Bench | RAG-Pref | SAG | 6.83 | 20 | 21d ago |
| PickScore | SuperFlow | PickScore | 86.851 | 20 | 23d ago |
| MM-AlignBench 1.0 (test) | Claude3.5V-Sonnet | Win Rate | 84.9 | 18 | 3mo ago |
| HPS v2.1 | Infinity-EE-26 | Anime Score | 32.06 | 16 | 1mo ago |
| Out-of-Domain (test) | Annotators | Agreement | 81.8 | 15 | 2mo ago |
| HH (test) | TRE-P | Reward | 3.8764 | 14 | 3mo ago |
| REACT-Video | REACT | Acc (Tie, Overall) | 61 | 12 | 3mo ago |
| HPD v2 | BranchGRPO | HPS-v2.1 | 0.379 | 10 | 2mo ago |
| HPD v2 (test) | DanceGRPO | HPSv2.1 | 0.371 | 7 | 2mo ago |
| Human Preference Alignment Out-of-Domain (test) | TAFS-GRPO | HPS-v2.1 | 35.3 | 7 | 3mo ago |
| Human Preference Alignment In-Domain (test) | TAFS-GRPO | Pick Score | 22.46 | 7 | 3mo ago |
| DrawBench | Flow-GRPO | HPS-v2.1 | 37.7 | 6 | 21d ago |
| SD3.5 Medium | TMPO | HPS-v2.1 | 0.361 | 6 | 21d ago |
| FLUX.1 (dev) | TMPO | HPS-v2.1 | 0.36 | 6 | 21d ago |
| Multi-Challenge | Qwen3-30A3-2507 | Avg@3 | 49.4 | 6 | 3mo ago |
| ArenaHard V2 | Qwen3-30A3-2507 | Avg@3 Score | 60 | 6 | 3mo ago |
| DrawBench Held-out (test) | RAM | PickScore (Training Reward) | 23.67 | 5 | 21d ago |
| Human Preference Alignment | OP-GRPO | PickScore | 23.64 | 5 | 1mo ago |
| HPDv2 | TreeGRPO | HPS-v2.1 | 0.3735 | 5 | 3mo ago |
| PickScore | VGPO | PickScore (Task) | 23.55 | 5 | 3mo ago |
| VideoGen-RewardBench (test) | VideoReward | VQ Acc (w/ Tie) | 66 | 5 | 3mo ago |
| User study dataset HH-RLHF and PKU-SafeRLHF prompts (test) | DPO-HPS | Quality Score | 3.93 | 4 | 2mo ago |
| DrawBench Task-specific (test) | DenseGRPO | PickScore (Task Metric) | 24.64 | 4 | 3mo ago |
| PKU-Safety | DPO-HPS | Win Rate | 67.1 | 3 | 2mo ago |
Showing 25 of 26 rows