Harmlessness on Template T3 GPT-4 evaluation (test)
[Chart: Win Rate by date. Plotted point: SafeDPO, 87.5, May 26, 2025. Legend: Win Rate, Tie Rate, Lose Rate.]
Updated 1mo ago
Evaluation Results
Method          Details                    Date     Win Rate (%)  Tie Rate (%)  Lose Rate (%)
SafeDPO         Judge Model=GPT-4, Eva...  2025.05  87.5          10.38         2.12
SafeRLHF        Judge Model=GPT-4, Eva...  2025.05  68.75         19.38         11.88
DPO-HARMLESS    Judge Model=GPT-4, Eva...  2025.05  58.38         33.25         8.38
DPO-SAFEBETTER  Judge Model=GPT-4, Eva...  2025.05  43.88         45.5          10.62
DPO-HELPFUL     Judge Model=GPT-4, Eva...  2025.05  27.62         49.62         22.75
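As a quick sanity check on the results above, the following Python sketch confirms that each method's win, tie, and lose rates add up to roughly 100% and ranks the methods by win rate minus lose rate. The names `rates_sum_ok` and `net_win_rate` and the 0.05 tolerance are illustrative assumptions, and the "net win rate" is not a metric reported by the leaderboard itself.

```python
# Rates copied from the results above: (win, tie, lose), in percent.
results = {
    "SafeDPO": (87.5, 10.38, 2.12),
    "SafeRLHF": (68.75, 19.38, 11.88),
    "DPO-HARMLESS": (58.38, 33.25, 8.38),
    "DPO-SAFEBETTER": (43.88, 45.5, 10.62),
    "DPO-HELPFUL": (27.62, 49.62, 22.75),
}


def rates_sum_ok(rates, tol=0.05):
    """Return True if win + tie + lose is within `tol` of 100 (assumed rounding slack)."""
    return abs(sum(rates) - 100.0) <= tol


def net_win_rate(rates):
    """Win rate minus lose rate, rounded to two decimals (hypothetical summary metric)."""
    win, _tie, lose = rates
    return round(win - lose, 2)


# Rank methods from highest to lowest net win rate.
ranking = sorted(results, key=lambda m: net_win_rate(results[m]), reverse=True)

for method in ranking:
    rates = results[method]
    print(f"{method:15s} net={net_win_rate(rates):6.2f} sums_to_100={rates_sum_ok(rates)}")
```

Under this hypothetical metric the ordering matches the ordering by raw win rate, with SafeDPO first and DPO-HELPFUL last.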