Iterated Prisoner's Dilemma (IPD)
[Chart: Payoff per Round over time. Highlighted point: SEPO, 2.745, May 29, 2026. Selectable metrics: Payoff per Round, Exploitation Score, Extortion Score, Safety Score, NRA.]
Evaluation Results
Method  Details                    Date     Payoff per Round  Exploitation Score  Extortion Score  Safety Score  NRA
SEPO    Backbone=Gemma 4 E4B-i...  2026.05  2.745             0.703               15.9             0.772         -0.09
Base    Backbone=Gemma 4 E4B-it    2026.05  2.741             0.703               15.5             0.761         -0.091
SFT     Backbone=Gemma 4 E4B-it    2026.05  2.719             0.828               15.3             0.455         -0.105
SEPO    Backbone=Qwen 3.5-4B,...   2026.05  2.379             1.25                26.1             -1.847        -0.08
SFT     Backbone=Qwen 3.5-4B       2026.05  1.773             3.125               27.8             -7.894        -0.225
Base    Backbone=Qwen 3.5-4B       2026.05  1.641             5                   14.6             -13.109       -0.372