Morality Evaluation on Commonsense
[Chart: Mean Improvement by method. Top entry: iPASwo, 10 (Sep 25, 2025). Metrics shown: Mean Improvement, 95% CI (Lower Bound), P-Value.]
Updated 16d ago
Evaluation Results
| Method | Configuration | Date | Mean Improvement | 95% CI (Lower Bound) | P-Value |
|---|---|---|---|---|---|
| iPASwo | Model=DeepSeek-R1-Dist... | 2025.09 | 10 | 7 | 0 |
| iPASa | Model=DeepSeek-R1-Dist... | 2025.09 | 9 | 6 | 0 |
| iPASa | Model=Llama-3.1-8B-Ins... | 2025.09 | 4 | 1 | 0 |
| iPASwo | Model=Llama-3.1-8B-Ins... | 2025.09 | 4 | 1 | 0 |
| PASf | Model=Llama-3.1-8B-Ins... | 2025.09 | 4 | 2 | 0 |
| PASf | Model=DeepSeek-R1-Dist... | 2025.09 | 3 | 2 | 0 |
| PASf | Model=Nous-Hermes-2-Mi... | 2025.09 | 2 | 0 | 0.01 |
| iPASa | Model=Nous-Hermes-2-Mi... | 2025.09 | 1 | 0 | 0.02 |
| iPASwo | Model=Nous-Hermes-2-Mi... | 2025.09 | 1 | 0 | 0.05 |