Moral Judgment on ETHICS Justice (test)
[Chart: Mean Shift (Positive Emotion). Highlighted entry: Qwen-3-8B, 0.9, Apr 21, 2026. Selectable metrics: Mean Shift, Collapse Rate, and Flip Rate, each for Positive Emotion and Negative Emotion.]
Evaluation Results
Method          Parameters  Date     Mean Shift          Mean Shift          Collapse Rate       Collapse Rate       Flip Rate           Flip Rate
                                     (Positive Emotion)  (Negative Emotion)  (Positive Emotion)  (Negative Emotion)  (Positive Emotion)  (Negative Emotion)
Qwen-3-8B       8B          2026.04  0.9                 1.09                52                  40                  18                  19
Qwen-3-30B      30B         2026.04  0.79                0.1                 26                  44                  3                   4
Llama-3.1-8B    8B          2026.04  0.76                1.3                 37                  51                  11                  20
GPT-OSS-20B     20B         2026.04  0.74                0.13                36                  42                  14                  16
GPT-5.1         -           2026.04  0.22                0.29                18                  30                  4                   3
Llama-3.3-70B   70B         2026.04  0.15                0.42                26                  50                  7                   7
Gemini-3-Flash  -           2026.04  0.06                0.44                18                  58                  3                   4