Structural Bias Evaluation on MNLI
[Chart: Accuracy over time. Top entry: C2PO, 98.1 Accuracy (Dec 29, 2025)]
Evaluation Results
Method  Backbone            Date     Accuracy
------  ------------------  -------  --------
C2PO    DeepSeek-R1-D...    2025.12  98.1
GRPO    DeepSeek-R1-D...    2025.12  92.5
C2PO    LLaMA-2-13B-Chat    2025.12  86.2
CPO     LLaMA-2-13B-Chat    2025.12  84.1
IPO     LLaMA-2-13B-Chat    2025.12  81.2
FR      LLaMA-2-13B-Chat    2025.12  80.3
FR      DeepSeek-R1-D...    2025.12  79.2
DPO     LLaMA-2-13B-Chat    2025.12  67.2
CPO     DeepSeek-R1-D...    2025.12  65.1
IPO     DeepSeek-R1-D...    2025.12  64.8
BCO     DeepSeek-R1-D...    2025.12  64.7
BCO     LLaMA-2-13B-Chat    2025.12  64.4
GRPO    LLaMA-2-13B-Chat    2025.12  58.4
DPO     DeepSeek-R1-D...    2025.12  37.9