Decision Inference on MMLU
[Chart: Accuracy over time. Top result: Ours, 0.772. Date shown on the time axis: Feb 18, 2025.]
Evaluation Results
| Method    | Framework | Date    | Accuracy |
|-----------|-----------|---------|----------|
| Ours      | -         | 2025.02 | 0.772    |
| KTO       | RLHF      | 2025.02 | 0.753    |
| DPO       | RLHF      | 2025.02 | 0.746    |
| ReFT      | Reasoning | 2025.02 | 0.743    |
| Skywork   | RLAIF     | 2025.02 | 0.737    |
| PPO       | RLHF      | 2025.02 | 0.723    |
| DeepSeek  | -         | 2025.02 | 0.721    |
| o3-mini   | -         | 2025.02 | 0.707    |
| CoT+SFT   | Reasoning | 2025.02 | 0.705    |
| Proxy LLM | -         | 2025.02 | 0.703    |
| SFT       | -         | 2025.02 | 0.689    |