General LLM Evaluation on Overall
[Chart: Overall Score. Top result: UM-190k, 38.74 (Nov 14, 2025)]
Evaluation Results
Method      Training Dataset   Date      Overall Score
UM-190k     UM-190k            2025.11   38.74
UM-187k     UM-187k            2025.11   38.55
UM-170k     UM-170k            2025.11   37.78
TuluDPO     TuluDPO            2025.11   37.63
ORPO        ORPO               2025.11   36.56
UltraFB     UltraFB            2025.11   35.68
CodePref    CodePref           2025.11   35.3
HelpSteer   HelpS...           2025.11   34.99
SFT         SFT                2025.11   33.29