LLM Alignment on Base Model Evaluation Set
[Chart: Win Rate. Top entry: AlignX, 79.93 (Feb 7, 2026). Selectable metrics: Win Rate, Success Score, TI Score, Average Error.]
Evaluation Results
Method     Backbone      Date      Win Rate  Success Score  TI Score  Average Error
AlignX     DeepSeek-7B   2026.02   79.93     35.88          74.91     39.65
AlignX     Mistral-7B    2026.02   78.65     36.95          72.45     38.72
AlignX     Gemma-7B      2026.02   75.8      38.1           69.85     35.85
AlignX     LLaMA-2-7B    2026.02   37.45     40.2           39.6      12.28
TrinityX   LLaMA-2-7B    2026.02   36.75     41.03          40.66     12.12
H3Fusion   -             2026.02   13.79     42             18.82     -3.13