Preference evaluation on LLM-as-a-Judge comparison set
[Chart: TCRM Better Rate for TCRM, 33.7 on Apr 24, 2026. Selectable metrics: TCRM Better Rate, Tie Rate, Baseline Better Rate, Order Dependence Rate.]
Updated 1mo ago
Evaluation Results
Method: TCRM (training_setup=PPO, tu...), 2026.04
TCRM Better Rate: 33.7
Tie Rate: 9.2
Baseline Better Rate: 34.3
Order Dependence Rate: 22.8
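
The four reported rates sum to 100.0, which suggests they are mutually exclusive outcome categories over the comparison set. The page does not define them, so the following is only a sketch under one plausible reading: each pair is judged twice with the response order swapped, a pair counts as order dependent when the two verdicts disagree, and otherwise it falls into the category both verdicts agree on. The function name `preference_rates` and the verdict labels are hypothetical and are not taken from the benchmark.

```python
from collections import Counter

CATEGORIES = ("model_better", "tie", "baseline_better", "order_dependent")

def preference_rates(verdicts):
    """Return the percentage of pairs in each outcome category.

    verdicts: list of (original_order, swapped_order) judge verdicts,
    each one of "model", "baseline", or "tie", already mapped back to
    the system that was preferred (not to a position).
    ASSUMPTION: this category definition is illustrative only.
    """
    if not verdicts:
        raise ValueError("need at least one judged pair")
    counts = Counter()
    for original, swapped in verdicts:
        if original != swapped:
            # The judge changed its mind when the order was swapped.
            counts["order_dependent"] += 1
        elif original == "model":
            counts["model_better"] += 1
        elif original == "baseline":
            counts["baseline_better"] += 1
        else:
            counts["tie"] += 1
    total = len(verdicts)
    return {name: 100.0 * counts[name] / total for name in CATEGORIES}

# Toy data, not benchmark results: one pair per category.
example = [
    ("model", "model"),
    ("tie", "tie"),
    ("baseline", "baseline"),
    ("model", "baseline"),
]
rates = preference_rates(example)
```

Under this reading the categories always partition the set, so the rates sum to 100 by construction, which is consistent with the figures reported above.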