Instruction Following on UltraChat
[Chart: RM Score by date. SFT + TTL: 67.8 on May 8, 2026. Metrics: RM Score, Win Rate, Toxicity.]
Updated 23d ago
Evaluation Results
Method     Details                    Date     RM Score  Win Rate  Toxicity
SFT + TTL  Base LLM=Qwen2.5-7B-In...  2026.05  67.8      58.4      38
Base SFT   Base LLM=Qwen2.5-7B-In...  2026.05  64.2      -         45