User feature quality evaluation on Honor dataset
[Chart: Rationality Score by date. Best result: 2.51 (Human), Apr 26, 2026. Selectable metrics: Rationality Score, Interpretability Score.]
Updated 1mo ago
Evaluation Results
Method    Type               Date      Rationality Score   Interpretability Score
Human     Human-written      2026.04   2.51                2.55
BUA       Fine-tuned model   2026.04   2.46                2.39
No Tune   Base model         2026.04   1.9                 1.83