LLM agent alignment evaluation on 1000 prompts (test)
[Chart: Usefulness Score by date; series shown: LLM Default; date label: Apr 19, 2026. Metrics available: Usefulness Score, Neutrality Score.]
Updated 1mo ago
Evaluation Results
Method       Details                    Date     Usefulness Score  Neutrality Score
LLM Default  Training duration (met...  2026.04  1                 0
LLM DReST    Training duration (met...  2026.04  1                 0.966