Proactive Assistant Evaluation on Synthetic Category (test)
[Figure: Response Frequency over time. Data point shown: LLAMAPIE, 14 (May 7, 2025). Selectable metrics: Response Frequency, Avg Word Length, Valuable, Pertinent, Competent, Unobtrusive, Transparent, Controllable, Deferent, Anticipatory, Safe, Rubric Score.]
Evaluation Results
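| Method | Method details | Links | Response Frequency | Avg Word Length | Valuable | Pertinent | Competent | Unobtrusive | Transparent | Controllable | Deferent | Anticipatory | Safe | Rubric Score |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| LLAMAPIE | evaluation_protocol=GP... | 2025.05 | 14 | 2.08 | 4.32 | 4.52 | 4.67 | 4.77 | 4.73 | 4.74 | 4.77 | 4.35 | 4.89 | 4.21 |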