Question Answering on Runbook
[Chart: GenericJudge Score over time. Highlighted point: GPT-5 + HYVE, 4.58, Apr 7, 2026. Other available views: Token Usage, Latency (s).]
Updated 11d ago
Evaluation Results
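| Method         | Configuration             | Date    | GenericJudge Score | Token Usage | Latency (s) |
|----------------|---------------------------|---------|--------------------|-------------|-------------|
| GPT-5 + HYVE   | Optimization=HYVE pipe... | 2026.04 | 4.58               | 1.1         | 36.79       |
| GPT-5          | Optimization=Standard...  | 2026.04 | 4.55               | 1.1         | 38.89       |
| GPT-4.1 + HYVE | Optimization=HYVE pipe... | 2026.04 | 4.32               | 286.9       | 7.18        |
| GPT-4.1        | Optimization=Standard...  | 2026.04 | 4.26               | 289.7       | 7.58        |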