Multi-hop verification on HoVer
[Chart: LLM Throughput (token/s), with an alternate Proposal Throughput (proposal/min) view. Highlighted result: FlashEvolve, 1,255 token/s, May 8, 2026.]
Updated 22d ago
Evaluation Results
Method       Configuration              Date     LLM Throughput (token/s)  Proposal Throughput (proposal/min)
FlashEvolve  Deployment=vLLM, Model...  2026.05  1,255                     5.9
Combee       Deployment=vLLM, Model...  2026.05  891                       2
Combee       Deployment=vLLM, Model...  2026.05  810                       2
GEPA         Deployment=vLLM, Model...  2026.05  461                       2.5
FlashEvolve  Deployment=API, Model=...  2026.05  352                       9.1
Combee       Deployment=API, Model=...  2026.05  348                       0.8
Combee       Deployment=API, Model=...  2026.05  214                       0.7
GEPA         Deployment=API, Model=...  2026.05  142                       1.8