Prompt Optimization on Composite Suite (HotpotQA, IFBench, HoVer, PUPA, AIME, LiveBench-Math)

HotpotQA Score: 69 (GEPA, Jul 25, 2025)

Evaluation Results

Date     HotpotQA  IFBench  HoVer  PUPA   AIME   LiveBench-Math  Mean   Gain
2025.07  69        52.72    51.67  94.47  59.33  64.13           65.22  12.19
2025.07  65.67     55.95    56.67  96.46  59.33  64.13           66.36  13.33
2025.07  65.67     49.83    54.67  90.05  52.67  59.31           62.03  9
2025.07  62.33     48.64    47.67  85.68  46.67  63.84           59.14  6.11
2025.07  60.33     51.19    46     74.18  45.33  60.74           56.3   3.27
2025.07  58        49.15    48.33  83.37  51.33  61.84           58.67  5.64
2025.07  38        47.79    46.33  78.57  49.33  58.2            53.03  -
2025.07  38        52.04    51.33  91.85  48.67  60.97           57.14  4.11
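
Each result row above holds eight numbers: six benchmark scores in the order given in the title, followed by two summary values. The seventh value appears to be the arithmetic mean of the six scores, and the eighth appears to be that mean minus 53.03, the mean of the one row whose eighth value is "-". The page does not state this, so treat it as an inference. The sketch below checks it against the values copied from the rows; the names `check_row`, `REFERENCE_MEAN`, and the 0.01 tolerance are hypothetical choices made for illustration.

```python
# Values copied from the result rows above:
# (six benchmark scores, seventh value, eighth value or None for "-").
rows = [
    ([69, 52.72, 51.67, 94.47, 59.33, 64.13], 65.22, 12.19),
    ([65.67, 55.95, 56.67, 96.46, 59.33, 64.13], 66.36, 13.33),
    ([65.67, 49.83, 54.67, 90.05, 52.67, 59.31], 62.03, 9),
    ([62.33, 48.64, 47.67, 85.68, 46.67, 63.84], 59.14, 6.11),
    ([60.33, 51.19, 46, 74.18, 45.33, 60.74], 56.3, 3.27),
    ([58, 49.15, 48.33, 83.37, 51.33, 61.84], 58.67, 5.64),
    ([38, 47.79, 46.33, 78.57, 49.33, 58.2], 53.03, None),
    ([38, 52.04, 51.33, 91.85, 48.67, 60.97], 57.14, 4.11),
]

# Assumption: the row with "-" in the last position is the reference row.
REFERENCE_MEAN = 53.03


def check_row(scores, reported_mean, reported_gain, tol=0.01):
    """Return True if the reported summary values match the assumed formulas."""
    computed_mean = sum(scores) / len(scores)
    mean_ok = abs(computed_mean - reported_mean) <= tol
    # The reference row has no gain to check.
    gain_ok = (
        reported_gain is None
        or abs((reported_mean - REFERENCE_MEAN) - reported_gain) <= tol
    )
    return mean_ok and gain_ok


all_consistent = all(check_row(*row) for row in rows)
print(all_consistent)
```

The tolerance is needed because the displayed scores are themselves rounded, so a recomputed mean can differ from the displayed one in the second decimal place.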