
Multi-fidelity Bandit Optimization on LLM-as-a-judge residual-mismatch Λ=128000 (test)

Mean Cost-Weighted Pseudo-Regret: 4,023.4

TACC

[Chart: Mean Cost-Weighted Pseudo-Regret; y-axis ticks at 3,969.972, 4,330.611, 4,691.25, 5,051.889; x-axis point at May 8, 2026]
Updated 22d ago

Evaluation Results

Method     Links
2026.05    4,023.4    247.3
2026.05    5,083.2    286.8
2026.05    5,201      289.7
2026.05    5,359.1    281.8