
Multi-agent policy synthesis on Cleanup

2.75 U Score

Gemini 3.1 Pro

Mar 19, 2026
Updated 2mo ago

Evaluation Results

Method  Links
2026.03
2.750.54432.6
2026.03
1.790.13386
2026.03
1.370.09294.6
2026.03
1.140.47233
2026.03
1.013.06137
0.771.75209.5
2026.03
0.450.45274.1
2026.03
0.160.2208.6
0.10.6116.4