Multi-agent policy synthesis on Cleanup

2.75 U Score

Gemini 3.1 Pro

Updated 4mo ago

Evaluation Results

Method	Links	U Score
Gemini 3.1 Pro 2026.03		2.75	0.54	432.6
Gemini 3.1 Pro 2026.03		1.79	0.13	386
Claude Sonnet 4.6 2026.03		1.37	0.09	294.6
Claude Sonnet 4.6 2026.03		1.14	0.47	233
Claude Sonnet 4.6 2026.03		1.01	3.06	137
GEPA (Gemini 3.1 Pro) 2026.03		0.77	1.75	209.5
Gemini 3.1 Pro 2026.03		0.45	0.45	274.1
Q-learner 2026.03		0.16	0.2	208.6
BFS Collector 2026.03		0.1	0.61	16.4