Cleanup

Benchmarks

Task Name	Dataset Name	SOTA Result
Multi-agent policy synthesis	Cleanup	U Score: 2.75	9
Multi-agent Social Dilemma Equality Evaluation	Cleanup	Equality Score (E): 95.9	9
Social Outcome Evaluation	Cleanup Normal (train test)	Outcome U: 0.6	4
Social Outcome Evaluation	Cleanup Hard (train test)	Outcome U: 31	4
Robot Plan Execution	Cleanup real-world	Success Rate (New Objects): 3	2
