
Human Preference for Oral Argument Simulation

Benchmarks

Task Name | Dataset Name | SOTA Result | Trend
Human Preference Evaluation | Human Preference for Oral Argument Simulation (Evaluation set) | Wins 72 | 9