
CaSiNo

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
| --- | --- | --- | --- |
| Bargaining Allocation | CaSiNo | IR (%): 100 | 8 |
| Opponent priority-ordering prediction | CaSiNo (5-fold CV) | EMA: 53.84 | 5 |
| Opponent modeling | CaSiNo (External) | EMA: 53.84 | 5 |
| Negotiation | CaSiNo | IR: 1 | 3 |
| Negotiation performance and belief calibration | CaSiNo native (held-out dialogues) | Accept-F1: 0.947 | 3 |
| Negotiation | CaSiNo (test) | IR: 1 | 1 |
| Negotiation Comprehension & Simulation | Augmented CaSiNo | Success Rate (Simple Tasks): 60 | 1 |
| Mental State Inference | Augmented CaSiNo | Accuracy: 40 | 1 |
| Joint Outcome & Utility Prediction | CaSiNo | Accuracy: 95.5 | 1 |
| Opponent priority-ordering prediction | CaSiNo (150-dialogue held-out split) | EMA: 37.3 | 1 |