Opponent priority-ordering prediction on CaSiNo (150-dialogue held-out split)
[Chart: EMA by date. One data point: 70B structured-CoT prompted, EMA 37.3, May 6, 2026. Metrics: EMA, Top-1 Acc, NDCG@3.]
Updated 27d ago
Evaluation Results
Method | Links | EMA | Top-1 Acc | NDCG@3
70B structured-CoT prompted (Backbone=Llama-3.3-70B...) | 2026.05 | 37.3 | 62.99 | 70.94