
DEMO

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
| --- | --- | --- | --- |
| Dialogue Agent Interaction | DEMO | Goal Achievement: 8.447 | 20 |
| Element Awareness | DEMO | Goal Recognition Score: 6.508 | 20 |
| Dialogue Agent Interaction | DEMO Average | Goal Achievement Score: 8.14 | 12 |
| Dialogue Agent Interaction | DEMO Non-Collaboration set | Goal Achievement Score: 8.03 | 12 |
| Dialogue Agent Interaction | DEMO Collaboration set | Goal Achievement Score: 8.65 | 12 |
| Overall Evaluation | DEMO | Overall Score: 6.779 | 10 |
| Dialogue Agent Evaluation | DEMO | Element Awareness - Goal: 7.238 | 10 |