Share your thoughts, 1 month free Claude Pro on usSee more
WorkDL logo mark

Dialogue Reasoning on MUTUAL

80.8Accuracy

LED

29.42442.76256.169.438Jun 15, 2025Aug 9, 2025Oct 4, 2025Nov 28, 2025Jan 23, 2026Mar 19, 2026May 14, 2026
Updated 19d ago

Evaluation Results

MethodLinks
2026.05
80.8-
2026.05
80.2-
2026.05
79.8-
2026.05
79-
2026.05
79-
2026.05
78.8-
2026.05
74.4-
2026.05
73.7-
2026.05
73.7-
2025.06
71.3-
2025.06
70.84-
2025.06
68.53-
2026.05
68.4-
2025.06
68.33-
2025.06
68.01-
2025.06
67.44-
2025.06
67.43-
2025.06
67.24-
2026.05
66.9-
2026.05
66.8-
2026.05
66.4-
2026.05
66.2-
2026.05
66-
2026.05
61.4-
2026.05
48.3-
2026.05
48-
2026.05
47.8-
2026.05
47-
2026.05
46.5-
2026.05
46.3-
2026.05
46.2-
2026.05
45.9-
2026.05
44.4-
2026.05
43.9-
2026.05
42.5-
2026.05
34-
2026.05
32.7-
2026.05
31.4-
2023.10
-0.203
2023.10
-0.034
2023.10
-0.467
2023.10
-0.467
2023.10
--0.087
2023.10
-0.022
2023.10
-0.118
2023.10
-0.124
2023.10
-0.111
2023.10
-0.018
2023.10
-0.183
2023.10
-0.188