Reasoning evaluation on DialogSum

99.1 Reasoning

Aligner

-1.57224.56450.776.836 Feb 4, 2024
Updated 1mo ago

Evaluation Results

Date     Reasoning
2024.02  99.1
2024.02  98.8
2024.02  87.6
2024.02  83.5
2024.02  82.9
2024.02  78.7
2024.02  78.5
2024.02  73.6
2024.02  73.1
2024.02  70.8
2024.02  70.1
2024.02  69.5
2024.02  69.2
2024.02  65.9
2024.02  60.7
2024.02  58.7
2024.02  58.5
2024.02  58.5
2024.02  58.5
2024.02  53.4
2024.02  49.5
2024.02  47.9
2024.02  39.3
2024.02  24
2024.02  17.2
2024.02  15.6
2024.02  10.4
2024.02  9.7
2024.02  6.8
2024.02  6.2
2024.02  6
2024.02  3.3
2024.02  2.3