Dialogue Analysis on DA
Best R Metric: 92.5 (HalluClean, Nov 12, 2025)
Metrics: R Metric, Q Metric
Updated 27d ago
Evaluation Results
Method          Settings                   Date     R Metric  Q Metric
HalluClean      Protocol=Detection → R...  2025.11  92.5      86
HalluClean      Protocol=Detection → R...  2025.11  89        83
DeepSeek-V3     Protocol=Direct Ask, B...  2025.11  86.5      78
GPT-4o-mini     Protocol=Direct Ask, B...  2025.11  84.5      79.5
ChatProtect     Protocol=Detection → R...  2025.11  79.5      74
Llama-3-70B     Protocol=Direct Ask, B...  2025.11  74.5      68.5
DeepSeek-R1     Protocol=Direct Ask, B...  2025.11  74        67.5
GPT-3.5-turbo   Protocol=Direct Ask, B...  2025.11  57.5      53
Step-by-Step    Protocol=Detection → R...  2025.11  54.5      51.5
Plan-and-Solve  Protocol=Detection → R...  2025.11  11.5      10.5