Hallucination detection on MMLU-Pro

69.81 Accuracy

LEAP

[Chart: Accuracy, Nov 8, 2025]
Updated 1mo ago

Evaluation Results

Method    Links
2025.11   69.81   -       75.31
2025.11   64.23   -       71.18
2025.11   63.21   -       71.15
2025.11   62.26   -       71.43
2025.11   61.05   -       69.23
2025.11   60.61   -       72.19
2025.11   60.14   -       70.37
2025.11   59.6    -       69.23
2025.11   59.56   -       64.52
2025.11   59.12   -       70.42
2025.11   58.85   -       69.51
2025.11   58.67   -       68.04
2025.11   58.34   -       68.51
2025.11   58.23   -       71.11
2025.11   57.96   -       70.49
2025.11   57.33   -       68
2025.11   56.67   -       67.5
2025.11   56.67   -       67.17
2025.11   56.66   -       69.54
2025.11   55.81   -       68.28
2025.11   55.67   -       66.67
2025.11   55.33   -       65.59
2025.11   55.33   -       68.84
2025.11   55.25   -       70.67
2025.11   55      -       70.97
2025.11   54.88   -       70.74
2025.11   54.68   -       55
2025.11   54.67   -       69.64
2025.11   54.41   -       54.05
2025.11   54.29   -       70.37
2026.01   -       79.56   -
2026.01   -       74.83   -
2026.01   -       76.76   -
2026.01   -       71.22   -
2026.01   -       77.85   -
2026.01   -       81.08   -
2026.01   -       77.56   -
2026.01   -       74.05   -
2026.01   -       82.21   -
2026.01   -       87.08   -
2026.01   -       77.73   -
2026.01   -       77.56   -
2026.01   -       57.92   -
2026.01   -       74.65   -
2026.01   -       75.85   -