
Hallucination Detection on SugarCrepe 1.0 (test)

98.86 Avg-M Score

Human

[Chart: Avg-M Score; May 20, 2026]
Updated 12d ago

Evaluation Results

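| Method | Date | Avg-M | | | | | | | |
|---|---|---|---|---|---|---|---|---|---|
| Human | 2026.05 | 98.86 | 99 | 97 | 99 | 100 | 99 | 99 | 99 |
| | 2026.05 | 92 | 91.6 | 95.4 | 94.5 | 97.6 | 88.3 | 93.5 | 82.9 |
| | 2026.05 | 89.7 | 89.2 | 89.7 | 95.6 | 89.7 | 86.5 | 90.5 | 89.8 |
| | 2026.05 | 88.7 | 82.8 | 86.9 | 89 | 93.8 | 85.9 | 94 | 88.6 |
| | 2026.05 | 88.3 | 72.8 | 84 | 84.1 | 92.7 | 76.7 | 89.6 | 83.3 |
| | 2026.05 | 88 | 88 | 87.5 | 93.7 | 91 | 87.6 | 88.7 | 86.5 |
| | 2026.05 | 82.3 | 86.4 | 91.5 | 86.9 | 96.4 | 76.8 | 68.2 | 70.2 |
| | 2026.05 | 82.2 | 86.9 | 94.9 | 87.4 | 97.8 | 77.3 | 67.7 | 63.7 |
| | 2026.05 | 81.9 | 76.2 | 78.3 | 81.9 | 90.1 | 77.9 | 89 | 80 |
| | 2026.05 | 80.5 | 80.9 | 90.6 | 85 | 95.3 | 74.5 | 70.6 | 66.5 |
| | 2026.05 | 80.4 | 83 | 91.8 | 86 | 96.3 | 76.3 | 68.5 | 60.8 |
| | 2026.05 | 79.7 | 83 | 91.4 | 84.9 | 96.2 | 70.8 | 70 | 61.6 |
| | 2026.05 | 79 | 76 | 84.2 | 85.7 | 94.5 | 74.8 | 70.6 | 67.4 |
| | 2026.05 | 77.2 | 75.3 | 88.3 | 82 | 93.5 | 72.6 | 68.2 | 60.8 |
| | 2026.05 | 77.1 | 78 | 87.3 | 82.6 | 93.6 | 69.1 | 68 | 61.2 |
| | 2026.05 | 73.6 | 62.3 | 74.4 | 71.7 | 84.9 | 65.4 | 76.1 | 69.4 |
| | 2026.05 | 66.7 | 60.6 | 60.1 | 66.2 | 61.3 | 79 | 77.3 | 78.4 |
| | 2026.05 | 63.9 | 82.1 | 38.4 | 70.7 | 79.7 | 57.3 | 60.1 | 58.8 |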