Hallucination Detection on HELM Sentence Level v1.0 (test)

0.8835 AUC

MIND

[Chart: AUC by date; data point at Mar 11, 2024]
Updated 3d ago

Evaluation Results

Date     AUC      Second metric
2024.03  0.8835    0.476
2024.03  0.8774    0.5244
2024.03  0.868     0.4087
2024.03  0.8103    0.3312
2024.03  0.7972    0.1678
2024.03  0.7895    0.5032
2024.03  0.7876    0.4857
2024.03  0.7873    0.2306
2024.03  0.7843    0.1096
2024.03  0.7644    0.3809
2024.03  0.7593    0.2151
2024.03  0.7593    0.1057
2024.03  0.7563    0.1423
2024.03  0.7549    0.177
2024.03  0.7509    0.3161
2024.03  0.7497    0.0839
2024.03  0.749     0.1167
2024.03  0.7424    0.0745
2024.03  0.7413    0.2057
2024.03  0.7302    0.0732
2024.03  0.7263    0.0573
2024.03  0.7228    0.0601
2024.03  0.7206    0.2075
2024.03  0.7121    0.1808
2024.03  0.7074   -0.0467
2024.03  0.7044   -0.0316
2024.03  0.7019    0.1216
2024.03  0.6987    0.0456
2024.03  0.698     0.0376
2024.03  0.6927    0.0773
2024.03  0.6851    0.2032
2024.03  0.6846    0.3789
2024.03  0.6755    0.4938
2024.03  0.6594   -0.0563
2024.03  0.6583    0.0086
2024.03  0.6565    0.3835
2024.03  0.6479    0.2405
2024.03  0.6479    0.117
2024.03  0.6418    0.2293
2024.03  0.6401    0.1822
2024.03  0.6329    0.1346
2024.03  0.6212   -0.141
2024.03  0.6178    0.1268
2024.03  0.6175    0.1461
2024.03  0.6098    0.1752
2024.03  0.6043    0.4273
2024.03  0.6025    0.1842
2024.03  0.5878    0.0152
2024.03  0.5872    0.0154
2024.03  0.5834    0.3092
2024.03  0.5777   -0.0015
2024.03  0.5757    0.1204
2024.03  0.5749    0.1193
2024.03  0.5546    0.1504
2024.03  0.5442    0.0936
2024.03  0.5409    0.0369
2024.03  0.5365    0.059
2024.03  0.5327    0.0667
2024.03  0.5218    0.0995
2024.03  0.5128    0.0208
2024.03  0.5127    0.1199
2024.03  0.4931    0.2029
2024.03  0.4805    0.1043
2024.03  0.4658    0.2115
2024.03  0.4613    0.1856
2024.03  0.4479    0.1287
2024.03  0.4439    0.2375
2024.03  0.4411    0.0438
2024.03  0.4166   -0.0337
2024.03  0.4114    0.0858
2024.03  0.4108    0.0986
2024.03  0.4066    0.013
2024.03  0.404     0.182
2024.03  0.376    -0.0378
2024.03  0.3725    0.1092
2024.03  0.3536   -0.0339
2024.03  0.3352    0.0826
2024.03  0.3164   -0.0166
2024.03  0.3105   -0.1016
2024.03  0.3047   -0.0003
2024.03  0.3026   -0.0136
2024.03  0.3013   -0.0404
2024.03  0.2673   -0.0971
2024.03  0.265    -0.0483