HalOmi

Benchmarks

Task Name                 Dataset Name                 SOTA Result     Trend
Omission Detection        HalOmi Zero-Shot 1.0         ROC AUC 0.84    7
Omission Detection        HalOmi Low-Resource 1.0      ROC AUC 0.76    7
Omission Detection        HalOmi High-Resource 1.0     ROC AUC 81      7
Hallucination Detection   HalOmi Zero-Shot 1.0         ROC AUC 0.66    7
Hallucination Detection   HalOmi Low-Resource 1.0      ROC AUC 0.79    7
Hallucination Detection   HalOmi High-Resource 1.0     ROC AUC 0.91    7