
MISBENCH

Benchmarks

Task Name | Dataset Name | SOTA Result | Trend
Misinformation Detection | MISBENCH (Multi-hop based Misinformation) 1.0 (test) | Factual Memory Success Rate: 96.88 | 12
Misinformation Detection | MISBENCH One-hop based Misinformation 1.0 (test) | Factual Memory Success Rate: 91.44 | 12