DeceptArena

Benchmarks

Task Name | Dataset Name | SOTA Result | Trend
Deception Detection | DeceptArena (test) | False Assertion Score: 0.927 | 4