Deception Detection on DeceptionBench
[Chart: Response AUROC and Thought AUROC over time. Highlighted point: DECOR, 93.5 Response AUROC, May 19, 2026.]
Updated 14d ago
Evaluation Results
Method                   Details                     Date     Response AUROC  Thought AUROC
DECOR                    Backbone=GPT-4o             2026.05  93.5            92
CoT Red-Handed           Backbone=GPT-4o, Venue...   2026.05  93.1            85.2
Constitutional Monitor   Backbone=GPT-4o, Venue...   2026.05  92.4            81
DeceptionBench           Backbone=GPT-4o, Venue...   2026.05  90.6            87.5
Prompt-based Zero-shot   Backbone=GPT-4o             2026.05  89.9            85.4
Prompt-based Few-shot    Backbone=GPT-4o             2026.05  89.5            87