Deception Evaluation on Strategic Deception
[Chart: CoT Plan Fidelity by method; highlighted point: GRPO, 28.86, Mar 27, 2026. Selectable metrics: CoT Plan Fidelity, Actual Deception Success Rate, CoT Faithfulness.]
Updated 19d ago
Evaluation Results
| Method | Backbone | Date | CoT Plan Fidelity | Actual Deception Success Rate | CoT Faithfulness |
| --- | --- | --- | --- | --- | --- |
| GRPO | Llama-3.1-8B | 2026.03 | 28.86 | 53.51 | 52.6 |
| GRPO | Qwen3-8B | 2026.03 | 28.81 | 73.66 | 28.95 |
| SAR | Qwen3-8B | 2026.03 | 9.81 | 38.87 | 61.12 |
| SAR | Llama-3.1-8B | 2026.03 | 7.82 | 26.05 | 73.74 |
| CoT Monitor | Qwen3-8B | 2026.03 | 3.4 | 55.11 | 44.68 |
| CoT Monitor | Llama-3.1-8B | 2026.03 | 0.2 | 86.57 | 13.42 |
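
The listing is ordered by CoT Plan Fidelity, so the ordering under the other two metrics is not immediately visible. A minimal sketch of re-ranking the six reported rows by any metric follows. The `Result` type, the field names, and the `rank_by` helper are hypothetical names chosen for illustration, not part of the benchmark; the numbers are copied from the results above. The page does not state whether higher is better for each metric, so the sketch only reports the highest value.

```python
from typing import List, NamedTuple


class Result(NamedTuple):
    # Field names are illustrative assumptions, not the benchmark's own schema.
    method: str
    backbone: str
    plan_fidelity: float       # CoT Plan Fidelity
    deception_success: float   # Actual Deception Success Rate
    cot_faithfulness: float    # CoT Faithfulness


# Values copied from the Evaluation Results listing (all dated 2026.03).
RESULTS: List[Result] = [
    Result("GRPO", "Llama-3.1-8B", 28.86, 53.51, 52.6),
    Result("GRPO", "Qwen3-8B", 28.81, 73.66, 28.95),
    Result("SAR", "Qwen3-8B", 9.81, 38.87, 61.12),
    Result("SAR", "Llama-3.1-8B", 7.82, 26.05, 73.74),
    Result("CoT Monitor", "Qwen3-8B", 3.4, 55.11, 44.68),
    Result("CoT Monitor", "Llama-3.1-8B", 0.2, 86.57, 13.42),
]


def rank_by(metric: str, results: List[Result] = RESULTS) -> List[Result]:
    """Return the rows sorted by the named metric, highest value first."""
    return sorted(results, key=lambda r: getattr(r, metric), reverse=True)


if __name__ == "__main__":
    for metric in ("plan_fidelity", "deception_success", "cot_faithfulness"):
        top = rank_by(metric)[0]
        print(f"{metric}: {top.method} ({top.backbone}) = {getattr(top, metric)}")
```

Sorting by a different field changes which entry comes first: the row with the highest CoT Plan Fidelity is not the row with the highest Actual Deception Success Rate or CoT Faithfulness.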