Share your thoughts, 1 month free Claude Pro on us
See more
Home
/
Benchmarks
Autonomous replication on target-1 (citrusdrop) API model runs
Loading...
85
Non-refusal Count
GPT-5.1
34.04
47.27
60.5
73.73
May 7, 2026
Non-refusal Count
Total Runs
Responding Replica Rate
Updated 23d ago
Evaluation Results
Method
Method
Links
Non-refusal Count
Total Runs
Responding Replica Rate
GPT-5.1
Released=Nov 2025
2026.05
85
127
0
GPT-5
Released=Aug 2025
2026.05
47
50
0
Claude Opus 4.5
Released=Nov 2025
2026.05
37
43
16
Claude Opus 4
Released=May 2025
2026.05
36
36
6
Claude Opus 4.6
Released=Feb 2026
2026.05
36
40
81
GPT-5.4
Released=Mar 2026
2026.05
36
82
33
Feedback
Search any
task
Search any
task