Risk Assessment on Risk Assessment 500-sample
[Chart: scores by category (Data Exfiltration, Destructive Behaviors, Privilege Escalation, Covert Execution, Resource Abuse, Credential Access, Macro F1). Highlighted point: MD + SSL, Data Exfiltration, 85.8, Apr 27, 2026.]
Evaluation Results
| Method | Input View | Date | Data Exfiltration | Destructive Behaviors | Privilege Escalation | Covert Execution | Resource Abuse | Credential Access | Macro F1 |
|---|---|---|---|---|---|---|---|---|---|
| MD + SSL | MD + SSL | 2026.04 | 85.8 | 85 | 64.2 | 73 | 78.8 | 85.2 | 78.7 |
| Full SSL | Full SSL | 2026.04 | 85.5 | 85.1 | 66.7 | 70 | 75.6 | 82.2 | 77.5 |
| SSL-Shallow | SSL-Sh. | 2026.04 | 81.5 | 71.4 | 56.7 | 68.2 | 75 | 69.3 | 70.4 |
| Full MD | Full MD | 2026.04 | 81.1 | 68 | 68.5 | 71.1 | 80.5 | 76.8 | 74.4 |
| Desc | Desc | 2026.04 | 79.3 | 64.5 | 49.1 | 66.5 | 75.2 | 67.2 | 66.9 |
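
The Macro F1 column can be sanity-checked against the six per-category scores. The sketch below assumes that Macro F1 here is the unweighted mean of the per-category F1 scores (the usual definition; the page itself does not state it). The dictionary layout and the `macro_f1` helper are illustrative names, not part of the benchmark. Because the listed per-category scores are already rounded to one decimal, the recomputed values can differ from the reported ones by up to about 0.1.

```python
# Per-category scores copied from the results table, in the order:
# Data Exfiltration, Destructive Behaviors, Privilege Escalation,
# Covert Execution, Resource Abuse, Credential Access.
# The second tuple element is the Macro F1 value reported on the page.
RESULTS = {
    "MD + SSL":    ([85.8, 85.0, 64.2, 73.0, 78.8, 85.2], 78.7),
    "Full SSL":    ([85.5, 85.1, 66.7, 70.0, 75.6, 82.2], 77.5),
    "SSL-Shallow": ([81.5, 71.4, 56.7, 68.2, 75.0, 69.3], 70.4),
    "Full MD":     ([81.1, 68.0, 68.5, 71.1, 80.5, 76.8], 74.4),
    "Desc":        ([79.3, 64.5, 49.1, 66.5, 75.2, 67.2], 66.9),
}


def macro_f1(per_category_scores):
    """Unweighted mean of per-category F1 scores (assumed definition)."""
    return sum(per_category_scores) / len(per_category_scores)


def check_all(results, tolerance=0.1):
    """Return {method: (recomputed, reported, within_tolerance)}."""
    report = {}
    for method, (scores, reported) in results.items():
        recomputed = macro_f1(scores)
        report[method] = (recomputed, reported, abs(recomputed - reported) <= tolerance)
    return report


if __name__ == "__main__":
    for method, (recomputed, reported, ok) in check_all(RESULTS).items():
        status = "ok" if ok else "mismatch"
        print(f"{method:12s} recomputed={recomputed:.2f} reported={reported:.1f} {status}")
```

Under this assumption every row agrees with its reported Macro F1 to within 0.1, which is consistent with the averages having been taken over unrounded per-category scores.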