
Agent Safety Evaluation on Agent-SafetyBench (aggregated: clean and five attack types)

26.31 UBR

SAFEHARNESS

[Chart: axis ticks 24.8832, 34.5141, 44.145, 53.7759; date label Apr 15, 2026]
Updated 3d ago

Evaluation Results

Method  Links

2026.04  26.31  20.2  77.92  59.4  3.8   709
2026.04  28.74  26.4  94.25  67.5  3.48  1,152
2026.04  31.88  29.2  91.75  62.3  3.14  0
2026.04  33.76  28.7  83.67  54.4  2.96  678
2026.04  34.59  32.3  95.42  62.9  2.89  1,137
2026.04  38.14  36.9  94.83  57.5  2.62  1,181
2026.04  39.18  38    95.08  56.5  2.55  1,477
2026.04  39.25  35.5  91.5   55.2  2.55  0
2026.04  39.73  33.9  86.83  54    2.52  148
2026.04  40.67  38.7  94.67  55.7  2.46  0
2026.04  40.79  34.8  86.83  52.6  2.45  82
2026.04  43.08  42.6  96.92  54.3  2.32  0
2026.04  44.8   41.8  90.58  48.5  2.23  54
2026.04  44.99  39.4  86.5   48.1  2.22  0
2026.04  45.99  42.8  91.33  48.6  2.17  96
2026.04  46.83  44.8  94.67  49.5  2.14  168
2026.04  47.19  45    95     49.9  2.12  88
2026.04  48.22  45.2  91.25  46    2.07  0
2026.04  49.52  48.4  96.42  47.7  2.02  0
2026.04  51.18  49.2  95.25  45.6  1.95  0
2026.04  51.26  49.5  95.75  46.4  1.95  184
2026.04  52.81  51.3  96.42  44.7  1.89  67
2026.04  54.2   53    95.33  41.9  1.85  0
2026.04  56.51  54.3  95.42  40.6  1.77  182
2026.04  56.54  55.7  96.25  40.5  1.77  75
2026.04  56.72  55.3  94.92  39.3  1.76  215
2026.04  57.32  56.2  95.67  39.1  1.74  0
2026.04  59.26  56.9  94.92  37.4  1.69  99
2026.04  61.83  61.6  95.42  33.6  1.62  0
2026.04  61.98  60.9  95.33  33.9  1.61  0