
Tool-use agent security evaluation on SIREN

23.56 Explicit Directive (UA)

CaMeL

0.84646.743212.6418.5368 Jan 9, 2026
Updated 4d ago

Evaluation Results

Method Links
2026.01
23.5644.833.820.1117.8825.1410.6630.512011.6825.3424.87046.79
2026.01
17.2416.0952.171.0921.7924.0228.31014.673.3327.538.1340.570.3274.49
2026.01
17.2445.984.3516.313.9737.4313.9733.091.33010.7427.8426.55030.84
2026.01
15.5264.9411.9669.5715.0854.1912.8756.628.6748.6712.9358.9230.5616.6565.31
2026.01
14.3722.9925.541.6317.8837.9919.85012.672.6718.4611.9939.30.2140.82
2026.01
10.9249.4313.5959.7813.4155.8723.165.5114415.8533.0647.6310.2255.1
2026.01
10.3462.0710.8769.0215.6448.0411.432.729.335211.5750.8921.299.873.47
2026.01
8.6263.2221.7454.3512.2947.4915.443.319.33613.8732.6424.660.4243.88
2026.01
8.0562.072528.2610.0664.817.286.251013.3314.632.6459.7514.1276.53
2026.01
5.7558.6215.7640.767.2640.2214.719.934.6728.6710.3233.268.851.4834.69
2026.01
4.0240.235.431.095.5922.9111.413.67.3318.677.1918.566.532.3248.98
2026.01
3.4588.5126.0971.225.765.3612.1375.3713.3367.3315.9573.8339.5238.8879.59
2026.01
2.8749.433.260.542.7922.359.560.374.6743.335.1120.137.480.1145.92
2026.01
2.387.9352.7240.2229.0574.311.0358.8286020.3363.6143.9439.8377.55
2026.01
2.390.2316.358.1510.6166.4815.817.72432.6710.6447.2421.298.1143.88
2026.01
1.7287.9346.7437.524.5861.4518.380.742.671219.536.735.630.2171.43