Tool-Using Agent Safety on MT-AgentRisk 1.0 (test)

Filesystem ASR: 27.14 (GPT-5.2)

Feb 13, 2026
Updated 4d ago

Evaluation Results

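Method | Links
2026.02 | 27.14 | 71.43 | 22.14 | 67.14 | 25.71 | 60 | 22.86 | 71.43 | 60 | 40 | 25.48 | 66.3 | - | -
2026.02 | 37.14 | 55.74 | 57.86 | 27.14 | 28.57 | 58.57 | 48.57 | 45.71 | 20 | 73.3 | 44.66 | 43.56 | - | -
2026.02 | 51 | 40 | 34.29 | 50.71 | 37.14 | 54.29 | 31.43 | 67.14 | 86.67 | 0 | 39.73 | 50.41 | 55.9 | -24
2026.02 | 55.71 | 32.86 | 52.14 | 35 | 37.14 | 47.14 | 80 | 17.14 | 46.67 | 46.67 | 55.07 | 33.97 | - | -
2026.02 | 70 | 18.57 | 64.29 | 15 | 62.86 | 25.71 | 58.57 | 32.86 | 46.67 | 33.33 | 63.29 | 21.92 | - | -
2026.02 | 71.43 | 24.29 | 65.71 | 15.71 | 48.57 | 41.43 | 78.57 | 14.29 | 53.33 | 33.33 | 65.48 | 22.74 | - | -
2026.02 | 78.57 | 5.71 | 70.71 | 7.14 | 70 | 18.57 | 78.57 | 11.43 | 60 | 0 | 73.15 | 9.59 | 15.6 | -56.3
2026.02 | 80 | 4.29 | 80 | 1.43 | 50 | 27.14 | 91.43 | 2.86 | 73.33 | 0 | 76.16 | 7.12 | - | -
2026.02 | 82.86 | 7.14 | 65.71 | 12.14 | 67.14 | 30 | 87.14 | 5.71 | 26.67 | 60 | 71.78 | 15.43 | 60.7 | -64.6
2026.02 | 82.86 | 7.14 | 72.86 | 5.71 | 85.71 | 4.29 | 80 | 8.57 | 80 | 0 | 78.9 | 6.03 | 20.5 | -73.5
2026.02 | 88.57 | 1.43 | 67.86 | 13.57 | 80 | 10 | 90 | 1.43 | 60 | 0 | 78.08 | 7.67 | 41.8 | -77.4
2026.02 | 88.57 | 0 | 80 | 0 | 85.71 | 4.29 | 92.86 | 1.43 | 73.33 | 0 | 84.93 | 1.1 | 11.5 | -84.5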