Safety and Helpfulness Evaluation on Phone-Harm Harm-150

0.4 HR

GPT-5

-0.55885.913112.38518.8569 Apr 10, 2026
Updated 6d ago

Evaluation Results

Method Links
2026.04
0.437.54.260.340.8
2026.04
0.8443.3310.530.587.69
2026.04
3.5844.4439.566.421.58
2026.04
4.345.4542.557.513.85
2026.04
4.346.4644.217.973.85
2026.04
4.3779.885.295.3419.23
2026.04
5.7354.5557.1411.827.69
2026.04
24.3747.4755.9322.5946.15