Dataset Diversity and Coverage Evaluation on AgentHazard 3-app
[Chart: Goal-text Entropy for AgentHazard (Liu et al., 2025) (3-app): 0.976 on May 27, 2026. Metrics available: Goal-text Entropy, CLIP Pairwise Distance (per-base), Action Coverage.]
Updated 6d ago
Evaluation Results
Method: AgentHazard (Liu et al., 2025) (3-app), n=456, 2026.05
Goal-text Entropy: 0.976
CLIP Pairwise Distance (per-base): 0.552
Action Coverage: 3