Trustworthiness Evaluation on Trustworthiness Average (human evaluation)

Control Win Rate: 0.88

Sparse Activation Control

[Chart: Control Win Rate by date; data point at Nov 4, 2024]
Updated 4d ago

Evaluation Results

Method    Links
0.88      0.12      0
0.848     0.092     0.06