Helpfulness Evaluation on Figstep-audio Harmful-Safe
[Chart: BRR by date; leading entry SARSteer at 88.8 (Oct 20, 2025)]
Updated 26d ago
Evaluation Results
Method        Setting                    Date      BRR
SARSteer      Model=Kimi-Audio           2025.10   88.8
SARSteer      Model=Qwen2-Audio          2025.10   79.95
SARSteer      Defense runtime (s)=266    2025.10   79.95
MDSteer-c2r   Model=Kimi-Audio           2025.10   79.68
No Defense    Model=Qwen2-Audio          2025.10   70.2
No Defense    (unspecified)              2025.10   70.2
RRS           Defense runtime (s)=1145   2025.10   70
AdaShield     Model=Qwen2-Audio          2025.10   69.8
MDSteer-h2s   Model=Kimi-Audio           2025.10   68.8
FSD           Model=Qwen2-Audio          2025.10   63.2
No Defense    Model=Kimi-Audio           2025.10   61.4
FSD           Model=Kimi-Audio           2025.10   61.2
MDSteer-h2s   Model=Qwen2-Audio          2025.10   60.8
MDSteer-c2r   Model=Qwen2-Audio          2025.10   54.2
AdaShield     Model=Kimi-Audio           2025.10   52.6