Toxicity Detection on RealTox Standard
Top result: 87.56 Accuracy, Trajectory (Raw)
[Figure: Accuracy over time; x-axis date (Mar 1, 2026), y-axis Accuracy]
Updated 1mo ago
Evaluation Results
| Method | Backbone | Date | Accuracy |
|---|---|---|---|
| Trajectory (Raw) | Qwen2.5-14B | 2026.03 | 87.56 |
| Trajectory (Raw) | Qwen3-32B | 2026.03 | 87.07 |
| Trajectory (Raw) | Qwen3-30B MoE | 2026.03 | 86.57 |
| TaT (Disp.) | Qwen2.5-14B | 2026.03 | 85.16 |
| Trajectory (Raw) | Llama3.1-8b | 2026.03 | 82.16 |
| TaT (Disp.) | Qwen3-30B MoE | 2026.03 | 79.43 |
| TaT (Disp.) | Llama3.1-8b | 2026.03 | 79.35 |
| Linear Probe | Qwen3-30B MoE | 2026.03 | 77.87 |
| Linear Probe | Llama3.1-8b | 2026.03 | 77.86 |
| TaT (Disp.) | Qwen3-32B | 2026.03 | 77.7 |
| Linear Probe | Qwen2.5-14B | 2026.03 | 76.46 |
| Linear Probe | Qwen3-32B | 2026.03 | 75.15 |