Toxicity Classification on RTP-LX French (test)
[Chart: Class 0 Precision. DeepSeek-Chat: 84.9 (Aug 15, 2025)]
Evaluation Results
| Method | Date | Class 0 Precision | Class 0 Recall | Class 0 F1 Score | Class 1 Precision | Class 1 Recall | Class 1 F1 Score | Accuracy |
|---|---|---|---|---|---|---|---|---|
| DeepSeek-Chat | 2025.08 | 84.9 | 94 | 89.3 | 93.3 | 83.3 | 88.1 | 88.7 |
| GPT-4o | 2025.08 | 83.6 | 89.3 | 86.3 | 88.5 | 82.4 | 85.4 | 85.9 |
| GPT-4o-Mini | 2025.08 | 83.1 | 93.5 | 88 | 92.5 | 81 | 86.3 | 87.2 |
| Mistral Large | 2025.08 | 79.6 | 92 | 85.4 | 90.5 | 76.5 | 82.9 | 84.2 |
| Our model | 2025.08 | 78.3 | 96.4 | 86.4 | 95.3 | 73.2 | 82.8 | 84.8 |
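The page does not say how the F1 columns were computed. Assuming they follow the standard definition (the harmonic mean of precision and recall), the minimal sketch below recomputes Class 0 F1 from the reported Class 0 precision and recall as a sanity check. The function name `f1_score` and the `rows` dictionary are illustrative only and are not taken from the benchmark's own tooling.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (same scale as the inputs)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (Class 0 precision, Class 0 recall, reported Class 0 F1), copied from the results above.
rows = {
    "DeepSeek-Chat": (84.9, 94.0, 89.3),
    "GPT-4o": (83.6, 89.3, 86.3),
    "GPT-4o-Mini": (83.1, 93.5, 88.0),
    "Mistral Large": (79.6, 92.0, 85.4),
    "Our model": (78.3, 96.4, 86.4),
}

for name, (p, r, reported) in rows.items():
    print(f"{name}: recomputed {f1_score(p, r):.1f}, reported {reported}")
```

Under that assumption the recomputed values agree with the reported ones to within about 0.1, which is what one would expect if the published precision and recall figures were rounded before display.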