Detection on Algospeak New word 1.0 (test)
Best result: GPT-4o-m, Adjusted R2 = 0.99
[Chart: Adjusted R2 by date (data point at May 7, 2026). Available metrics: Adjusted R2, Spearman Rank Correlation Significance, Majority Fit Estimation, Significance Count.]
Updated 26d ago
Evaluation Results
| Method | Model Full Name | Date | Adjusted R2 | Spearman Rank Correlation Significance | Majority Fit Estimation | Significance Count |
|---|---|---|---|---|---|---|
| GPT-4o-m | gpt-4o... | 2026.05 | 0.99 | - | - | - |
| GPT-4o | gpt-4o | 2026.05 | 0.99 | - | - | - |
| Qwen | Qwen3-... | 2026.05 | 0.96 | - | - | - |
| Grok | grok-4... | 2026.05 | 0.93 | - | - | - |
| Claude | claude... | 2026.05 | 0.9 | - | - | - |
| Llama | llama-... | 2026.05 | 0.9 | - | - | - |
| Mistral | Minist... | 2026.05 | 0.77 | - | - | - |
| Significance Count | | 2026.05 | - | - | - | 7 |