
WildChat

Benchmarks

| Task Name | Dataset Name | Metric | SOTA Result | Trend |
|---|---|---|---|---|
| Next Token Prediction | WildChat | Next Token Accuracy | 51 | 32 |
| Safety | WildChat | Refusal Rate | 42.92 | 20 |
| Safety Evaluation | WildChat | Safe@1 | 97.5 | 18 |
| Quantization Detection | WildChat | Statistical Power AUC | 64.2 | 18 |
| Jail-breaking detection | WildChat | AUC (Statistical Power) | 0.895 | 18 |
| Fingerprint Detection | WildChat Fr | FSR | 1 | 18 |
| Proactive next utterance prediction | WildChat (test) | LLM-Judge | 52.16 | 17 |
| Safety Evaluation | WildChat (test) | WildChat Score | 69.85 | 13 |
| Model Routing | NB-WildChat | Uniqueness Score | 42.6 | 11 |
| Synthetic Text Generation | WildChat | Mean Embedding Similarity | 0.31 | 10 |
| Safety Evaluation | WildChat unsafe prompts | Not-Unsafe Rate | 99.82 | 9 |
| Next Token Prediction | WildChat | BERT-Small Next Token Accuracy (eps=inf) | 28.78 | 5 |
| Over-safety measurement | WildChat | User Score | 15.1 | 2 |
Showing 13 of 13 rows