| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Safety Evaluation | WildChat | Safe@1 97.5 | 34 |
| Next Token Prediction | WildChat | Next Token Accuracy 51 | 32 |
| Safety classification disagreement | WildChat Content | Disagreement Rate (per 1k Conversations) 0.4 | 30 |
| Safety classification disagreement | WildChat Intent | Disagreement Rate (per 1k conv) 0.4 | 30 |
| Writing | WildChat 5,000 conversations | KLfwd 0.392 | 24 |
| Coding | WildChat 5,000 conversations | KL Divergence (Forward) 0.26 | 24 |
| Safety | WildChat | Refusal Rate 42.92 | 20 |
| Indirect prompt injection defense | WildChat | Attack Success Rate (ASR) 0.33 | 18 |
| Quantization Detection | WildChat | Statistical Power AUC 64.2 | 18 |
| Jail-breaking detection | WildChat | AUC (Statistical Power) 0.895 | 18 |
| Fingerprint Detection | WildChat Fr | FSR 1 | 18 |
| Proactive next utterance prediction | WildChat (test) | LLM-Judge 52.16 | 17 |
| Safety Evaluation | WildChat (test) | WildChat Score 69.85 | 13 |
| LLM Inference | WildChat | RPS (Requests/s) 75.17 | 11 |
| Model Routing | NB-WildChat | Uniqueness Score 42.6 | 11 |
| Synthetic Text Generation | WildChat | Mean Embedding Similarity 0.31 | 10 |
| Safety Evaluation | WildChat unsafe prompts | Not-Unsafe Rate 99.82 | 9 |
| User Simulation | WildChat (In-Distribution) | Turn Count 8.62 | 8 |
| Safety Alignment | WildChat | DSR 81.2 | 6 |
| LLM alignment | WildChat (train) | Loss 1.1951 | 6 |
| Next Token Prediction | WildChat | BERT-Small Next Token Accuracy (eps=inf) 28.78 | 5 |
| Dialogue | WildChat | Lexical Coverage 47.3 | 4 |
| KV cache reuse efficiency | WildChat | Match Rate 50.2 | 4 |
| Secret loyalty evaluation | WildChat final training-evaluation step (held-out) | Activation Rate 77 | 4 |
| Disallowed Content Evaluation | WildChat | Not Unsafe Rate 98 | 4 |