| Dataset Name | SOTA Method | Metric | Value | Trend | Updated |
|---|---|---|---|---|---|
| SHINES | Llama-3.1-8B-Instruct | Relevance | 89 | 18 | 4d ago |
| SST-2, CoLA, and HateXplain (test) | RawAt | Throughput (ex/s) | 19.23 | 13 | 4d ago |
| GSM8K (train) | TRICE without CV | Rationale Validity Rate | 98.9 | 4 | 4d ago |
| ScienceQA (test) | MMCOT | B-1 | 97 | 3 | 4d ago |
| CounselingWAI Bond dimension 1.0 (test) | CARE | BLEU | 0.28 | 1 | 4d ago |
| CounselingWAI Task dimension 1.0 (test) | CARE | BLEU | 0.23 | 1 | 4d ago |
| CounselingWAI Goal dimension 1.0 (test) | CARE | BLEU | 22 | 1 | 4d ago |
| Simulation dataset audio | GoT | Mean Subjective Rating | 6.3 | 1 | 4d ago |
| Moshi full-duplex | GoT | Mean Rating | 6.85 | 1 | 4d ago |
| GPT-4 full-duplex audio | GoT | Mean Rating | 7.07 | 1 | 4d ago |