| Dataset Name | SOTA Method | Metric | Trend | Updated | |
|---|---|---|---|---|---|
| S-SQuAD (test) | ALBERT-base + DA + KD | EM 64.1 | 16 | 4d ago | |
| Spoken-SQuAD | contr-cos-all + asr + giga | EM 76.11 | 15 | 4d ago | |
| LibriSQA Part II | Transcripts + Llama 3.1 8B | Accuracy 74.9 | 11 | 4d ago | |
| LibriSQA (Part I) | Transcripts + EuroLLM 9B | Accuracy 87.6 | 11 | 4d ago | |
| Spoken SQuAD (test) | Transcripts + EuroLLM 9B | Accuracy 91.1 | 11 | 4d ago | |
| UltraEval-Audio S2S | | AlpacaEval Score 0.7338 | 9 | 4d ago | |
| Spoken SQuAD (B) ASR (test) | Ensembled [(e) plus (d)] | EM 60.37 | 9 | 4d ago | |
| Our Bench | | Accuracy 76.34 | 8 | 4d ago | |
| Spoken-MQA | | Accuracy 83.95 | 8 | 4d ago | |
| VERA | | Accuracy 18.54 | 8 | 4d ago | |
| Spoken SQuAD NoiseV2 (test) | BiDAF (WORD+CHAR+PHONEME+SYLLABLE) | EM 20.66 | 5 | 4d ago | |
| Spoken SQuAD NoiseV1 (test) | BiDAF (WORD+CHAR+PHONEME+SYLLABLE) | EM 29.73 | 5 | 4d ago | |
| Spoken SQuAD No noise (test) | BiDAF (WORD+CHAR+PHONEME+SYLLABLE) | EM 45.78 | 5 | 4d ago | |
| VoiceBench S2T | Longcat-Flash-Omni-Instruct | AlpacaEval 4.94 | 4 | 4d ago | |
| OpenAudioBench S2T | Fun-Audio-Chat-30B-A3B | AlpacaEval Score 88.89 | 4 | 4d ago | |
| Spoken SQuAD (A) Text (test) | BERT | EM 76.9 | 3 | 4d ago | |
| SQuAD lost (test) | SpeechBERT | F1 Score 37.31 | 2 | 4d ago | |