| Dataset Name | SOTA Method | Metric | Value | Trend | |
|---|---|---|---|---|---|
| AirDialog v1 (test) | | Avg Cost per Query | 0 | 20 | 4d ago |
| Clevr v1 (test) | | Avg Cost per Query | 0 | 10 | 4d ago |
| ABCD v1 (test) | | Avg Cost per Query | 0 | 10 | 4d ago |
| AudioMnist | UQE | F1 Score | 92.2 | 9 | 4d ago |
| IMDB v1 (test) | | Avg Cost per Query | 0 | 5 | 4d ago |
| Airdialog (test) | lc-gpt-4-turbo | Runtime (s) | 23.06 | 2 | 4d ago |
| IMDB (test) | lc-gpt-4-turbo | Runtime (s) | 20.61 | 2 | 4d ago |
| ABCD (test) | lc-gpt-4-turbo | Runtime (s) | 10.1 | 2 | 4d ago |
| Clevr (test) | UQE-claude-3-haiku | Semantic Retrieval Latency (s) | 46 | 2 | 4d ago |