| Dataset Name | SOTA Method | Metric | Trend | | |
|---|---|---|---|---|---|
| Evaluation Dataset (Unseen Average) | Mistral-7B | Score 42.86 | 18 | 4d ago | |
| Evaluation Dataset Seen Average | Mistral-7B | Score 62.34 | 18 | 4d ago | |
| Evaluation Dataset Unseen (Fold 3) | Qwen2.5-7B | Score 0.4022 | 18 | 4d ago | |
| Evaluation Dataset (Fold 3 Seen) | LLaMA-3-8B + COGLM | Score 66.69 | 18 | 4d ago | |
| Evaluation Dataset Unseen (Fold 2) | Mistral-7B | Score 50 | 18 | 4d ago | |
| Evaluation Dataset (Fold 2 Seen) | Gemma-7B + COGLM | Score 63.63 | 18 | 4d ago | |
| Evaluation Dataset Unseen (Fold 1) | DeepSeek V3 | Score 0.4818 | 18 | 4d ago | |
| Evaluation Dataset (Fold 1 Seen) | Mistral-7B | Score 0.6191 | 18 | 4d ago | |
| Evaluation Dataset (Full) | Gemma-7B + COGLM | Score 0.6379 | 18 | 4d ago | |