| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Open-Ended Question Answering (with Context) | Earth Observation | Judge Score: 86.65 | 7 |
| Open-Ended Question Answering | Earth Observation | Judge Score: 97.05 | 7 |
| Hallucination Detection | Earth Observation | F1 Score: 90.94 | 7 |
| Multiple Choice Question Answering (Single) | Earth Observation | Accuracy: 96.35 | 7 |
| Multiple Choice Question Answering (Multiple) | Earth Observation | IoU: 87.56 | 7 |