High-level Planning on ReasonMap L (long questions)

0.0747 Weighted Accuracy

Ariadne

Updated 3mo ago

Evaluation Results

Method	Links	Weighted Accuracy
Ariadne 2025.11		0.0747	121	5.15
Base VLM 2025.11		0.06	61	4.51
SFT VLM 2025.11		0.041	50	3.71