
Synthetic Tasks on LongBench Synthetic Tasks

Accuracy: 67.78

[Chart: baseline; y-axis ticks 44.1096, 50.2548, 56.4, 62.5452; x-axis: Mar 25, 2025]
Updated 15d ago

Evaluation Results

Method    Links
2025.03    67.78
2025.03    67.74
2025.03    67.68
2025.03    52.51
2025.03    45.02