coding-reasoning on Tonsil (test)

58.55 Success Rate

CodeCytos + Few Shot

May 30, 2026
Updated 1d ago

Evaluation Results

Date     Success Rate
2026.05  58.55         81.6   87.12  88.53  82
2026.05  52.25         78.3   83.75  84.75  78
2026.05  47.65         76.88  82.52  83.17  76
2026.05  44.44         77.15  82.89  83.15  76
2026.05  42.62         76.26  82.71  83.23  75
2026.05  29.06         68.72  80.87  78.44  69
2026.05  24.68         61.44  72.92  74.72  62
2026.05  7.17          23.22  33.13  35.96  25
2026.05  5.62          20.51  32.12  35.82  23
2026.05  4.65          18.33  29.04  32.01  21
2026.05  2.47          9.06   15.29  17.51  11
2026.05  2.05          7.54   13.24  15.34  9
2026.05  1.7           6.22   10.06  11.46  7
2026.05  1.65          6.19   10.12  11.45  7
2026.05  1.53          7.4    14.22  16.77  9
2026.05  1.34          3.39   4.98   9.16   6
2026.05  1.21          3.28   5.08   10.69  6
2026.05  0.89          4.45   8.91   10.7   6