ML Research Engineering on PaperBench

Adaptive Pruning Score: 33.26

AiScientist

1.84169.998318.15526.3117
Apr 14, 2026
Updated 4d ago

Evaluation Results

Method  Links
2026.04
33.2649.4761.1130.0226.4628.988.3430.128.5340.5558.7610.1828.6644.1331.6949.3214.8142.127.3330.8233.7312.2
2026.04
30.8233.7851.4523.559.821.63.7120.6822.5532.4922.186.5622.4336.936.9927.23.6932.1817.8125.1422.584.9
2026.04
27.2546.2956.5933.7923.0935.2110.1127.930.2329.9549.9210.8718.2836.7719.8564.9420.1318.8132.4517.8730.5215.67
2026.04
24.5320.8648.4615.4312.5921.675.8717.7512.9714.8626.6310.4315.4525.3911.4553.518.3717.0415.276.6119.266.25
2026.04
11.9344.4347.9119.2812.516.676.722.7426.1534.9625.770.2727.5641.264.9535.538.8128.0621.1910.7522.3754.9
2026.04
3.0545.1345.048.312.4423.894.1515.2618.321.8930.818.8810.4333.3412.6560.2413.6917.3718.138.9920.627.44