Materials Science Scientific Reasoning on MatSciBench text-only

Overall Score: 0.829

S1-NexusAgent

Feb 2, 2026
Updated 1mo ago

Evaluation Results

Method    Links
2026.02   0.829    0.817    0.819    0.832    0.828    0.824    0.843
2026.02   0.796    0.818    0.765    0.796    0.783    0.788    0.809
2026.02   0.7737   0.7849   0.7689   0.7715   0.7576   0.745    0.7869
2026.02   0.756    0.766    0.725    0.756    0.714    0.752    0.763
2026.02   0.726    0.728    0.691    0.721    0.712    0.738    0.735
2026.02   0.7151   0.7208   0.6618   0.7075   0.6465   0.7089   0.7235
2026.02   0.7073   0.6566   0.6891   0.7042   0.6162   0.6758   0.718
2026.02   0.68     0.6679   0.6387   0.6711   0.6465   0.6542   0.6951
2026.02   0.6732   0.6566   0.6597   0.6589   0.6364   0.6484   0.6874
2026.02   0.6615   0.6264   0.6197   0.6567   0.6364   0.6326   0.6689
2026.02   0.6439   0.6717   0.605    0.6291   0.6364   0.6326   0.6546
2026.02   0.6244   0.6151   0.5903   0.6269   0.6263   0.5793   0.6415