
LAB-Bench

Benchmarks

Task Name | Dataset Name | SOTA Result | Trend
Laboratory Science Knowledge Evaluation | LAB-Bench 2 821 (Evaluation) | Accuracy: 82.3 | 25
Biomedical Multimodal Reasoning | LAB-Bench | Cloning Score: 38.4 | 7