ID

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Document-level phenotype concept recognition | ID-68 | Precision: 94.11 | 12 |
| Open-ended Dialogue | ID Average | Win Rate: 72.2 | 4 |
| LLM response quality prediction | ID Claude 3.5 Haiku 20241022 (test) | RMSE: 0.45 | 3 |