
Open PL LLM Leaderboard

Benchmarks

Task Name | Dataset Name | SOTA Result | Trend
Large Language Model Evaluation | Open PL LLM Leaderboard instruction-tuned | Overall Average Score: 69.84 | 44
Linguistic Implicatures Decoding | Open PL LLM Leaderboard Implicatures component base models | Average Score: 67.38 | 30