SOTA Overall Language Model Evaluation benchmarks and papers with code | Wizwand

Overall Language Model Evaluation

Benchmarks

Dataset Name	SOTA Method	Metric	Trend
Aggregated Benchmarks STEM Code IF General	GenRM-R-Align-14B	Average Score: 61.7		7	3mo ago
