
GPT-OSS

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
| --- | --- | --- | --- |
| Red-teaming | GPT-OSS 20B | Coverage: 63.2 | 5 |
| Language Modeling | GPT-OSS 20B held-out (test) | Perplexity: 34.56 | 5 |
| Retrieval-Augmented Generation | openai/gpt-oss-20b Long prompt | TTFT (s): 7.72 | 3 |
| Retrieval-Augmented Generation | openai/gpt-oss-20b Medium prompt | Time To First Byte (s): 2.45 | 3 |
| Retrieval-Augmented Generation | openai/gpt-oss-20b Short prompt | TTFT (s): 1.39 | 3 |
| Training Throughput | GPT-OSS-20B workload | Throughput (tokens/s): 140,900 | 2 |
| Language Modeling | Mini-GPT-OSS (val) | Validation Loss: 2.94 | 2 |