
LLM

Benchmarks

Task Name | Dataset Name | SOTA Result | Trend
Binary Inconsistency Detection | LLM | Accuracy: 70.27 | 10
Robust Steganography | LLM Generative Text | Embedding Capacity (bits / 1k tokens): 84.08 | 5
Span Detection | LLM | F1 Score: 0.3322 | 5
Language Modeling | LLM (val) | Loss: 1.3364 | 4
Language | LLM-329M | Peak Performance (FP4/FP8): 205 | 1