Release Strategies and the Social Impacts of Language Models

About

Large language models have a range of beneficial uses: they can assist in prose, poetry, and programming; analyze dataset biases; and more. However, their flexibility and generative capabilities also raise misuse concerns. This report covers OpenAI's work related to the release of its GPT-2 language model. It discusses staged release, which allows time between model releases to conduct risk and benefit analyses as model sizes increase. It also describes ongoing partnership-based research and provides recommendations for better coordination and responsible publication in AI.

Irene Solaiman, Miles Brundage, Jack Clark, Amanda Askell, Ariel Herbert-Voss, Jeff Wu, Alec Radford, Gretchen Krueger, Jong Wook Kim, Sarah Kreps, Miles McCain, Alex Newhouse, Jason Blazakis, Kris McGuffie, Jasmine Wang • 2019

Related benchmarks

Task	Dataset	Result
Machine-generated text detection	MGT benchmark Essay	AUROC 98.3	129
LGT Detection	Fast-DetectGPT PubMed (test)	AUROC 0.884	96
LGT Detection	Fast-DetectGPT XSum (test)	AUROC 97.9	96
AI-generated text detection	READ (test)	Accuracy 81.8	55
LGT Detection	XSum Fast-DetectGPT benchmark	AUROC 97.9	54
LGT Detection	WritingPrompts-small Fast-DetectGPT benchmark	AUROC 97.6	54
LGT Detection	WritingPrompts small Fast-DetectGPT benchmark (test)	AUROC 97.6	54
LGT Detection	PubMed Fast-DetectGPT benchmark	AUROC 0.878	54
Machine-generated text detection	TruthfulQA	TPR@FPR-1% (ChatGLM) 97.08	54
LGT Detection	MGTBench WritingPrompts	AUROC 97.3	45

Showing 10 of 220 rows

...
