GLTR: Statistical Detection and Visualization of Generated Text

About

The rapid improvement of language models has raised the specter of abuse of text generation systems. This progress motivates the development of simple methods for detecting generated text that can be used by and explained to non-experts. We develop GLTR, a tool to support humans in detecting whether a text was generated by a model. GLTR applies a suite of baseline statistical methods that can detect generation artifacts across common sampling schemes. In a human-subjects study, we show that the annotation scheme provided by GLTR improves the human detection rate of fake text from 54% to 72% without any prior training. GLTR is open-source and publicly deployed, and has already been widely used to detect generated outputs.

Sebastian Gehrmann, Hendrik Strobelt, Alexander M. Rush • 2019
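
The abstract mentions statistical methods that expose generation artifacts. One such statistic is the rank of each observed token among a language model's predictions for its context: text sampled from a model tends to stay among the highest-ranked candidates. The sketch below illustrates the idea only and is not GLTR's implementation. It assumes a toy bigram counter in place of a neural language model, and the helper names (`train_bigram`, `token_ranks`, `bucket`) and the bucket edges of 10, 100, and 1000 are assumptions made for this example.

```python
from collections import Counter, defaultdict


def train_bigram(tokens):
    """Count next-token frequencies per preceding token (toy stand-in for a language model)."""
    counts = defaultdict(Counter)
    for prev, nxt in zip(tokens, tokens[1:]):
        counts[prev][nxt] += 1
    return counts


def token_ranks(counts, tokens):
    """Return the rank of each observed token given its predecessor.

    Rank 1 means the token was the model's most likely continuation.
    None means the model never saw that continuation.
    """
    ranks = []
    for prev, nxt in zip(tokens, tokens[1:]):
        ordered = [tok for tok, _ in counts.get(prev, Counter()).most_common()]
        ranks.append(ordered.index(nxt) + 1 if nxt in ordered else None)
    return ranks


def bucket(rank, edges=(10, 100, 1000)):
    """Map a rank to a coarse bucket label (edges are assumed for this example)."""
    if rank is None:
        return "rest"
    for edge in edges:
        if rank <= edge:
            return f"top-{edge}"
    return "rest"


if __name__ == "__main__":
    corpus = "the cat sat on the mat the cat ran".split()
    model = train_bigram(corpus)
    text = "the cat sat on the dog".split()
    for token, rank in zip(text[1:], token_ranks(model, text)):
        print(token, rank, bucket(rank))
```

A passage whose tokens fall almost entirely in the lowest-rank bucket is more consistent with model sampling than one with many high-rank or unseen tokens. With a toy bigram model the signal is only illustrative.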

Related benchmarks

Task	Dataset	Result
Code Generation	HumanEval (test)	Pass@1 65.42	701
Code Generation	MBPP (test)	Pass@1 43.35	411
Machine-generated text detection	MGT benchmark Essay	--	129
Classification	IMDB	Accuracy 100	62
AI-generated text detection	READ (test)	Accuracy 84	55
Machine-generated text detection	TruthfulQA	TPR@FPR-1% (ChatGLM) 97.03	54
Machine-generated text detection	MGT benchmark Reuters	--	45
AI-generated text detection	M4	AUROC 87.5	41
Machine-generated text detection	Xsum	AUROC 75	40
Machine-generated text detection	Essay (test)	GPT4All Score 62.69	39

Showing 10 of 265 rows

...
