Let the Abyss Stare Back: Adaptive Falsification for Autonomous Scientific Discovery

About

Autonomous scientific discovery is entering a more dangerous regime: once the evaluator is frozen, a sufficiently strong search process can learn to win the exam without learning the mechanism the task was meant to reveal. This is the idea behind our title. To let the abyss stare back is to make evaluation actively push against the candidate through adaptive falsification, rather than passively certify it through static validation. We introduce DASES, a falsification-driven framework in which an Innovator, an Abyss Falsifier, and a Mechanistic Causal Extractor co-evolve executable scientific artifacts and scientifically admissible counterexample environments under a fixed scientific contract. In a controlled loss-discovery problem with a single editable locus, DASES rejects artifacts that static validation would have accepted, identifies the first candidate that survives the admissible falsification frontier, and discovers FNG-CE, a loss that transfers beyond the synthetic discovery environment and consistently outperforms CE and CE+L2 under controlled comparisons across standard benchmarks, including ImageNet.
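The contrast between static validation and adaptive falsification can be illustrated with a toy sketch. This is not the DASES implementation: the task (recovering a squaring function), the frozen test suite, the admissible input range standing in for the scientific contract, and every name below are hypothetical choices made only for illustration.

```python
from typing import Callable, Iterable, Optional

Artifact = Callable[[int], int]

def reference(x: int) -> int:
    """Hypothetical ground-truth mechanism for the toy task: squaring."""
    return x * x

# Frozen evaluator: a fixed set of inputs that never changes.
STATIC_SUITE = [0, 1, 2, 3]

# Stand-in for the contract: the only environments the falsifier may use.
ADMISSIBLE = range(-10, 11)

def static_validate(artifact: Artifact) -> bool:
    """Passive check: accept if the artifact matches on the frozen suite."""
    return all(artifact(x) == reference(x) for x in STATIC_SUITE)

def falsify(artifact: Artifact) -> Optional[int]:
    """Active check: search the admissible inputs for the one that hurts
    this particular artifact most. Returns it, or None if nothing fails."""
    worst = max(ADMISSIBLE, key=lambda x: abs(artifact(x) - reference(x)))
    return worst if artifact(worst) != reference(worst) else None

def memorizer(x: int) -> int:
    """Wins the exam without learning the mechanism: a lookup table
    covering exactly the frozen suite."""
    return {0: 0, 1: 1, 2: 4, 3: 9}.get(x, 0)

def square(x: int) -> int:
    """Implements the mechanism itself."""
    return x * x

def first_survivor(candidates: Iterable[Artifact]) -> Optional[Artifact]:
    """Return the first candidate that passes static validation and for
    which the falsifier finds no admissible counterexample."""
    for candidate in candidates:
        if static_validate(candidate) and falsify(candidate) is None:
            return candidate
    return None

if __name__ == "__main__":
    for artifact in (memorizer, square):
        print(artifact.__name__,
              "static:", static_validate(artifact),
              "counterexample:", falsify(artifact))
    print("survivor:", first_survivor([memorizer, square]).__name__)
```

In this sketch both candidates pass the frozen suite, but only `square` survives once the check is allowed to search the admissible inputs for a counterexample tailored to the candidate.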

Peiran Li, Fangzhou Lin, Shuo Xing, Jiashuo Sun, Dylan Zhang, Siyuan Yang, Chaoqun Ni, Zhengzhong Tu • 2026

Related benchmarks

Task	Dataset	Metric	Result	(unlabeled)
Image Classification	DTD	Accuracy	19.1	610
Image Classification	CIFAR10	Top-1 Accuracy	87.94	114
Image Classification	CIFAR100	Top-1 Accuracy	60.04	45
Image Classification	CUB-200-2011 (Birds)	Accuracy	14.11	18
Classification	VGG-Flowers	Accuracy	69.17	12
Image Classification	TrafficSigns	Accuracy	94.94	6
