
Let the Abyss Stare Back: Adaptive Falsification for Autonomous Scientific Discovery

About

Autonomous scientific discovery is entering a more dangerous regime: once the evaluator is frozen, a sufficiently strong search process can learn to win the exam without learning the mechanism the task was meant to reveal. This is the idea behind our title. To let the abyss stare back is to make evaluation actively push against the candidate through adaptive falsification, rather than passively certify it through static validation. We introduce DASES, a falsification-driven framework in which an Innovator, an Abyss Falsifier, and a Mechanistic Causal Extractor co-evolve executable scientific artifacts and scientifically admissible counterexample environments under a fixed scientific contract. In a controlled loss-discovery problem with a single editable locus, DASES rejects artifacts that static validation would have accepted, identifies the first candidate that survives the admissible falsification frontier, and discovers FNG-CE, a loss that transfers beyond the synthetic discovery environment and consistently outperforms CE and CE+L2 under controlled comparisons across standard benchmarks, including ImageNet.
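The contrast between static validation and adaptive falsification can be illustrated with a toy loop. This is a minimal sketch under stated assumptions, not the DASES implementation: the hidden rule `target`, the frozen test list `STATIC_TESTS`, the admissible input set `ADMISSIBLE`, and the three hand-written candidates are all hypothetical stand-ins. The candidate list plays the role of Innovator proposals, `find_counterexample` plays the role of a falsifier restricted to admissible inputs, and the Mechanistic Causal Extractor is not modeled at all.

```python
from typing import Callable, Dict, List, Optional, Tuple

Candidate = Callable[[int], bool]

# Hypothetical hidden mechanism the task is meant to reveal (toy: divisibility by 3).
def target(x: int) -> bool:
    return x % 3 == 0

# Hypothetical frozen evaluator: a small, fixed set of test inputs.
STATIC_TESTS: List[int] = [0, 3, 4, 9, 10]

# Hypothetical contract: counterexamples may only be drawn from this set.
ADMISSIBLE = range(0, 100)

def passes_static(candidate: Candidate) -> bool:
    """Static validation: agree with the target on the frozen tests only."""
    return all(candidate(x) == target(x) for x in STATIC_TESTS)

def find_counterexample(candidate: Candidate, frontier: List[int]) -> Optional[int]:
    """Replay known counterexamples, then search every admissible input.

    Returns the first input where the candidate disagrees with the target,
    or None if the candidate survives.
    """
    for x in list(frontier) + list(ADMISSIBLE):
        if candidate(x) != target(x):
            return x
    return None

def discover(candidates: Dict[str, Candidate]) -> Tuple[Optional[str], List[int]]:
    """Return the first candidate that survives falsification, plus the frontier."""
    frontier: List[int] = []
    for name, candidate in candidates.items():
        if not passes_static(candidate):
            continue  # rejected even by the frozen evaluator
        counterexample = find_counterexample(candidate, frontier)
        if counterexample is None:
            return name, frontier
        if counterexample not in frontier:
            frontier.append(counterexample)  # the frontier grows with each rejection
    return None, frontier

# Hypothetical proposals: two of them "win the exam" without learning the rule.
CANDIDATES: Dict[str, Candidate] = {
    "memorizer": lambda x: x in {0, 3, 9},
    "shortcut": lambda x: x % 3 == 0 and x < 50,
    "mechanism": lambda x: x % 3 == 0,
}

survivor, frontier = discover(CANDIDATES)
print(survivor, frontier)
```

In this toy setting all three candidates pass the frozen tests, but only the one that encodes the actual rule survives the search over admissible inputs. The accumulated frontier is a crude analogue of counterexample environments evolving alongside the candidates.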

Peiran Li, Fangzhou Lin, Shuo Xing, Jiashuo Sun, Dylan Zhang, Siyuan Yang, Chaoqun Ni, Zhengzhong Tu • 2026

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
| --- | --- | --- | --- | --- |
| Image Classification | DTD | Accuracy | 19.1 | 542 |
| Image Classification | CIFAR10 | Top-1 Accuracy | 87.94 | 112 |
| Image Classification | CIFAR100 | Top-1 Accuracy | 60.04 | 45 |
| Image Classification | CUB-200-2011 (Birds) | Accuracy | 14.11 | 18 |
| Classification | VGG-Flowers | Accuracy | 69.17 | 12 |
| Image Classification | TrafficSigns | Accuracy | 94.94 | 6 |
