Deterministic Event-Graph Substrates as World Models for Counterfactual Reasoning

About

We study event-graph substrates: a class of world models that represent agent state as an append-only log of typed RDF triples and answer counterfactual queries by forking the log under a structured intervention vocabulary. Substrates are inspectable at the triple level, support exact counterfactuals, and transfer across domains without learned components. We formalize the class, prove a duality between explanatory and counterfactual queries that reduces both to the same causal-ancestor traversal, and evaluate a 1,400-line CLEVRER-DSL interpreter atop a domain-agnostic substrate runtime at full CLEVRER validation scale (n=75,618). The substrate exceeds the NS-DR symbolic oracle on all four per-question categories (by 9.89, 20.26, 17.65, and 0.80 percentage points), and exceeds the parametric ALOE baseline on descriptive and explanatory questions while lagging on predictive and counterfactual ones. We also introduce twin-EventLog, a 500-specification Park-canonical Smallville counterfactual benchmark on which the substrate exceeds Llama-3.1-8B with full context by 18.80 points in joint accuracy.
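The idea of reducing explanatory and counterfactual queries to one causal-ancestor traversal can be illustrated with a minimal Python sketch. This is not the paper's runtime: the names `Event`, `EventLog`, `explain`, and `fork_without` are hypothetical, the triples are plain string tuples rather than typed RDF, and the only intervention modelled is deleting a single event. The sketch also assumes every recorded cause is necessary, so removing an event removes everything downstream of it.

```python
from dataclasses import dataclass, field
from typing import FrozenSet, Iterable, List, Set, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object); stand-in for a typed RDF triple


@dataclass(frozen=True)
class Event:
    eid: int                 # position in the log
    triple: Triple
    causes: FrozenSet[int]   # ids of direct causal parents (hypothetical field)


@dataclass
class EventLog:
    events: List[Event] = field(default_factory=list)

    def append(self, triple: Triple, causes: Iterable[int] = ()) -> int:
        """Append-only write: causes must refer to events already in the log."""
        eid = len(self.events)
        parents = frozenset(causes)
        if any(c < 0 or c >= eid for c in parents):
            raise ValueError("causes must reference earlier events")
        self.events.append(Event(eid, triple, parents))
        return eid

    def ancestors(self, eid: int) -> Set[int]:
        """Transitive causal ancestors of an event (iterative depth-first walk)."""
        seen: Set[int] = set()
        stack = list(self.events[eid].causes)
        while stack:
            cur = stack.pop()
            if cur not in seen:
                seen.add(cur)
                stack.extend(self.events[cur].causes)
        return seen

    def explain(self, eid: int) -> List[int]:
        """Explanatory query: which events led to `eid`?"""
        return sorted(self.ancestors(eid))

    def fork_without(self, removed: int) -> "EventLog":
        """Counterfactual query: a forked log in which `removed` never happened.

        Assumption of this sketch: an event is dropped if `removed` is among
        its causal ancestors. The original log is left untouched.
        """
        fork = EventLog()
        remap = {}  # old event id -> new event id in the fork
        for ev in self.events:
            if ev.eid == removed or removed in self.ancestors(ev.eid):
                continue
            remap[ev.eid] = fork.append(ev.triple, [remap[c] for c in ev.causes])
        return fork


# Toy scene with made-up objects and predicates.
log = EventLog()
e0 = log.append(("cube", "movesToward", "sphere"))
e1 = log.append(("cube", "collidesWith", "sphere"), causes=[e0])
e2 = log.append(("sphere", "collidesWith", "cylinder"), causes=[e1])
e3 = log.append(("cone", "entersScene", "left"))

why_e2 = log.explain(e2)        # ancestors of the second collision
fork = log.fork_without(e0)     # what if the cube had never moved?
remaining = [ev.triple for ev in fork.events]
```

Both `explain` and `fork_without` call the same `ancestors` traversal, which is the sense in which the two query types share one mechanism in this toy version. A fuller treatment would need a richer intervention vocabulary and a way to re-simulate events that have alternative sufficient causes.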

Fabio Rovai • 2026

Related benchmarks

Task	Dataset	Metric	Result	
Counterfactual Video Reasoning	CLEVRER (val)	Accuracy	86.69	5
Explanatory Video Reasoning	CLEVRER (val)	Accuracy	99.94	5
Predictive Video Reasoning	CLEVRER (val)	Accuracy	84.07	5
Counterfactual reasoning	Twin-EventLog Smallville context	Joint Accuracy (A ∧ B)	100	3
Descriptive Video Reasoning	CLEVRER (val)	Accuracy	97.99	3
