
Deterministic Event-Graph Substrates as World Models for Counterfactual Reasoning

About

We study event-graph substrates: a class of world models that represent agent state as an append-only log of typed RDF triples and answer counterfactual queries by forking the log under a structured intervention vocabulary. Substrates are inspectable at the triple level, support exact counterfactuals, and transfer across domains without learned components. We formalize the class, prove a duality between explanatory and counterfactual queries that reduces both to the same causal-ancestor traversal, and evaluate a 1,400-line CLEVRER-DSL interpreter atop a domain-agnostic substrate runtime at full CLEVRER validation scale (n=75,618). The substrate exceeds the NS-DR symbolic oracle on all four per-question categories (by 9.89, 20.26, 17.65, and 0.80 percentage points), and exceeds the parametric ALOE baseline on descriptive and explanatory while lagging on predictive and counterfactual. We also introduce twin-EventLog, a 500-specification Park-canonical Smallville counterfactual benchmark on which the substrate exceeds Llama-3.1-8B with full context by 18.80 points joint accuracy.
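The abstract's core mechanism (an append-only log of typed triples, forked under an intervention, with explanatory and counterfactual queries sharing one causal-ancestor traversal) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not the paper's runtime: the names `Event`, `EventLog`, `ancestors`, and `fork_without` are hypothetical, plain strings stand in for RDF terms, the only intervention modeled is removing one event, and an event is assumed to disappear whenever any of its causal ancestors is removed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    """One typed triple plus the ids of the events recorded as its direct causes."""
    id: int
    subject: str
    predicate: str
    obj: str
    causes: tuple = ()


class EventLog:
    """Append-only log: events are added, never mutated or deleted in place."""

    def __init__(self, events=()):
        self._events = {e.id: e for e in events}
        self._next_id = max(self._events, default=-1) + 1

    def append(self, subject, predicate, obj, causes=()):
        event = Event(self._next_id, subject, predicate, obj, tuple(causes))
        self._events[event.id] = event
        self._next_id += 1
        return event.id

    def ids(self):
        return set(self._events)

    def ancestors(self, event_id):
        """Explanatory query: ids of all transitive causes of event_id."""
        seen, stack = set(), list(self._events[event_id].causes)
        while stack:
            current = stack.pop()
            if current not in seen:
                seen.add(current)
                stack.extend(self._events[current].causes)
        return seen

    def fork_without(self, removed_id):
        """Counterfactual query: a new log without removed_id and without
        any event that has removed_id among its causal ancestors.
        (Assumption: losing any ancestor removes the event.)"""
        kept = [
            e for e in self._events.values()
            if e.id != removed_id and removed_id not in self.ancestors(e.id)
        ]
        return EventLog(kept)


# Hypothetical CLEVRER-style scene used only to exercise the sketch.
log = EventLog()
approach = log.append("cube", "moves_toward", "sphere")
collision = log.append("cube", "collides_with", "sphere", causes=[approach])
exit_ = log.append("sphere", "exits", "scene", causes=[collision])
entry = log.append("cylinder", "enters", "scene")

why_exit = log.ancestors(exit_)          # explanatory: what led to the exit?
no_collision = log.fork_without(collision)  # counterfactual: remove the collision
```

In this sketch both query types call the same `ancestors` traversal: the explanatory query reads it directly, and the counterfactual fork uses it to decide which events survive. The original log is left untouched by the fork, which is what keeps the counterfactual exact and inspectable.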

Fabio Rovai • 2026

Related benchmarks

Task | Dataset | Result | Rank
Counterfactual Video Reasoning | CLEVRER (val) | Accuracy 86.69 | 5
Explanatory Video Reasoning | CLEVRER (val) | Accuracy 99.94 | 5
Predictive Video Reasoning | CLEVRER (val) | Accuracy 84.07 | 5
Counterfactual reasoning | Twin-EventLog Smallville context | Joint Accuracy (A ∧ B) 100 | 3
Descriptive Video Reasoning | CLEVRER (val) | Accuracy 97.99 | 3