
Out-of-distribution Generalization with Causal Invariant Transformations

About

In real-world applications, it is important and desirable to learn a model that performs well on out-of-distribution (OOD) data. Recently, causality has become a powerful tool for tackling the OOD generalization problem, with the idea resting on the causal mechanism that is invariant across the domains of interest. To leverage the generally unknown causal mechanism, existing works assume a linear form of the causal feature or require sufficiently many and diverse training domains, assumptions that are usually restrictive in practice. In this work, we obviate these assumptions and tackle the OOD problem without explicitly recovering the causal feature. Our approach is based on transformations that modify the non-causal feature but leave the causal part unchanged; these transformations can be either obtained from prior knowledge or learned from the training data in the multi-domain scenario. Under the setting of an invariant causal mechanism, we theoretically show that if all such transformations are available, then we can learn a minimax optimal model across the domains using only single-domain data. Noticing that knowing a complete set of these causal invariant transformations may be impractical, we further show that it suffices to know only a subset of them. Based on the theoretical findings, a regularized training procedure is proposed to improve the OOD generalization capability. Extensive experimental results on both synthetic and real datasets verify the effectiveness of the proposed algorithm, even with only a few causal invariant transformations.
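The abstract's regularized training procedure can be illustrated with a toy sketch. This is not the authors' implementation: the linear model, the choice of which coordinate is causal, the transformation T, and the penalty weight `lam` are all hypothetical. The idea shown is the general one described above: add to the empirical risk a penalty that forces the model's output to be invariant under transformations that alter only the non-causal feature.

```python
# Hypothetical sketch of a regularized objective using a causal
# invariant transformation. Assumptions (not from the paper's code):
# x[0] is the causal feature, x[1] is non-causal, and the model is linear.
import random

def model(w, x):
    # Simple linear predictor: f(x) = <w, x>.
    return sum(wi * xi for wi, xi in zip(w, x))

def causal_invariant_transform(x):
    # Hypothetical T: keeps the causal feature x[0] fixed and
    # randomly perturbs the non-causal feature x[1].
    return [x[0], x[1] + random.gauss(0.0, 1.0)]

def regularized_loss(w, data, lam=1.0):
    # Empirical risk plus the invariance penalty E[(f(x) - f(T(x)))^2].
    risk, penalty = 0.0, 0.0
    for x, y in data:
        pred = model(w, x)
        risk += (pred - y) ** 2
        penalty += (pred - model(w, causal_invariant_transform(x))) ** 2
    n = len(data)
    return risk / n + lam * penalty / n
```

Minimizing this objective (e.g., by gradient descent) drives the weight on the non-causal coordinate toward zero, since any dependence on it is penalized by the invariance term; a predictor that uses only the causal feature incurs zero penalty regardless of `lam`.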

Ruoyu Wang, Mingyang Yi, Zhitang Chen, Shengyu Zhu • 2022

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
| --- | --- | --- | --- | --- |
| Domain Generalization | PACS (test) | Average Accuracy | 88.4% | 225 |
| Domain Generalization | VLCS (test) | Average Accuracy | 79.3% | 62 |
| Image Classification | C-MNIST (test) | Accuracy | 96.9% | 11 |
