Why we should prefer simple causal models
- Dominik Janzing (Amazon Research, Tuebingen, Germany)
It is well known that learning statistical associations from finite data requires regularization to avoid overfitting: regularization terms penalize overly complex functions to lower the risk that they capture random noise in the data. In the limit of infinite sample size, however, one can learn arbitrarily complex statistical relations. I argue that regularization is recommended even in the population limit if one is interested in a causal model rather than a statistical one, because regularization can also mitigate bias from hidden common causes. I demonstrate this for simple linear and non-linear regression tasks, where I show an explicit formal analogy between finite-sample bias and confounding bias. These theoretical results suggest that particularly simple models should be used when learning causal relations in the presence of hidden common causes.
Paper: D. Janzing: Causal regularization, NeurIPS 2019.
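
The claim that regularization can help even in the population limit can be illustrated with a minimal sketch. It assumes a linear model with a hidden common cause Z, namely X = M Z + E and Y = a·X + c·Z + noise, with Z and E standard normal and independent. This setup, the dimensions, the random seed, and all names (population_ridge, a, M, c, lambdas) are assumptions made for illustration and are not taken from the paper. The code works directly with population covariances, so there is no finite-sample noise: any gap between the regression vector and the causal vector a is due to confounding alone.

```python
import numpy as np


def population_ridge(sigma_xx, sigma_xy, lam):
    """Ridge solution computed from population covariances (no sampling noise)."""
    d = sigma_xx.shape[0]
    return np.linalg.solve(sigma_xx + lam * np.eye(d), sigma_xy)


rng = np.random.default_rng(0)
d, l = 30, 30  # hypothetical sizes: observed and hidden dimensions

a = rng.standard_normal(d) / np.sqrt(d)       # causal effect of X on Y
M = rng.standard_normal((d, l)) / np.sqrt(l)  # effect of hidden Z on X
c = rng.standard_normal(l) / np.sqrt(l)       # effect of hidden Z on Y

# Population covariances implied by X = M Z + E and Y = a.X + c.Z + noise.
sigma_xx = M @ M.T + np.eye(d)
sigma_xy = sigma_xx @ a + M @ c

# Unregularized population regression equals a plus a confounding term.
a_ols = population_ridge(sigma_xx, sigma_xy, 0.0)
confounding_bias = np.linalg.solve(sigma_xx, M @ c)

# Distance to the causal vector a for a grid of ridge penalties.
lambdas = np.linspace(0.0, 5.0, 51)
errors = np.array(
    [np.linalg.norm(population_ridge(sigma_xx, sigma_xy, lam) - a) for lam in lambdas]
)
best_lambda = lambdas[np.argmin(errors)]

print("error without regularization:", errors[0])
print("smallest error on the grid:  ", errors.min(), "at lambda =", best_lambda)
```

In this sketch the unregularized solution is exactly a plus the term Sigma_XX^{-1} M c, which plays the role that sampling noise plays in the finite-sample setting. Whether a strictly positive penalty gives the smallest error depends on the drawn parameters; when the confounding term points in a generic direction in high dimension it tends to inflate the norm of the regression vector, which is why shrinkage typically helps.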