Paper Schulam2018
We note, however, that these issues stem from a deeper limitation: when training data is affected by actions, supervised learning algorithms capture relationships caused by action policies, and these relationships do not generalize when the policy changes.

For instance, an existing policy of sending asthmatics to the ICU may make it appear that asthmatics have a lower risk of dying from pneumonia.
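
The effect can be reproduced in a small simulation. The sketch below is illustrative only: the function `simulate`, the `icu_policy` argument, and every probability in it are hypothetical values chosen for this note, not figures from Schulam2018. It assumes asthma raises the risk of death and ICU care lowers it, then compares observed death rates under two policies.

```python
import random


def simulate(n, icu_policy, seed=0):
    """Return observed death rates (asthmatic, non_asthmatic) under an ICU policy.

    All probabilities are made-up illustration values, not estimates from data.
    """
    rng = random.Random(seed)
    deaths = {True: 0, False: 0}
    counts = {True: 0, False: 0}
    for _ in range(n):
        asthma = rng.random() < 0.2            # assumed prevalence of asthma
        icu = icu_policy(asthma)               # action chosen by the policy
        # Assumed causal model: asthma adds risk, ICU care removes risk.
        p_death = 0.10 + (0.10 if asthma else 0.0) - (0.15 if icu else 0.0)
        p_death = max(0.0, p_death)
        died = rng.random() < p_death
        counts[asthma] += 1
        deaths[asthma] += died
    return deaths[True] / counts[True], deaths[False] / counts[False]


# Training-time policy: every asthmatic is sent to the ICU.
old_asthma, old_other = simulate(100_000, icu_policy=lambda asthma: asthma)
# Changed policy: nobody is sent to the ICU.
new_asthma, new_other = simulate(100_000, icu_policy=lambda asthma: False)

print("old policy:", old_asthma, old_other)
print("new policy:", new_asthma, new_other)
```

Under the old policy the asthmatic group shows the lower death rate, because by construction its true risk is 0.05 with ICU care against 0.10 for everyone else. A supervised model fit to that data would learn "asthma lowers risk". Once the policy changes, the asthmatic group's true risk is 0.20 and the learned relationship reverses.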

