| Authors | Junzhe Zhang, Elias Bareinboim |
| Journal | NeurIPS |
| Year | 2019 |
What Problem It Solves
Clinical and behavioral treatment policies require sequential decisions that respect causal constraints; maximizing reward in a simulated MDP is not enough.
The method frames dynamic treatment regime (DTR) learning as a reinforcement learning (RL) problem while preserving the causal identification requirements for treatment effects.
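For orientation, one common way to write the objective of a dynamic treatment regime (DTR) with $K$ decision stages is sketched below. The notation (states $S_k$, treatments $X_k$, histories $H_k$, final outcome $Y$) is an assumption made for illustration and may differ from the symbols used in the paper.

```latex
\begin{align}
H_k &= (S_1, X_1, \dots, S_{k-1}, X_{k-1}, S_k), \\
\pi &= (\pi_1, \dots, \pi_K), \qquad X_k \sim \pi_k(\cdot \mid H_k), \\
\pi^{*} &\in \arg\max_{\pi} \; \mathbb{E}_{\pi}[Y].
\end{align}
```

Each decision rule $\pi_k$ maps the history observed so far to a treatment, and the expectation $\mathbb{E}_{\pi}[Y]$ is taken under intervention on the treatments. That expectation can be estimated from data only when the treatment effects are identifiable.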
Use it in health, medicine, and personalization settings where policies adapt to patient or user history.
Applicability depends on measured histories being rich enough, or on additional causal structure when confounding is hidden.
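To make the dependence on measured history concrete, here is a minimal sketch of a two-stage treatment policy learned by tabular backward induction (a generic Q-learning style procedure, not the algorithm from the paper). It assumes treatments were randomized, so there is no hidden confounding by construction. The simulator, its probabilities, and the names `simulate`, `cell_means`, and `fit_dtr` are all hypothetical.

```python
import random
from collections import defaultdict

ACTIONS = (0, 1)


def simulate(n, seed=0):
    """Hypothetical two-stage trial with randomized binary treatments."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        s1 = rng.randint(0, 1)                      # baseline state
        a1 = rng.randint(0, 1)                      # stage-1 treatment, randomized
        s2 = int(rng.random() < (0.7 if a1 == s1 else 0.3))
        a2 = rng.randint(0, 1)                      # stage-2 treatment, randomized
        p_y = 0.2 + 0.3 * s2 + 0.4 * (a2 == s2)     # assumed outcome model
        y = int(rng.random() < p_y)
        rows.append((s1, a1, s2, a2, y))
    return rows


def cell_means(pairs):
    """Average the target within each key (a tabular regression)."""
    sums, counts = defaultdict(float), defaultdict(int)
    for key, target in pairs:
        sums[key] += target
        counts[key] += 1
    return {key: sums[key] / counts[key] for key in sums}


def fit_dtr(rows):
    """Backward induction: fit stage 2 first, then stage 1 on its value."""
    # Stage 2: Q2(h2, a2) = E[Y | h2, a2], with history h2 = (s1, a1, s2).
    q2 = cell_means([((s1, a1, s2, a2), y) for s1, a1, s2, a2, y in rows])

    def v2(h2):
        return max(q2.get(h2 + (a,), 0.0) for a in ACTIONS)

    # Stage 1: Q1(s1, a1) = E[max_a2 Q2(H2, a2) | s1, a1].
    q1 = cell_means([((s1, a1), v2((s1, a1, s2))) for s1, a1, s2, _, _ in rows])

    def pi1(s1):
        return max(ACTIONS, key=lambda a: q1.get((s1, a), 0.0))

    def pi2(h2):
        return max(ACTIONS, key=lambda a: q2.get(h2 + (a,), 0.0))

    return pi1, pi2


rows = simulate(20000, seed=0)
pi1, pi2 = fit_dtr(rows)
print({s1: pi1(s1) for s1 in ACTIONS})
```

In this simulator, matching the treatment to the current state is optimal at both stages by construction, so the learned rules can be checked against it. If treatment assignment instead depended on an unrecorded variable, the cell averages would no longer estimate treatment effects, which is the situation where additional causal structure is needed.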
Related papers
Estimation and Inference of Heterogeneous Treatment Effects using Random Forests
Stefan Wager, Susan Athey · 2017
Causal inference in statistics: An overview
Judea Pearl · 2009
Towards Causal Representation Learning
Bernhard Scholkopf, Francesco Locatello, Stefan Bauer +4 more · 2021
Elements of Causal Inference: Foundations and Learning Algorithms
Jonas Peters, Dominik Janzing, Bernhard Scholkopf · 2017