| Authors | Junzhe Zhang |
| Journal | International Conference on Machine Learning |
| Year | 2020 |
| Citations | 16 |
What Problem It Solves
Policy optimization in treatment settings can be invalid if it ignores how treatments were assigned and how confounders evolve.
The paper combines causal estimands for treatment regimes with reinforcement-learning-style policy optimization.
Use it as the main citation for causal reinforcement learning (CRL) in adaptive treatment design and health decision support.
The paper targets settings where the needed causal quantities are identifiable; data gaps or unmodeled hidden confounding still limit conclusions.
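To make the concern about treatment assignment concrete, here is a minimal single-decision sketch in Python. It is not the paper's algorithm: the variables (`U` for illness severity, `X` for treatment, `Y` for recovery) and every probability are hypothetical, and the sketch assumes the confounder `U` is observed so that a simple backdoor adjustment identifies the effect. It compares the naive conditional value E[Y | X=x] with the adjusted value, the sum over u of P(u) E[Y | x, u].

```python
# Hypothetical toy model; all numbers are made up for illustration.
# U: confounder (0 = mild, 1 = severe), X: treatment, Y: recovery.
P_U = {0: 0.5, 1: 0.5}            # P(U=u)
P_X1_GIVEN_U = {0: 0.2, 1: 0.8}   # P(X=1 | U=u): severe cases are treated more often
# P(Y=1 | X=x, U=u), keyed by (x, u); treatment helps in both strata.
P_Y1 = {(0, 0): 0.8, (1, 0): 0.9, (0, 1): 0.2, (1, 1): 0.4}


def p_x_given_u(x, u):
    """P(X=x | U=u) under the observational (logging) policy."""
    p1 = P_X1_GIVEN_U[u]
    return p1 if x == 1 else 1.0 - p1


def naive_value(x):
    """E[Y | X=x]: conditions on treatment and ignores how it was assigned."""
    joint = {u: P_U[u] * p_x_given_u(x, u) for u in P_U}  # P(U=u, X=x)
    p_x = sum(joint.values())                              # P(X=x)
    return sum(joint[u] / p_x * P_Y1[(x, u)] for u in P_U)


def adjusted_value(x):
    """Backdoor-adjusted value: sum_u P(u) * E[Y | x, u] (assumes U is observed)."""
    return sum(P_U[u] * P_Y1[(x, u)] for u in P_U)


def best_action(value_fn):
    """Pick the treatment with the higher estimated value."""
    return max((0, 1), key=value_fn)


if __name__ == "__main__":
    for x in (0, 1):
        print(x, round(naive_value(x), 3), round(adjusted_value(x), 3))
    print("naive choice:", best_action(naive_value))
    print("adjusted choice:", best_action(adjusted_value))
```

In this toy model the naive comparison favors withholding treatment while the adjusted comparison favors treating, because severe patients are both more likely to be treated and less likely to recover. If `U` were unobserved, the adjustment would not be available, which is the kind of hidden confounding noted above.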
Related papers
Reinforcement Learning: An Introduction
Richard S. Sutton, Andrew G. Barto · 2018
A Survey of Constraint Formulations in Safe Reinforcement Learning
Akifumi Wachi, Xun Shen, Yanan Sui · 2024
Off-Policy Policy Evaluation for Sequential Decisions under Unobserved Confounding
Hongseok Namkoong, Ramtin Keramati, Steve Yadlowsky +1 more · 2020
Markov Decision Processes with Unobserved Confounders: A Causal Approach
Junzhe Zhang, Elias Bareinboim · 2016