Designing Optimal Dynamic Treatment Regimes: A Causal Reinforcement Learning Approach — DoOperator Research

Authors	Junzhe Zhang
Journal	International Conference on Machine Learning
Year	2020
Citations	16

What Problem It Solves

Policy optimization in treatment settings can produce invalid policies if it ignores how treatments were assigned in the observed data and how confounders evolve over time.

The paper combines causal estimands for treatment regimes with reinforcement-learning-style policy optimization.
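
To make the contrast concrete, here is a minimal sketch in Python. It is not the paper's algorithm: it is a single-stage toy with one fully observed confounder (a hypothetical severity flag S), and it uses plain backdoor adjustment. All probabilities and names are invented for illustration. It shows how choosing a treatment from the naive conditional E[Y | X = x] can pick the worse action, while the adjusted estimand E[Y | do(X = x)] reverses the choice.

```python
# Hypothetical single-stage example; every number below is invented for illustration.
P_S = {0: 0.5, 1: 0.5}           # P(S = s): share of mild (0) and severe (1) patients
P_X1_GIVEN_S = {0: 0.2, 1: 0.8}  # P(X = 1 | S = s): logged clinicians treat severe cases more
P_Y1_GIVEN_XS = {                # P(Y = 1 | X = x, S = s): recovery rate, keyed by (x, s)
    (0, 0): 0.8, (1, 0): 0.9,
    (0, 1): 0.3, (1, 1): 0.5,
}


def p_x_given_s(x, s):
    """Probability that the logged policy chose treatment x for severity s."""
    p1 = P_X1_GIVEN_S[s]
    return p1 if x == 1 else 1.0 - p1


def naive_value(x):
    """E[Y | X = x]: conditions on the logged treatment and ignores how it was assigned."""
    joint = {s: P_S[s] * p_x_given_s(x, s) for s in P_S}  # P(S = s, X = x)
    p_x = sum(joint.values())                              # P(X = x)
    return sum(joint[s] / p_x * P_Y1_GIVEN_XS[(x, s)] for s in P_S)


def adjusted_value(x):
    """E[Y | do(X = x)] by backdoor adjustment over S.

    Valid only under the assumption that S captures all confounding.
    """
    return sum(P_S[s] * P_Y1_GIVEN_XS[(x, s)] for s in P_S)


def best_action(value_fn):
    """Policy optimization step: pick the treatment with the highest estimated value."""
    return max((0, 1), key=value_fn)


naive_choice = best_action(naive_value)
adjusted_choice = best_action(adjusted_value)

for x in (0, 1):
    print(f"x={x}: naive={naive_value(x):.2f}, adjusted={adjusted_value(x):.2f}")
print(f"naive choice: {naive_choice}, adjusted choice: {adjusted_choice}")
```

In this toy, treatment helps in both severity groups, yet the naive estimate favors withholding it because treated patients in the log are mostly severe. The adjustment only works here because S is observed and assumed to be the sole confounder; with hidden confounding this estimate would be biased as well.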

Use this paper as the main citation for causal reinforcement learning (CRL) in adaptive treatment design and health decision support.

The paper targets settings where the needed causal quantities are identifiable; gaps in the data or unmodeled hidden confounding still limit what can be concluded.