





















Abstract: We develop methods for estimating how infinitesimal policy changes affect long-term outcomes in dynamic systems. We show that dynamic marginal policy effects (MPEs) can be identified via tractable reduced-form expressions, and can be estimated under a general sequential unconfoundedness assumption. We also propose a doubly robust estimator for dynamic MPEs. Our approach does not require observing full dynamic state information (as is typically assumed for off-policy evaluation in Markov decision processes), and does not incur an exponential curse of horizon (as is typical in non-Markovian off-policy evaluation). We demonstrate the practicality and robustness of our approach in a number of simulations, including one motivated by a dynamic pricing application where people use past prices to form a reference level for current prices.
| Comments: | Fix typos |
| Subjects: | Methodology (stat.ME) |
| Cite as: | arXiv:2604.05639 [stat.ME] |
| | (or arXiv:2604.05639v3 [stat.ME] for this version) |
| DOI: | https://doi.org/10.48550/arXiv.2604.05639 (arXiv-issued DOI via DataCite) |
From: I-han Lai
[v1] Tue, 7 Apr 2026 09:41:11 UTC (281 KB)
[v2] Fri, 15 May 2026 03:25:16 UTC (491 KB)
[v3] Mon, 25 May 2026 17:21:41 UTC (491 KB)