Statistical Research (统计研究) ›› 2020, Vol. 37 ›› Issue (4): 114-128. doi: 10.19343/j.cnki.11-1302/c.2020.04.009

基于协变量平衡加权的平均处理效应的稳健有效估计

吴浩 彭非   

  • Publication date: 2020-04-25 Release date: 2020-04-15

A Robust and Efficient Estimation of Average Treatment Effects Based on Covariate Balance Weighting

Wu Hao & Peng Fei   

  • Online: 2020-04-25 Published: 2020-04-15

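Summary: The propensity score is an important tool for estimating average treatment effects. In observational studies, however, imbalance in the covariate distributions between the treatment and control groups often produces extreme propensity scores, that is, scores very close to 0 or 1. This brings the strong ignorability assumption of causal inference close to violation and, in turn, leads to large bias and variance in estimates of the average treatment effect. Li et al. (2018a) proposed the covariate balance weighting method, which, under the unconfoundedness assumption, addresses the impact of extreme propensity scores by achieving weighted balance of the covariate distributions. Building on this, the paper proposes a robust and efficient estimation method based on covariate balance weighting and introduces the super learner algorithm to improve the model's robustness in empirical applications. It then extends this method to a robust and efficient estimator based on covariate balance weighting that, in theory, does not depend on assumptions about the outcome regression model or the propensity score model. Monte Carlo simulations show that both proposed methods retain very small bias and variance even when the outcome regression model and the propensity score model are both misspecified. In the empirical part, the two methods are applied to right heart catheterization data, and the results indicate that right heart catheterization increases patient mortality by approximately 6.3%.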

Keywords: causal inference, observational study, extreme propensity score, covariate balance weighting, model misspecification

Abstract: The propensity score is an important tool for estimating average treatment effects. In observational studies, however, imbalance in the covariate distributions between the treatment and control groups often produces extreme propensity scores, i.e., scores very close to 0 or 1. This brings the strong ignorability assumption of causal inference close to violation and leads to large bias and variance in the estimated average treatment effects. Li et al. (2018a) propose the covariate balance weighting method, which achieves weighted balance of the covariate distributions under the unconfoundedness assumption and thereby addresses the impact of extreme propensity scores. Building on this method, this article proposes a robust and efficient estimator and uses the super learner algorithm to reduce the risk of model misspecification. We further generalize this estimator to one that, in theory, does not rely on assumptions about the outcome regression model or the propensity score model, and that is also doubly robust and efficient. In Monte Carlo simulations, both proposed methods have very small bias and variance even when the outcome regression model and the propensity score model are both misspecified. We apply the two methods to right heart catheterization data and find that right heart catheterization increases patient mortality by about 6.3%.

Key words: Causal Inference, Observational Study, Extreme Propensity Score, Covariate Balance Weighting, Model Misspecification
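
The abstract's point about extreme propensity scores can be illustrated numerically. The sketch below is not the authors' estimator. It simulates one covariate with a known, steep logistic propensity score (an assumption made only for illustration) and compares inverse probability weights with overlap weights, used here as one assumed example of a bounded balancing weight; the abstract does not say which weight the paper uses. All function names (`simulate`, `ipw_weights`, `overlap_weights`, `weighted_mean_diff`) are hypothetical.

```python
import numpy as np


def simulate(n=5000, seed=0):
    """Simulate one covariate and a treatment with a steep logistic propensity."""
    rng = np.random.default_rng(seed)
    x = rng.normal(size=n)
    # Assumed true propensity score; the slope of 3 pushes many scores toward 0 or 1.
    e = 1.0 / (1.0 + np.exp(-3.0 * x))
    z = rng.binomial(1, e)
    return x, z, e


def ipw_weights(z, e):
    """Inverse probability weights: 1/e for treated units, 1/(1-e) for controls."""
    return np.where(z == 1, 1.0 / e, 1.0 / (1.0 - e))


def overlap_weights(z, e):
    """Overlap weights: 1-e for treated units, e for controls (always in (0, 1))."""
    return np.where(z == 1, 1.0 - e, e)


def weighted_mean_diff(x, z, w):
    """Difference in weighted covariate means, treated minus control."""
    treated = np.average(x[z == 1], weights=w[z == 1])
    control = np.average(x[z == 0], weights=w[z == 0])
    return treated - control


x, z, e = simulate()
w_ipw = ipw_weights(z, e)
w_ow = overlap_weights(z, e)

print("largest IPW weight:      ", w_ipw.max())
print("largest overlap weight:  ", w_ow.max())
print("raw covariate imbalance: ", weighted_mean_diff(x, z, np.ones_like(x)))
print("imbalance, overlap wts:  ", weighted_mean_diff(x, z, w_ow))
```

Inverse probability weights are unbounded as a score approaches 0 or 1, so a few units can dominate the estimate, whereas the overlap weights stay below 1 and down-weight exactly those units. The true propensity score is used here for simplicity; in practice it would be estimated, and the paper's super learner step and doubly robust construction are not reproduced in this sketch.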