Statistical Research (统计研究), 2024, Vol. 41, Issue 2: 139-148. doi: 10.19343/j.cnki.11–1302/c.2024.02.012


基于核机器的加速失效时间模型及其应用

荣耀华 王江慧 程维虎 曹美雅   

  • Publication date: 2024-02-25   Release date: 2024-02-25

Accelerated Failure Time Model Based on Kernel Machine and Its Application

Rong Yaohua, Wang Jianghui, Cheng Weihu, Cao Meiya

  • Online: 2024-02-25   Published: 2024-02-25

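Abstract: The accelerated failure time (AFT) model is a widely used survival analysis model. Using the LASSO penalty to remove redundant predictors, this paper constructs a kernel machine-based AFT model to characterize the complex relationship between predictors and survival time. In addition, it proposes a new regularized garrotized kernel machine estimation method that better captures potential nonlinear relationships between predictors and survival time, automatically models interactions among predictors in the nonparametric component, and improves the model's predictive accuracy. Simulation studies show that, compared with existing representative methods, the proposed method predicts survival time more accurately, with a more pronounced advantage when the relationships are complex. Finally, the method is applied to a gastric cancer data set, using clinical information and gene expression to predict survival time and risk scores. The empirical results show that the method can provide a useful reference for designing precision diagnosis and treatment plans based on patient risk stratification.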

Key words: Accelerated Failure Time Model, Kernel Machine, Risk Prediction, Regularization, Reproducing Kernel Hilbert Space

Abstract: The accelerated failure time (AFT) model is a widely used survival analysis model. In this paper, we use the LASSO penalty to eliminate redundant predictors and construct a kernel-based AFT model to capture the complex relationship between the predictors and the response. In addition, we propose a new Regularized Garrotized Kernel Machine estimation method. It better describes the potential nonlinear relationship between the predictors and the response, automatically models interaction effects between predictors in the nonparametric part, and hence improves the predictive accuracy of the model. Simulation studies show that, compared with state-of-the-art methods, the proposed method achieves higher predictive accuracy for survival time, especially when the relationships are complex. Finally, we apply the method to a gastric cancer data set and use clinical and genetic information to predict survival time and risk scores. The empirical results show that the proposed method can provide a helpful reference for designing precise clinical diagnosis and treatment plans based on patient risk stratification.

Key words: Accelerated Failure Time Model, Kernel Machine, Risk Prediction, Regularization, Reproducing Kernel Hilbert Space
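
For orientation, the kind of model the abstract refers to can be written in a generic form. This is only a sketch of a standard partially linear kernel machine AFT model, not the paper's exact specification: the symbols $T_i$ (survival time), $x_i$ (predictors entering linearly), $z_i$ (predictors entering nonparametrically), $\beta$, $h$, and $\mathcal{H}_K$ are assumptions introduced here for illustration, and censoring and the penalty terms are omitted.

```latex
% Generic partially linear kernel machine AFT model (illustrative notation only).
% T_i: survival time of subject i; x_i, z_i: predictor vectors;
% h: unknown smooth function in the reproducing kernel Hilbert space H_K
% generated by a kernel K; eps_i: random error.
\[
  \log T_i \;=\; x_i^{\top}\beta \;+\; h(z_i) \;+\; \varepsilon_i ,
  \qquad h \in \mathcal{H}_K , \qquad i = 1, \dots, n .
\]
```

Under this form, nonlinear effects and interactions among the components of $z_i$ are absorbed into $h$ rather than specified term by term, which is what allows a kernel machine to model them automatically.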