Statistical Research (统计研究) ›› 2018, Vol. 35 ›› Issue (6): 109-116. doi: 10.19343/j.cnki.11-1302/c.2018.06.011


异质性数据下广义线性模型的Maximin似然比估计及应用

秦磊等   

  • Publication date: 2018-06-25 Release date: 2018-06-22

Maximin Likelihood Ratio Estimation for Generalized Linear Models with Heterogeneous Data and Its Application

Qin Lei et al.   

  • Online: 2018-06-25 Published: 2018-06-22

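Abstract: For heterogeneous data drawn from multiple sources, the literature usually proposes highly complex models to describe the features of each data subpopulation, whereas this paper focuses on characterizing what the subpopulations have in common and builds a simple model on that basis. For parameter estimation, it borrows the Maximin estimation idea from the ordinary linear model and proposes a Maximin likelihood ratio estimation method for generalized linear models, along with a penalized estimator under sparse structure. By maximizing the minimum of the likelihood ratio statistics over all subpopulations, the method constructs a simple and conservative model that reduces the complexity caused by a large number of data sources. The method applies when the dependent variable follows an exponential family distribution such as the normal, two-point, or Poisson distribution, which enriches earlier research and gives it greater practical value. Simulation analysis shows that, compared with classical estimation methods, the Maximin likelihood ratio estimation method not only effectively uncovers the commonality of the subpopulations but also attains higher out-of-sample prediction accuracy. The proposed method is also suitable for large heterogeneous data sets in government statistics and economic statistics.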

Keywords: heterogeneity, exponential family distribution, Maximin likelihood ratio estimation, penalized estimation

Abstract: To handle heterogeneous data from multiple sources, most of the literature proposes quite complicated models that describe the characteristics of each subgroup. This paper instead focuses on a simple model that captures what the subgroups of data have in common. Building on the maximin estimator for the Gaussian linear model, it proposes a maximin likelihood ratio estimator for the generalized linear model, together with a penalized estimator for sparse structures. By maximizing the minimum of the likelihood ratio statistics across all subgroups, the method builds a simple and conservative model that reduces the complexity arising from many data sources. It applies when the dependent variable follows an exponential family distribution such as the normal, two-point, or Poisson distribution, which extends earlier results and gives the method practical significance. Simulations show that, compared with classical estimation methods, the proposed method not only detects the commonality among subgroups effectively but also achieves higher out-of-sample prediction accuracy. The proposed method is also relevant to large-scale heterogeneous data in government and economic statistics.

Key words: Heterogeneity, Exponential Family Distribution, Maximin Likelihood Ratio Estimator, Penalized Estimator
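
The core idea in the abstract, maximizing the minimum likelihood ratio statistic over all subgroups, can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not give the exact definition, so the code below assumes a logistic model without intercept, takes the zero coefficient vector as the reference model in the likelihood ratio, normalizes each statistic by the subgroup size, and uses plain subgradient ascent as the solver. The function names (`log_lik`, `lr_stat`, `lr_grad`, `maximin_lr`) and the toy data are hypothetical, and the penalized (sparse) variant is not shown.

```python
import math


def log_lik(beta, X, y):
    """Bernoulli log-likelihood under a logit link (no intercept)."""
    total = 0.0
    for row, yi in zip(X, y):
        eta = sum(b * v for b, v in zip(beta, row))
        # log(1 + exp(eta)), written to avoid overflow for large |eta|
        total += yi * eta - (max(eta, 0.0) + math.log1p(math.exp(-abs(eta))))
    return total


def sigmoid(eta):
    """Numerically stable logistic function."""
    if eta >= 0:
        return 1.0 / (1.0 + math.exp(-eta))
    e = math.exp(eta)
    return e / (1.0 + e)


def lr_stat(beta, X, y):
    """Per-observation likelihood ratio statistic against beta = 0.

    Assumption: the reference model and the 1/n scaling are choices made
    for this sketch, not taken from the paper.
    """
    null = [0.0] * len(beta)
    return 2.0 * (log_lik(beta, X, y) - log_lik(null, X, y)) / len(y)


def lr_grad(beta, X, y):
    """Gradient of lr_stat with respect to beta."""
    grad = [0.0] * len(beta)
    for row, yi in zip(X, y):
        mu = sigmoid(sum(b * v for b, v in zip(beta, row)))
        for j, v in enumerate(row):
            grad[j] += 2.0 * (yi - mu) * v / len(y)
    return grad


def maximin_lr(groups, dim, steps=2000, step0=0.5):
    """Subgradient ascent on min over groups of lr_stat (a concave objective)."""
    beta = [0.0] * dim
    best_beta, best_val = beta[:], 0.0  # every LR statistic is 0 at beta = 0
    for t in range(1, steps + 1):
        # The group with the smallest statistic supplies the subgradient.
        worst = min(groups, key=lambda g: lr_stat(beta, g[0], g[1]))
        grad = lr_grad(beta, worst[0], worst[1])
        rate = step0 / math.sqrt(t)
        beta = [b + rate * g for b, g in zip(beta, grad)]
        val = min(lr_stat(beta, X, y) for X, y in groups)
        if val > best_val:
            best_beta, best_val = beta[:], val
    return best_beta, best_val


# Toy data (illustrative only): one covariate, two subgroups whose effects
# share a sign but differ in strength.
group_a = ([[1.0]] * 4 + [[-1.0]] * 4, [1, 1, 1, 0, 0, 0, 0, 1])
group_b = ([[1.0]] * 8 + [[-1.0]] * 8, [1] * 5 + [0] * 3 + [0] * 5 + [1] * 3)
groups = [group_a, group_b]

beta_hat, worst_lr = maximin_lr(groups, dim=1)
print(beta_hat, worst_lr)
```

In this toy data the per-group maximum likelihood slopes are log 3 and log(5/3), and the pooled fit is log 2. Under the assumptions above, the sketch settles near log(5/3), the weaker of the two effects, which matches the conservative behaviour the abstract describes.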