Statistical Research (统计研究) ›› 2017, Vol. 34 ›› Issue (12): 110-118. doi: 10.19343/j.cnki.11-1302/c.2017.12.010


The Study of Variable Selection in Logit Model Based on Heterogeneous Data

Si Jiesheng, Li Yang, Xie Bangchang

  • Online: 2017-12-25 Published: 2017-12-25

Abstract: Data heterogeneity and variable sparsity are two problems that cannot be avoided in the big data era. This paper addresses both by constructing a heterogeneous Logit variable selection model for a binary dependent variable. Monte Carlo simulations show that, under different heterogeneity conditions, the proposed method clearly separates the effective variables from the redundant ones. In addition, Gmeans and other evaluation indicators show that the model predicts well. In an application to financial early warning for listed companies, the method yields interpretable results, which indicates that it has some practical value.

Key words: Heterogeneity, Variable Selection, Financial Early Warning
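
The abstract names Gmeans as an evaluation indicator without defining it. The sketch below assumes the definition commonly used for binary classifiers, the geometric mean of sensitivity and specificity; the paper may compute it differently. The function name `g_mean` and the sample labels are hypothetical and do not come from the paper.

```python
import math


def g_mean(y_true, y_pred):
    """Geometric mean of sensitivity and specificity (assumed definition).

    Labels are 0/1, with 1 as the positive class (e.g. a distressed firm).
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present in y_true")
    sensitivity = tp / (tp + fn)  # share of positives correctly flagged
    specificity = tn / (tn + fp)  # share of negatives correctly cleared
    return math.sqrt(sensitivity * specificity)


# Hypothetical labels, for illustration only.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 1, 1]
score = g_mean(y_true, y_pred)
print(round(score, 4))
```

Because the two rates are multiplied, a classifier that ignores either class scores near zero, which is why this kind of measure is often preferred to plain accuracy when the classes are imbalanced.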