Statistical Research (统计研究) ›› 2021, Vol. 38 ›› Issue (7): 140-152. doi: 10.19343/j.cnki.11-1302/c.2021.07.011


GR-LDA Model with Graph Structure and Its Application in Credit Default Warning

Wang Xiaoyan (王小燕), Zhang Zhongyan (张中艳)

  • Online: 2021-07-25 Published: 2021-07-25

Abstract: Credit risk management is vital to the survival of the lending industry, and selecting risk indicators is at its core. Existing studies show that correlation information among indicators can improve indicator selection. Drawing on complex network theory, this paper constructs a graph structure over the indicators to capture their correlations and combines it with the L0 penalty to build a linear discriminant analysis model, GR-LDA, for indicator selection. It is proved theoretically that the loss function of the model can be transformed into a least-squares function, which makes it convenient to solve. A simulation study shows that, compared with the benchmarks (Lasso-LDA, L0-LDA, Elastic Net Logistic, and Lasso-SVM), GR-LDA has certain advantages in variable selection and classification accuracy. The graph structure significantly improves the model's classification and indicator selection performance, and its advantage grows as the correlation among indicators increases. Finally, an empirical analysis of P2P online loan data shows that GR-LDA performs well in prediction and identifies the important indicators in the network graph.

Key words: Linear Discriminant Analysis, Penalized Variable Selection, Graph Structure, Credit Default
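
The abstract names two building blocks without giving details: a graph over the indicators that reflects their correlations, and a reformulation of the discriminant loss as a least-squares problem. The Python sketch below illustrates both ideas under stated assumptions and is not the paper's method. The function names, the absolute-correlation threshold of 0.5, and the class-label coding are all hypothetical choices made here. The least-squares part shows only the classical two-class result that regressing a coded label on the indicators recovers the Fisher LDA direction up to scale. The L0 penalty and the way GR-LDA uses the graph are not implemented.

```python
import numpy as np


def correlation_graph(X, threshold=0.5):
    """Adjacency matrix linking indicators whose absolute Pearson
    correlation exceeds `threshold` (an assumed rule, not the paper's)."""
    corr = np.corrcoef(X, rowvar=False)
    adj = (np.abs(corr) > threshold).astype(int)
    np.fill_diagonal(adj, 0)  # no self-loops
    return adj


def lda_direction(X, y):
    """Fisher LDA direction: inverse pooled covariance times mean difference."""
    X0, X1 = X[y == 0], X[y == 1]
    n0, n1 = len(X0), len(X1)
    pooled = ((n0 - 1) * np.cov(X0, rowvar=False)
              + (n1 - 1) * np.cov(X1, rowvar=False)) / (n0 + n1 - 2)
    return np.linalg.solve(pooled, X1.mean(axis=0) - X0.mean(axis=0))


def least_squares_direction(X, y):
    """Slope of an ordinary least-squares fit of a coded class label on X.
    The coding n/n1 and -n/n0 is one common choice; any two distinct codes
    give a slope proportional to the LDA direction."""
    n = len(y)
    n0, n1 = np.sum(y == 0), np.sum(y == 1)
    target = np.where(y == 1, n / n1, -n / n0)
    design = np.column_stack([np.ones(n), X])  # intercept plus indicators
    coef, *_ = np.linalg.lstsq(design, target, rcond=None)
    return coef[1:]  # drop the intercept


def cosine(u, v):
    """Cosine similarity between two vectors."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))


if __name__ == "__main__":
    # Synthetic data for illustration only; not the P2P loan data.
    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 5))
    X[:, 1] = X[:, 0] + 0.1 * rng.normal(size=200)  # correlated pair
    y = (rng.random(200) < 0.4).astype(int)
    X[y == 1, 0] += 1.0  # shift one indicator in the default class
    print(correlation_graph(X))
    print(cosine(lda_direction(X, y), least_squares_direction(X, y)))
```

On data like this, the two directions should coincide up to a positive scale factor. That is the sense in which a discriminant problem can be handed to least-squares solvers.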