Statistical Research (统计研究), 2019, Vol. 36, Issue (9): 82-. doi: 10.19343/j.cnki.11-1302/c.2019.09.007

信用评分模型中拒绝推断问题研究:基于半监督协同训练法的改进

黎春 周振宇   

  • Publication date: 2019-09-25  Release date: 2019-09-25

Research on Reject Inference in Credit Scoring Model: Based on the Improvement of Semi-Supervised Co-Training Method

Li Chun & Zhou Zhenyu   

  • Online: 2019-09-25  Published: 2019-09-25

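Abstract (Chinese version): With the rapid growth of China's financial market, reject inference in credit evaluation is receiving increasing attention. Credit scoring models face two problems: labeled samples make up only a small share of the data, and the class distribution within the samples is imbalanced. Building on semi-supervised learning and ensemble learning theory, this paper proposes a new algorithm, the BCT algorithm. It improves on the traditional semi-supervised co-training method in three ways: it uses dynamic Bagging to generate multiple sub-classifiers, introduces a classification threshold parameter to address class imbalance, and sets an early-stopping condition to limit the risk of overfitting as the algorithm iterates. Empirical analysis on five real data sets shows that, across data sets and rejection ratios, the BCT algorithm outperforms credit scoring models built with six other supervised and semi-supervised learning algorithms, indicating good generalization performance and stronger model evaluation ability.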

Keywords (Chinese version): reject inference, credit scoring, semi-supervised co-training, BCT algorithm

Abstract: With the rapid development of China's financial market, reject inference in credit evaluation has attracted growing attention. To address the low proportion of labeled (accepted) samples and the imbalanced class distribution found in credit scoring data, this paper proposes a new algorithm, BCT (Bagging Co-Training with Optimized Threshold), based on semi-supervised learning and ensemble learning theory. The algorithm improves the traditional semi-supervised co-training method by using dynamic Bagging, introducing a classification threshold parameter, and setting an early-stopping condition. In an empirical analysis on five real data sets, BCT outperforms six other supervised and semi-supervised learning algorithms for credit scoring across data sets and rejection ratios, showing good generalization performance and stronger model evaluation ability.

Key words: Reject Inference, Credit Scoring, Semi-Supervised Co-Training Method, BCT Algorithm
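
The abstract names three ingredients: sub-classifiers generated by Bagging and regenerated as training proceeds, a classification threshold to counter class imbalance, and an early-stopping condition. The sketch below shows how such pieces could fit together. It is not the paper's BCT algorithm: it swaps co-training for a simpler self-training loop, uses a toy nearest-centroid base learner, and every function name, parameter (`threshold`, `margin`, `patience`), and data point is a hypothetical choice made only for illustration.

```python
import random

def centroid(rows):
    # Mean vector of a non-empty list of feature vectors.
    return [sum(col) / len(rows) for col in zip(*rows)]

def dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def fit_centroids(X, y):
    # Toy base learner (an assumption): one centroid per class, 0 = good, 1 = bad.
    return {c: centroid([x for x, t in zip(X, y) if t == c]) for c in (0, 1)}

def p_bad(model, x):
    # Score in [0, 1]; closer to the "bad" centroid gives a higher value.
    d0, d1 = dist(x, model[0]), dist(x, model[1])
    return 0.5 if d0 + d1 == 0 else d0 / (d0 + d1)

def fit_bagging(X, y, n_models, rng):
    # One fresh bootstrap sample per sub-classifier; redraw if a class is missing.
    # Assumes the labeled data contains both classes.
    models = []
    while len(models) < n_models:
        idx = [rng.randrange(len(X)) for _ in X]
        ys = [y[i] for i in idx]
        if 0 in ys and 1 in ys:
            models.append(fit_centroids([X[i] for i in idx], ys))
    return models

def ensemble_p_bad(models, x):
    return sum(p_bad(m, x) for m in models) / len(models)

def predict(models, x, threshold=0.4):
    # A threshold below 0.5 flags the minority ("bad") class more readily.
    return int(ensemble_p_bad(models, x) >= threshold)

def accuracy(models, X, y, threshold):
    return sum(predict(models, x, threshold) == t for x, t in zip(X, y)) / len(X)

def self_train(X_lab, y_lab, X_unlab, X_val, y_val, threshold=0.4,
               margin=0.2, n_models=5, max_rounds=10, patience=2, seed=0):
    rng = random.Random(seed)
    X_lab, y_lab, pool = list(X_lab), list(y_lab), list(X_unlab)
    models = fit_bagging(X_lab, y_lab, n_models, rng)
    best_models, best_acc, stale = models, accuracy(models, X_val, y_val, threshold), 0
    for _ in range(max_rounds):
        # Pseudo-label only unlabeled points scored far from the threshold.
        scored = [(x, ensemble_p_bad(models, x)) for x in pool]
        confident = [(x, p) for x, p in scored if abs(p - threshold) >= margin]
        if not confident:
            break
        for x, p in confident:
            X_lab.append(x)
            y_lab.append(int(p >= threshold))
        pool = [x for x, p in scored if abs(p - threshold) < margin]
        # Rebuild the bagged ensemble on the enlarged training set.
        models = fit_bagging(X_lab, y_lab, n_models, rng)
        acc = accuracy(models, X_val, y_val, threshold)
        if acc > best_acc:
            best_models, best_acc, stale = models, acc, 0
        else:
            stale += 1
            if stale >= patience:  # early stop: no validation gain for `patience` rounds
                break
    return best_models, best_acc

# Toy data (hypothetical): 9 "good" and 3 "bad" accepted applicants, 4 rejected ones.
X_lab = [[i % 3 * 0.5, i // 3 * 0.5] for i in range(9)] + [[10 + i * 0.5, 10.0] for i in range(3)]
y_lab = [0] * 9 + [1] * 3
X_unlab = [[0.4, 0.6], [0.9, 0.9], [10.3, 10.0], [0.1, 0.7]]
X_val, y_val = [[0.2, 0.3], [10.2, 10.0], [0.8, 0.1], [10.6, 10.0]], [0, 1, 0, 1]
models, acc = self_train(X_lab, y_lab, X_unlab, X_val, y_val)
```

The toy clusters are far apart, so the loop ends after a single pseudo-labeling round. On real credit data the validation metric, the confidence margin, and the threshold would all need tuning, and accuracy is a poor metric under class imbalance; it is used here only to keep the sketch short.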