统计研究 ›› 2022, Vol. 39 ›› Issue (7): 137-149.doi: 10.19343/j.cnki.11–1302/c.2022.07.011

• • 上一篇    下一篇

基于多任务深度神经网络的企业纳税行为甄别研究

李国锋 李祚娟 王哲吉   

  • 出版日期:2022-07-25 发布日期:2022-07-25

Research on Corporate Tax Paying Behavior Identification Based on Multi-Task Learning in Deep Neural Networks

Li Guofeng Li Zuojuan Wang Zheji   

  • Online:2022-07-25 Published:2022-07-25

摘要: 随着数字经济时代的到来,丰富的数据资源有利于全面精准地刻画企业纳税情况,但数据来源广、类别不平衡以及噪音多等问题,也给企业纳税行为的甄别工作带来挑战。本文融合企业报表以及证监会、海关和税务等部门的多来源涉税数据,基于K-S检验和随机森林算法,构建了企业纳税行为甄别指标体系;将不同行业企业纳税行为甄别工作视为不同任务,提出基于多任务深度神经网络的企业纳税行为甄别模型,充分利用了不同行业任务间的相关性和差异性信息;针对样本数据集不平衡问题,引入焦点损失函数进一步改进了甄别模型。研究发现,相对于传统Logistic、支持向量机和神经网络等单任务模型,本文多任务模型的企业纳税行为甄别能力、泛化能力和稳健性更强。当模型预测某企业纳税不遵从的概率超出阈值时,即可判定该企业为重点稽查对象,以辅助税务部门提升稽查效率。本研究为政府智慧税务治理工作提供了新的思路。

关键词: 多源数据, 多任务深度神经网络, 企业纳税行为甄别

Abstract: With the advent of the digital economy era, rich data resources are conducive to comprehensively and accurately depicting the tax payment situation of enterprises. However, multiple sources, imbalances, and noise in the data also pose challenges to the identification of corporate tax paying behavior. This paper integrates tax-related data from multiple sources such as corporate statements, China Securities Regulatory Commission, customs and tax authorities. Based on the K-S test and the random forest algorithm for feature filtering, we construct a set of indicators for screening corporate tax paying behavior. This paper proposes a multi-task deep neural network-based corporate tax paying behavior identification model with tax behavior identification in different industries as different sub-tasks, which takes into account the correlation and heterogeneity between the different industry tasks. Aiming at the imbalance of the sample data set, the focus loss function is introduced to improve the model. The study results show that compared with the traditional single-task models such as logistic, support vector machine and neural network, the multi-task model constructed in this paper has better performance in tax behavior identification, generalization, and robustness. When the model predicts that the probability of tax non-compliance of an enterprise exceeds the threshold, the enterprise can be identified as a key audit target to improve the efficiency of the tax authorities’ audit. This study provides a new idea for the government?s smart taxation governance.

Key words: Multi-Source Data, Multi-Task Learning in Deep Neural Networks, Corporate Tax Paying Behavior Identification