Statistical Research (统计研究)


High-Dimensional Statistics in the Big Data Era: Development and Application of Sparse Modeling

Li Zhongda (李仲达) et al.

  • Online: 2015-10-15 Published: 2015-10-26

Abstract: High-dimensional sparse modeling is at the frontier of contemporary statistics and econometrics. It is a statistical methodology for analyzing big data and has broad application prospects in economics and finance. This paper discusses the problems and challenges that high-dimensional data and high-dimensional models pose for traditional methods, and reviews the development of sparse modeling, the role of selection mechanisms, and the theoretical properties of penalty function methods. In the empirical application, we use a high-dimensional sparse VAR model to forecast residential sale prices in 35 large and medium-sized cities. Compared with the traditional VAR model and the low-dimensional dynamic panel data model, the high-dimensional sparse VAR model has a more parsimonious structure and captures the important explanatory variables and economic information. It also shows satisfactory out-of-sample predictive ability, with a smaller forecasting bias than the traditional models.

Keywords: High-Dimensional Sparse Model, Penalty Function, Model Selection, Real Estate Price Forecasting
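
As a rough illustration of what a penalized sparse VAR can look like, the sketch below fits a VAR(1) equation by equation with an L1 (lasso) penalty, solved by coordinate descent. Everything in it is an assumption made for illustration: the abstract does not state which penalty function, lag order, tuning rule, or data the paper uses, and the names `soft_threshold`, `lasso_cd`, `fit_sparse_var1`, and `simulate_var1`, the simulated data, and the penalty level `lam` are hypothetical. The series are assumed to have mean zero, so no intercept is fitted; real data would need to be centered first.

```python
import random


def soft_threshold(z, t):
    """Soft-thresholding: the closed-form L1 (lasso) update for one coefficient."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def lasso_cd(X, y, lam, n_iter=100):
    """Coordinate descent for (1/(2n)) * ||y - X b||^2 + lam * ||b||_1."""
    n, p = len(X), len(X[0])
    b = [0.0] * p
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    resid = list(y)  # equals y - X b because b starts at zero
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue  # an all-zero regressor carries no information
            # Cross-moment of column j with the residual that excludes b[j].
            rho = sum(X[i][j] * resid[i] for i in range(n)) / n + col_sq[j] * b[j]
            new_bj = soft_threshold(rho, lam) / col_sq[j]
            delta = new_bj - b[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= delta * X[i][j]
                b[j] = new_bj
    return b


def fit_sparse_var1(Y, lam):
    """Estimate A in y_t = A y_{t-1} + e_t, one L1-penalized regression per equation."""
    X = Y[:-1]  # lagged values y_{t-1}
    k = len(Y[0])
    return [lasso_cd(X, [row[i] for row in Y[1:]], lam) for i in range(k)]


def simulate_var1(A, T, seed=0):
    """Simulate T observations from a VAR(1) with standard normal shocks."""
    rng = random.Random(seed)
    k = len(A)
    Y = [[0.0] * k]
    for _ in range(T - 1):
        prev = Y[-1]
        Y.append([sum(A[i][j] * prev[j] for j in range(k)) + rng.gauss(0.0, 1.0)
                  for i in range(k)])
    return Y


if __name__ == "__main__":
    # Hypothetical example: 5 series whose true coefficient matrix is diagonal.
    k = 5
    A_true = [[0.6 if i == j else 0.0 for j in range(k)] for i in range(k)]
    Y = simulate_var1(A_true, T=400)
    A_hat = fit_sparse_var1(Y, lam=0.25)
    for row in A_hat:
        print([round(a, 2) for a in row])
```

The penalty sets small cross-series coefficients exactly to zero, which is the sense in which such a model is more parsimonious than an unrestricted VAR. The retained coefficients are shrunk toward zero, so in practice `lam` would typically be chosen by cross-validation or an information criterion rather than fixed by hand.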