Statistical Research (统计研究) ›› 2007, Vol. 24 ›› Issue (8): 72-76.

• Articles •

Research on Robust Principal Component Analysis Methods and Their Application in Economics and Management

王斌会   

  1. Department of Statistics, College of Economics, Jinan University
  • Received: 1900-01-01 Revised: 1900-01-01 Published: 2007-08-15 Online: 2007-08-15

Robust Principal Component Analysis Method and its Application

Wang Binhui   

  • Received:1900-01-01 Revised:1900-01-01 Online:2007-08-15 Published:2007-08-15

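Abstract: Traditional multivariate statistical methods, such as principal component analysis and factor analysis, share a common starting point: they compute the sample mean vector and covariance matrix and derive all other statistics from these two. When the sample contains no outliers, these methods give good results. When it does contain outliers, the results are easily distorted by them, because the classical mean vector and covariance matrix are not robust statistics. This paper studies the algorithm of the currently popular FAST-MCD method, constructs a robust mean vector and a robust covariance matrix, applies them to principal component analysis, and proposes improvements to address the method's shortcomings. Simulation and empirical results show that the improved method and the new robust estimators resist outliers well and substantially reduce their influence on the computed results.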

Abstract: Traditional multivariate analysis methods, such as principal component analysis (PCA) and factor analysis, have in common that they compute the sample mean vector and covariance matrix and derive other statistics from them. When the sample contains no outliers, these methods give good results, but when outliers are present the results are easily affected by them. This paper studies the currently popular FAST-MCD method, improves it to overcome its shortcomings, and constructs a robust mean vector and a robust covariance matrix that are applied in PCA. Simulation and empirical results show that the improved method and the new robust estimators resist outliers well and greatly reduce their influence.
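The idea of replacing the classical mean vector and covariance matrix with MCD-based estimates before running PCA can be illustrated with a minimal sketch. This is not the paper's improved method, which the abstract does not specify, and it is not the full FAST-MCD algorithm: it only uses random (p + 1)-point starts followed by concentration steps (C-steps), without nested partitioning, reweighting, or the consistency correction. The function names `c_step_mcd`, `pca_from_cov`, and `demo`, the synthetic data, and the choice of h as 75% of the sample are all assumptions made for illustration.

```python
import numpy as np


def c_step_mcd(X, h, n_starts=50, max_iter=100, seed=0):
    """Approximate MCD location and scatter via random starts plus C-steps."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    best_det, best_subset = np.inf, None
    for _ in range(n_starts):
        # Begin from a random (p + 1)-point subset.
        subset = rng.choice(n, size=p + 1, replace=False)
        for _ in range(max_iter):
            mu = X[subset].mean(axis=0)
            cov = np.cov(X[subset], rowvar=False)
            diff = X - mu
            # Squared Mahalanobis distances of all points to the current fit;
            # pinv guards against a singular starting covariance.
            d2 = np.einsum("ij,jk,ik->i", diff, np.linalg.pinv(cov), diff)
            # C-step: keep the h points closest to the current fit.
            new_subset = np.sort(np.argsort(d2)[:h])
            if np.array_equal(new_subset, np.sort(subset)):
                break
            subset = new_subset
        det = np.linalg.det(np.cov(X[subset], rowvar=False))
        if det < best_det:
            best_det, best_subset = det, subset
    location = X[best_subset].mean(axis=0)
    # Raw (unscaled) scatter; a scalar factor does not change PCA directions.
    scatter = np.cov(X[best_subset], rowvar=False)
    return location, scatter


def pca_from_cov(cov):
    """Return eigenvalues (descending) and matching eigenvectors of cov."""
    eigvals, eigvecs = np.linalg.eigh(cov)
    order = np.argsort(eigvals)[::-1]
    return eigvals[order], eigvecs[:, order]


def demo(seed=1):
    """Compare classical and MCD-based first components on contaminated data."""
    rng = np.random.default_rng(seed)
    # Clean data: first principal direction is (1, 1) / sqrt(2).
    clean = rng.multivariate_normal([0, 0], [[4.0, 3.6], [3.6, 4.0]], size=200)
    # Hypothetical contamination: a tight cluster far off the main axis.
    outliers = rng.normal(loc=[10.0, -10.0], scale=0.5, size=(20, 2))
    X = np.vstack([clean, outliers])
    true_dir = np.array([1.0, 1.0]) / np.sqrt(2.0)

    _, classical_vecs = pca_from_cov(np.cov(X, rowvar=False))
    location, scatter = c_step_mcd(X, h=int(0.75 * len(X)))
    _, robust_vecs = pca_from_cov(scatter)

    # |cosine| between each estimated first component and the true direction.
    classical_cos = abs(classical_vecs[:, 0] @ true_dir)
    robust_cos = abs(robust_vecs[:, 0] @ true_dir)
    return classical_cos, robust_cos, location


if __name__ == "__main__":
    c_cos, r_cos, loc = demo()
    print("classical |cos|:", c_cos)
    print("robust    |cos|:", r_cos)
    print("robust location:", loc)
```

Calling `demo()` returns the absolute cosine between each estimated first component and the direction used to generate the clean data. A value near 1 means the component is essentially unaffected by the contaminating cluster, while a value near 0 means the component has been rotated toward the outliers.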

 

Key words: Outliers, FAST-MCD algorithm, PCA, Robust PCA