Statistical Research (统计研究), 2021, Vol. 38, Issue 2: 114-134. doi: 10.19343/j.cnki.11-1302/c.2021.02.009


Research on Tensor Sufficient Dimension Reduction Method and Its Application in Big Data (大数据下张量充分降维方法及其应用研究)

Ma Shaopei (马少沛), Sun Qinghui (孙庆慧), Wu Yaxuan (武雅萱), Tian Maozai (田茂再)

  • Online: 2021-02-25 Published: 2021-02-25

Abstract: In the era of big data, fields such as finance, genomics, and image processing generate large amounts of tensor data. Zhong et al. (2015) proposed tensor sufficient dimension reduction and gave a sequential iterative algorithm for second-order tensors. Because higher-order tensors are widely used in practice, this paper extends the algorithm of Zhong et al. (2015) to higher orders and, taking third-order tensors as an example, proposes two algorithms: a structural transformation algorithm and a structural maintenance algorithm. Both preserve the original structural information of the tensor to varying degrees while effectively reducing the variable dimension and the computational complexity and avoiding a singular covariance matrix. The two algorithms are applied to the classification and recognition of color facial images, and the classification results are displayed in 2-D and 3-D scatter plots. The structural maintenance algorithm is compared with five methods: K-means clustering, t-SNE nonlinear dimension reduction, multilinear principal component analysis, multilinear discriminant analysis, and tensor sliced inverse regression (tensor SIR). The results show that the proposed method has a clear advantage in classification accuracy and therefore has broad prospects in image recognition and related applications.

Key words: Tensor, Sufficient Dimension Reduction, Iterative Algorithm, Image Recognition
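
The abstract's remarks on covariance singularity and structure preservation can be made concrete with a toy example. The sketch below is not the paper's algorithm: the sizes are made up, the helper names `mode_product` and `multilinear_project` are hypothetical, and the projection matrices are random orthonormal placeholders rather than directions estimated by any sufficient dimension reduction method. It only shows that vectorizing third-order samples gives a covariance matrix whose rank is capped by the sample size, and that projecting one mode at a time keeps the tensor shape with far fewer parameters.

```python
import numpy as np

def mode_product(tensor, matrix, mode):
    """Multiply `tensor` along axis `mode` by `matrix` (shape: new_dim x old_dim)."""
    moved = np.moveaxis(tensor, mode, 0)          # bring the target mode to the front
    shape = moved.shape
    flat = moved.reshape(shape[0], -1)            # unfold the tensor along that mode
    out = (matrix @ flat).reshape((matrix.shape[0],) + shape[1:])
    return np.moveaxis(out, 0, mode)              # put the reduced mode back in place

def multilinear_project(tensor, factors):
    """Apply one projection matrix (p_k x d_k) per mode, keeping the tensor structure."""
    out = tensor
    for mode, factor in enumerate(factors):
        out = mode_product(out, factor.T, mode)
    return out

rng = np.random.default_rng(0)
n, dims, reduced = 20, (8, 8, 3), (2, 2, 1)       # toy sizes, assumed for illustration
images = rng.normal(size=(n,) + dims)             # n samples of a third-order tensor

# Vectorizing each sample gives p = 8 * 8 * 3 = 192 variables. With n = 20 < p,
# the 192 x 192 sample covariance has rank at most n - 1, so it is singular.
vectorized = images.reshape(n, -1)
cov = np.cov(vectorized, rowvar=False)
cov_rank = np.linalg.matrix_rank(cov)

# Placeholder orthonormal directions (random, NOT estimated from the data).
factors = [np.linalg.qr(rng.normal(size=(p, d)))[0] for p, d in zip(dims, reduced)]
projected = np.stack([multilinear_project(x, factors) for x in images])

full_params = int(np.prod(dims)) * int(np.prod(reduced))  # one unstructured projection
mode_params = sum(p * d for p, d in zip(dims, reduced))   # one small matrix per mode

print("covariance shape:", cov.shape, "rank:", cov_rank)
print("projected shape:", projected.shape)
print("parameters, unstructured vs mode-wise:", full_params, mode_params)
```

With these toy sizes the unstructured projection needs 768 parameters, while the mode-wise projection needs 35 and only ever works with matrices of size 8 x 8 or 3 x 3.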