Statistical Research (统计研究), 2020, Vol. 37, Issue 8: 91-103. doi: 10.19343/j.cnki.11-1302/c.2020.08.007

Functional Clustering Algorithm Based on Non-negative Matrix Factorization

Gao Haiyan, Huang Hengjun, Wang Yuchen

  • Online: 2020-08-25 Published: 2020-08-26

Abstract: Functional clustering algorithms involve two basic elements: projection and clustering. In general, the optimal projection does not necessarily preserve category information, which can degrade the subsequent clustering. This paper reviews the components and operating process of functional clustering and, drawing on the clustering property of non-negative matrix factorization, proposes a functional clustering algorithm based on non-negative matrix factorization. It constructs a one-step implementation framework that clusters while projecting, solves the problem by alternating iterative updates, and analyzes the computational time complexity of the algorithm. Tests on randomly simulated data and on speech recognition data show that the proposed algorithm helps improve clustering performance. A case study on hourly nitrogen dioxide (NO2) concentration data in Beijing shows that the algorithm, in distinguishing types of air quality monitoring stations, can fully identify the spatial pattern of the station layout and has good practical application value.

Key words: Functional Data Analysis, Clustering, Non-negative Matrix Factorization, Alternating Iterative Method
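
The abstract does not give the paper's objective function or update rules, so the following is only a minimal sketch of the general idea it builds on: clustering curves with a plain non-negative matrix factorization solved by alternating updates. It assumes the curves are already sampled on a common grid and are nonnegative, and it uses the classical multiplicative updates for the Frobenius-norm objective rather than the authors' joint projection-and-clustering formulation. The function name `nmf_cluster`, the toy bump-shaped curves, and all parameter values are illustrative assumptions.

```python
import numpy as np

def nmf_cluster(X, k, n_iter=500, seed=0, eps=1e-10):
    """Factor a nonnegative matrix X (grid points x curves) as W @ H and
    assign each curve to the component with the largest coefficient.

    Assumption: classical multiplicative updates for ||X - W H||_F^2,
    not the algorithm proposed in the paper."""
    rng = np.random.default_rng(seed)
    n_points, n_curves = X.shape
    W = rng.random((n_points, k)) + eps   # basis curves, one per column
    H = rng.random((k, n_curves)) + eps   # coefficients, one column per curve
    for _ in range(n_iter):
        # Alternate: update H with W fixed, then W with H fixed.
        H *= (W.T @ X) / (W.T @ W @ H + eps)
        W *= (X @ H.T) / (W @ (H @ H.T) + eps)
    # Rescale so each basis curve sums to 1 and move the scale into H,
    # which makes the coefficients comparable across components.
    scale = np.maximum(W.sum(axis=0), eps)
    W /= scale
    H *= scale[:, None]
    labels = H.argmax(axis=0)
    return W, H, labels

# Toy functional data (hypothetical): 20 curves on a 50-point grid,
# the first 10 peaked near t = 0.25 and the last 10 near t = 0.75.
t = np.linspace(0.0, 1.0, 50)
bump_a = np.exp(-0.5 * ((t - 0.25) / 0.05) ** 2)
bump_b = np.exp(-0.5 * ((t - 0.75) / 0.05) ** 2)
rng = np.random.default_rng(1)
amps = rng.uniform(0.5, 1.5, size=20)
curves = [amps[i] * (bump_a if i < 10 else bump_b) for i in range(20)]
X = np.column_stack(curves) + 0.01 * rng.random((50, 20))

W, H, labels = nmf_cluster(X, k=2)
print(labels)
```

Here the columns of `W` play the role of the low-dimensional basis and the columns of `H` carry the cluster information, so representation and cluster assignment come out of the same factorization instead of two separate steps. Which cluster receives index 0 or 1 depends on the random initialization.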