Statistical Research (统计研究) ›› 2011, Vol. 28 ›› Issue (2): 87-92.

Research on Dealing with Missing Data Based on Clustering and Association Rule

FANG Kuang-Nan (方匡南), XIE Bang-Chang (谢邦昌)

  • Online: 2011-02-15 Published: 2011-02-25

Abstract: This paper proposes a new method for handling missing data based on clustering and association rules. First, a clustering method groups similar records of the incomplete data set into clusters. An improved association rule method is then applied to each resulting subset to mine the associations among variables, and these associations are used to fill in the missing values. An empirical study shows that the method handles missing data well, particularly in massive data sets.

Key words: Clustering, Association Rule, Missing Data
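
The procedure outlined in the abstract (cluster the records, mine rules inside each cluster, fill the gaps from those rules) can be illustrated with a minimal sketch. It is not the authors' algorithm: the abstract does not specify the clustering method or the "improved" association rule, so this sketch assumes categorical data, a toy deterministic k-modes clustering, and plain single-antecedent rules ranked by confidence. All function names, the thresholds, and the sample data are hypothetical.

```python
from collections import Counter, defaultdict

def distance(rec, center):
    # Hamming distance over the attributes observed in rec (missing values are skipped).
    return sum(1 for a, b in zip(rec, center) if a is not None and a != b)

def column_mode(records, j):
    # Most common observed value in column j, or None if the column is entirely missing.
    counts = Counter(r[j] for r in records if r[j] is not None)
    return counts.most_common(1)[0][0] if counts else None

def k_modes(records, k, iters=10):
    # Assumption: a toy k-modes seeded with the first k distinct complete records.
    # Requires at least one complete record in the data.
    centers = []
    for r in records:
        if None not in r and r not in centers:
            centers.append(r)
        if len(centers) == k:
            break
    labels = [0] * len(records)
    for _ in range(iters):
        labels = [min(range(len(centers)), key=lambda c: distance(r, centers[c]))
                  for r in records]
        for c in range(len(centers)):
            members = [r for r, l in zip(records, labels) if l == c]
            if members:
                modes = [column_mode(members, j) for j in range(len(centers[c]))]
                centers[c] = tuple(m if m is not None else old
                                   for m, old in zip(modes, centers[c]))
    return labels

def mine_rules(records, min_support=0.2, min_confidence=0.6):
    # Assumption: rules of the form (attr i == a) -> (attr j == b), not the paper's improved method.
    n = len(records)
    antecedent, pair = Counter(), Counter()
    for r in records:
        for i, a in enumerate(r):
            if a is None:
                continue
            antecedent[(i, a)] += 1
            for j, b in enumerate(r):
                if j != i and b is not None:
                    pair[(i, a, j, b)] += 1
    rules = defaultdict(list)  # (i, a, j) -> list of (confidence, b)
    for (i, a, j, b), count in pair.items():
        confidence = count / antecedent[(i, a)]
        if count / n >= min_support and confidence >= min_confidence:
            rules[(i, a, j)].append((confidence, b))
    return rules

def impute(records, k=2):
    # Cluster, mine rules per cluster, then fill each gap with the highest-confidence
    # consequent; fall back to the cluster's column mode when no rule applies.
    labels = k_modes(records, k)
    filled = list(records)
    for c in set(labels):
        idx = [i for i, l in enumerate(labels) if l == c]
        members = [records[i] for i in idx]
        rules = mine_rules(members)
        for i in idx:
            rec = list(records[i])
            for j, value in enumerate(rec):
                if value is None:
                    candidates = [cand for p, a in enumerate(records[i]) if a is not None
                                  for cand in rules.get((p, a, j), [])]
                    rec[j] = max(candidates)[1] if candidates else column_mode(members, j)
            filled[i] = tuple(rec)
    return filled

# Hypothetical sample data: None marks a missing value.
data = [
    ("young", "student", "low"),
    ("young", "student", "low"),
    ("young", "student", None),
    ("old", "retired", "high"),
    ("old", "retired", "high"),
    ("old", None, "high"),
]
filled = impute(data, k=2)
```

Mining rules per cluster rather than over the whole data set keeps each rule local to records that already resemble one another, which is the motivation the abstract gives for the clustering step.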