Statistical Research (统计研究)

谷歌流感趋势的成功与失误

秦磊 谢邦昌   

  • 出版日期:2016-02-15 发布日期:2016-03-02

Success and Failure of Google Flu Trends

Qin Lei & Xie Bangchang   

  • Online:2016-02-15 Published:2016-03-02

摘要: 大数据时代下机遇与挑战并存,如何基于传统方法去处理大数据引人深思,一味地追求大数据也不一定正确。本文以谷歌流感趋势(GFT)为案例,介绍了大数据在疾病疫情监测方面的主要技术及相关成果,阐述了大数据在使用中的关键问题,并结合复杂的统计学工具给出了一些改进措施。谷歌流感趋势的成功取决于相关关系的应用,其失误却来源于模型的构造、因果关系和相关关系的冲突等问题。谷歌流感趋势案例的分析与启示对政府今后在大数据解决方案中有重要的理论和实践意义。

关键词: 谷歌流感趋势, 大数据, 小数据, 降维, 回归预测

Abstract: Opportunities and challenges coexist in the big data era. How to handle big data with traditional methods deserves careful thought, and the blind pursuit of big data is not necessarily correct. Taking Google Flu Trends (GFT) as a case study, this paper introduces the main techniques and results of big-data disease surveillance, sets out the key problems that arise when big data is used, and proposes some improvements based on sophisticated statistical tools. The success of GFT rests on the application of correlation, whereas its failures stem from model construction, the conflict between causation and correlation, and other issues. The analysis of the GFT case and the lessons drawn from it have theoretical and practical significance for governments adopting big data solutions in the future.

Key words: Google Flu Trends, Big Data, Small Data, Dimension Reduction, Regression Forecast