Statistical Research (统计研究)


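Several Issues in the Development of Statistics in the Big Data Era (大数据时代统计学发展的若干问题)

“Statistical Methods in Big Data” Working Group (“大数据中的统计方法”课题组)

  • Publication date: 2017-01-15 Release date: 2017-02-09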

Reflections on the Positioning of Statistics in the Big Data Era

“Statistical Methods in Big Data” Working Group   

  • Online: 2017-01-15 Published: 2017-02-09

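Abstract: In recent years, the development of computers and the internet has brought the amount of information held by humankind to an unprecedented level. Information of all kinds is now stored and circulated, and humanity has entered the era of big data. Big data is characterized by volume, variety, and velocity, which brings new opportunities as well as new challenges for the development of statistics. This paper reviews the history of statistics, analyzes the characteristics of its development, and on that basis discusses the positioning of statistics against the background of big data. It then analyzes the relationship between statistics and computing and, finally, offers the authors' views on several misconceptions in big data analysis.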

Key words: Big Data, Computers, Causality, Sampling, Data Quality

Abstract: In recent decades, advances in computer science and internet technologies have enabled researchers to collect, store, and analyze data at an unparalleled speed, ushering in the era of big data. Big data have distinctive characteristics (volume, variety, velocity, and veracity), which bring both opportunities and challenges to statistics and statisticians. In this article, we review the history of statistical methodology and analyze the characteristics of its development. On that basis, we propose a positioning for statistics in the big data era and discuss the interconnections and interactions between statistics and computer science and internet technologies. Finally, we clarify a few misunderstandings in big data analysis.

Key words: Big Data, Computer Science, Causality, Sampling, Data Quality