Statistical Research (统计研究), 2017, Vol. 34, Issue (11): 3-14. doi: 10.19343/j.cnki.11-1302/c.2017.11.001

Some Understandings on Statistical Data (关于统计数据的几点认识)

Li Jinchang (李金昌)

  • Online: 2017-11-15 Published: 2017-11-25

Abstract: This paper offers a focused discussion of a concept familiar to everyone, "statistical data." Starting from an understanding of data, it first examines the defining features of statistical data and how they have evolved, and holds that every recorded fact is data and that all data that can be processed by statistical methods are statistical data. It then uses the historically famous Pearson debate, over whether parental alcoholism affects the physical and mental health of offspring, to illustrate the difficulties that statistical data pose in applied statistical analysis, and argues that the constituent elements of statistics are problems, data and methods: methods revolve around data, and data follow the problem. Next, it addresses what Big Data thinking is, what Small Data is and why Small Data should be studied, and argues that Small Data should be studied on the basis of Big Data and that Big Data should be mined on the basis of Small Data. Finally, it points out that fully mining the value of data is the direction in which statistics is developing, summarizes the basic routes of Big Data analysis, and proposes that Big Data analysis should concentrate on overcoming a series of complex problems, such as statistical measurement, data silos, semi-randomness, outliers, and the combination of induction and deduction.

Key words: Data, Statistical Data, Big Data, Small Data, Statistics