统计研究 (Statistical Research)

• Article •

数据科学的统计学内涵

魏瑾瑞 蒋萍   

  • 出版日期:2014-05-15 发布日期:2014-05-12

The Statistical Connotation of Data Science

Jinrui Wei & Ping Jiang   

  • Online: 2014-05-15 Published: 2014-05-12

摘要: 数据科学以大数据为研究对象,而大数据对统计分析最直接的冲击莫过于数据收集方式的变革,同时统计分析的视野也不再局限于传统的属性数据,而是包括了关系数据、非结构、半结构数据等其他类型更丰富的数据。伴随着数据开放运动,数据库之间的关联信息的价值逐步得到体现。基于统计学的视角分别从科学理论基础、计算机处理技术和商业应用等三个维度研究了数据科学的统计学内涵,探讨了数据科学范式对统计分析过程的直接影响,以及统计学视角面临的机遇与挑战。

关键词: 数据科学, 大数据, 统计学, 计算机科学

Abstract: Data science takes big data as its object of study. The most direct impact of big data on statistical analysis is the change in how data are collected. At the same time, the scope of statistical analysis is no longer limited to traditional attribute data; it now includes relational, unstructured, semi-structured, and other richer types of data. With the open data movement, the value of the linkage information between databases is increasingly recognized. From a statistical perspective, this paper examines the statistical connotation of data science along three dimensions: scientific theoretical foundations, computer processing technology, and business applications. It also discusses the direct impact of the data science paradigm on the process of statistical analysis, and the opportunities and challenges facing the statistical perspective.

Keywords: Data Science, Big Data, Statistics, Computer Science