Statistical Research (统计研究) ›› 2023, Vol. 40 ›› Issue (3): 151-160. doi: 10.19343/j.cnki.11–1302/c.2023.03.012

Research on Automated Coding of Industries and Occupations in the Population Census: International Experience and the Process in China

Sun Wangshu, Sun Xu

  • Online: 2023-03-25 Published: 2023-03-25

Abstract: Industry and occupation coding is the data-processing step that links the collection of industry and occupation information in a population census to its quantitative analysis. With the rapid development of information technology and its wide use in social management, census industry and occupation coding has begun to move toward intelligent automation. Automated coding greatly reduces the reliance on human coders, lowers coding costs, improves the timeliness of the data, and controls reproducibility error in the coding stage. Drawing on domestic and international research, this paper summarizes the two basic approaches to automated coding and describes the principles and techniques of the dictionary coding method and the model coding method, together with their application in census coding practice. In China's first six population censuses, industry and occupation coding was done entirely by hand; the seventh national population census adopted computer-assisted coding and took a first step toward intelligent coding. Going forward, China can selectively draw on other countries' experience with coding in social surveys, move toward fully intelligent coding, and further improve how well automated coding serves the precise management of Chinese society.

Key words: Industry and Occupation Coding, Population Census, Information Technology, Text Retrieval, Machine Learning