Statistical Research (统计研究) ›› 2023, Vol. 40 ›› Issue (4): 138-150. doi: 10.19343/j.cnki.11–1302/c.2023.04.011

Integrative One-Class SVM for Multi-Source Anomaly Detection and Its Application

Zhang Qingzhao (张庆昭), Chen Ziyi (陈子怡), Fang Kuangnan (方匡南)

  • Online: 2023-04-25 Published: 2023-04-25

Abstract: Anomaly detection is an intelligent tool for data governance and plays an important role in network intrusion detection, fraud identification, and fault detection. In the era of big data, data arrive from many sources, which makes anomaly detection modeling on multi-source datasets considerably more challenging. This paper applies the idea of penalized integrative analysis to anomaly detection: by penalizing the pairwise differences between the model coefficients of different datasets, it proposes an integrative one-class SVM method for anomaly detection on multi-source data. The method detects anomalies in all sources simultaneously and automatically groups similar datasets into the same cluster, which greatly reduces the number of parameters to be estimated and lowers subsequent maintenance costs. Simulation experiments show that the proposed method not only clusters the datasets accurately but also predicts better than either fitting a single model to the merged datasets or modeling each dataset separately. The method also performs well when applied to anomaly detection in the website logs of a bank.
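The abstract describes the method only in words. As a rough illustration, and not the authors' implementation, the sketch below assumes a linear one-class SVM loss for each source (the standard ν-formulation), an unsquared L2 fusion penalty on pairwise coefficient differences, and plain subgradient descent. The paper's actual penalty function and optimization algorithm are not stated in the abstract, so these are all assumptions, and every name here (`ocsvm_loss`, `integrative_objective`, `fit_integrative_ocsvm`, `nu`, `lam`) is hypothetical.

```python
import numpy as np


def ocsvm_loss(w, rho, X, nu):
    """Linear one-class SVM objective for a single dataset (nu-formulation)."""
    hinge = np.maximum(0.0, rho - X @ w)  # margin violations
    return 0.5 * (w @ w) - rho + hinge.mean() / nu


def integrative_objective(W, rhos, datasets, nu, lam):
    """Per-source losses plus a pairwise fusion penalty (assumed L2 form)."""
    M = len(datasets)
    fit = sum(ocsvm_loss(W[m], rhos[m], X, nu) for m, X in enumerate(datasets))
    fuse = sum(
        np.linalg.norm(W[m] - W[k]) for m in range(M) for k in range(m + 1, M)
    )
    return fit + lam * fuse


def fit_integrative_ocsvm(datasets, nu=0.1, lam=0.1, lr=0.01, n_iter=2000, eps=1e-8):
    """Subgradient descent on the objective above (an assumed optimizer)."""
    M, p = len(datasets), datasets[0].shape[1]
    W = np.zeros((M, p))
    rhos = np.zeros(M)
    for _ in range(n_iter):
        gW = np.zeros_like(W)
        grho = np.zeros(M)
        # Subgradient of each source's one-class SVM loss.
        for m, X in enumerate(datasets):
            active = (rhos[m] - X @ W[m]) > 0  # points violating the margin
            gW[m] = W[m] - X[active].sum(axis=0) / (nu * len(X))
            grho[m] = -1.0 + active.mean() / nu
        # Subgradient of the fusion penalty: pulls coefficient vectors together.
        for m in range(M):
            for k in range(m + 1, M):
                d = W[m] - W[k]
                g = lam * d / (np.linalg.norm(d) + eps)
                gW[m] += g
                gW[k] -= g
        W -= lr * gW
        rhos -= lr * grho
    return W, rhos


def anomaly_scores(X, w, rho):
    """Negative values mark points on the anomalous side of the hyperplane."""
    return X @ w - rho


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Two similar sources and one different source (synthetic, for illustration).
    sources = [
        rng.normal(loc=[3.0, 3.0], scale=0.5, size=(200, 2)),
        rng.normal(loc=[3.0, 3.0], scale=0.5, size=(200, 2)),
        rng.normal(loc=[-3.0, 3.0], scale=0.5, size=(200, 2)),
    ]
    W, rhos = fit_integrative_ocsvm(sources)
    for m in range(3):
        for k in range(m + 1, 3):
            print(m, k, np.linalg.norm(W[m] - W[k]))
```

Sources whose fitted rows of `W` end up close together could then be treated as one cluster. With subgradient descent the rows are only approximately fused, so a distance threshold would be needed to read off the grouping.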

Keywords: Anomaly Detection, One-Class SVM, Multi-Source Data, Integrative Analysis