Integrative One-Class SVM for Multi-Source Anomaly Detection and Its Application

Zhang Qingzhao, Chen Ziyi, Fang Kuangnan

Statistical Research (统计研究), 2023, Vol. 40, Issue 4: 138-150. doi: 10.19343/j.cnki.11-1302/c.2023.04.011

Online: 2023-04-25. Published: 2023-04-25

Abstract

Abstract: As an intelligent means of data management and control, anomaly detection plays an important role in network intrusion detection, fraud identification, and fault detection. In the era of big data, data come from many sources, which poses considerable challenges for anomaly detection modeling on multi-source datasets. This paper applies the idea of penalized integrative analysis to anomaly detection: by penalizing the pairwise differences between the model coefficients of different datasets, it proposes an integrative one-class SVM anomaly detection method for multi-source data. The method detects anomalies in multiple data sources simultaneously and automatically groups similar datasets into one cluster, which substantially reduces the number of parameters to be estimated and lowers later maintenance costs. Simulation experiments show that the proposed method not only clusters the datasets accurately but also predicts better than either merging all datasets into a single model or modeling each dataset separately. The method also performs well when applied to anomaly detection on the website logs of a bank.

Key words: Anomaly Detection, One-Class SVM, Multi-Source Dataset, Integrative Analysis
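
The abstract describes the method only in words: a one-class SVM is fitted to each data source, and the pairwise differences between the sources' coefficients are penalized. As a rough illustration, one generic way to write such an objective is sketched below. The per-source term is the standard one-class SVM primal; everything else is an assumption made for illustration and may differ from the paper's actual formulation: the symbols $M$, $n_m$, $w_m$, $\rho_m$, $\xi_{mi}$, $\phi$, the penalty function $p_\lambda$, the choice of norm, and the decision to penalize only $w_m$.

```latex
% Illustrative sketch only; the penalty p_lambda and all notation are assumptions,
% not taken from the paper.
\[
\begin{aligned}
\min_{\{w_m,\,\rho_m,\,\xi_m\}_{m=1}^{M}} \quad
  & \sum_{m=1}^{M} \left[ \frac{1}{2}\,\lVert w_m \rVert^2
      + \frac{1}{\nu\, n_m} \sum_{i=1}^{n_m} \xi_{mi} - \rho_m \right]
    + \sum_{1 \le m < m' \le M} p_\lambda\!\left( \lVert w_m - w_{m'} \rVert \right) \\
\text{s.t.} \quad
  & w_m^{\top} \phi(x_{mi}) \ge \rho_m - \xi_{mi}, \qquad
    \xi_{mi} \ge 0, \qquad i = 1,\dots,n_m,\; m = 1,\dots,M .
\end{aligned}
\]
```

Under this reading, a sparsity-inducing $p_\lambda$ can shrink some differences $w_m - w_{m'}$ exactly to zero, so the corresponding sources share one coefficient vector. That would produce the automatic grouping of similar datasets and the reduction in parameters that the abstract mentions. Taking $p_\lambda \equiv 0$ recovers separate per-source models, while forcing all $w_m$ to be equal is close in spirit to fitting one model on the merged data.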

Zhang Qingzhao et al. Integrative One-Class SVM for Multi-Source Anomaly Detection and Its Application[J]. Statistical Research, 2023, 40(4): 138-150.

