네트워크 침입탐지에서 데이터 불균형을 고려한 균형 랜덤포레스트의 효과 : NSL-KDD 데이터셋을 중심으로
Effects of Balanced Random Forest Considering Data Imbalance in Network Intrusion Detection: Focusing on NSL-KDD Dataset

Abstract

As Internet use grows, machine learning methods for network intrusion detection are being actively studied as a way to respond to external intrusion threats. However, the random forest method for intrusion detection suffers from a data imbalance problem caused by minority classes. Classification in general, including network intrusion detection, often targets the accuracy of the model as a whole rather than the problems caused by such minority classes, so data imbalance is not easy to address. In this paper, we demonstrate the data imbalance problem in the basic random forest (RF) model used for network intrusion detection and present the application and effects of the balanced random forest (BRF). RF and BRF models were built with the well-known KDDTrain+ data and evaluated on the KDDTest+ data. Compared with RF, BRF tended to show higher recall and lower accuracy for intrusion types belonging to minority classes. Despite the decrease in accuracy, this effect of BRF can be expected to reduce serious damage by detecting intrusion types with high probability.
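
The class-balancing idea behind BRF can be illustrated with a minimal sketch. It assumes that each tree is trained on a bootstrap sample in which every class is drawn, with replacement, down to the size of the smallest class; the abstract does not state which BRF variant or library the paper used. The function names `balanced_bootstrap` and `per_class_recall` and the toy class counts are hypothetical and are not taken from the paper or from NSL-KDD.

```python
import random
from collections import Counter, defaultdict


def balanced_bootstrap(labels, rng):
    """Return row indices for one balanced bootstrap sample.

    Assumption: every class contributes as many draws (with replacement)
    as the smallest class has members, so a tree trained on this sample
    sees equal class counts.
    """
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    n_min = min(len(idxs) for idxs in by_class.values())
    sample = []
    for idxs in by_class.values():
        sample.extend(rng.choices(idxs, k=n_min))
    return sample


def per_class_recall(y_true, y_pred):
    """Recall per class: correctly predicted members / actual members."""
    totals = Counter(y_true)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {c: hits[c] / totals[c] for c in totals}


# Toy label distribution (hypothetical counts, not NSL-KDD figures).
labels = ["normal"] * 900 + ["dos"] * 80 + ["u2r"] * 20
rng = random.Random(0)
sample = balanced_bootstrap(labels, rng)
counts = Counter(labels[i] for i in sample)
print(counts)  # each class appears 20 times in the sample
```

Reporting recall per class, as `per_class_recall` does, rather than overall accuracy alone is what makes minority-class behaviour visible when comparing an RF with a BRF.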

Keywords

Intrusion Detection, Balanced Random Forest, Data Imbalance, NSL-KDD
Author
윤한성
Publication date
2024-12
Journal
(사)디지털산업정보학회 논문지
Volume
20
Issue
4
Pages
17 ~ 26