Phishing Webpage Detection via Multi-Modal Integration of HTML DOM Graphs and URL Features Based on Graph Convolutional and Transformer Networks
Citations

WEB OF SCIENCE

6
Citations

SCOPUS

12

초록

Detecting phishing webpages is a critical task in the field of cybersecurity, with significant implications for online safety and data protection. Traditional methods have primarily relied on analyzing URL features, which can be limited in capturing the full context of phishing attacks. In this study, we propose an innovative approach that integrates HTML DOM graph modeling with URL feature analysis using advanced deep learning techniques. The proposed method leverages Graph Convolutional Networks (GCNs) to model the structure of HTML DOM graphs, combined with Convolutional Neural Networks (CNNs) and Transformer Networks to capture the character and word sequence features of URLs, respectively. These multi-modal features are then integrated using a Transformer network, which is adept at selectively capturing the interdependencies and complementary relationships between different feature sets. We evaluated our approach on a real-world dataset comprising URL and HTML DOM graph data collected from 2012 to 2024. This dataset includes over 80 million nodes and edges, providing a robust foundation for testing. Our method demonstrated a significant improvement in performance, achieving a 7.03 percentage point increase in classification accuracy compared to state-of-the-art techniques. Additionally, we conducted ablation tests to further validate the effectiveness of individual features in our model. The results validate the efficacy of integrating HTML DOM structure and URL features using deep learning. Our framework significantly enhances phishing detection capabilities, providing a more accurate and comprehensive solution to identifying malicious webpages. © 2024 by the authors.

키워드

cyberspace securitygraph convolutional networkmulti-modal integrationphishing webpage detectiontransformer network
제목
Phishing Webpage Detection via Multi-Modal Integration of HTML DOM Graphs and URL Features Based on Graph Convolutional and Transformer Networks
저자
Yoon, Jun-HoBuu, Seok-JunKim, Hae-Jung
DOI
10.3390/electronics13163344
발행일
2024-08
유형
Article
저널명
Electronics (Basel)
13
16