Detailed Information

Cited 3 time in webofscience Cited 3 time in scopus
Metadata Downloads

Phishing Webpage Detection via Multi-Modal Integration of HTML DOM Graphs and URL Features Based on Graph Convolutional and Transformer Networksopen access

Authors
Yoon, Jun-HoBuu, Seok-JunKim, Hae-Jung
Issue Date
Aug-2024
Publisher
MDPI AG
Keywords
cyberspace security; graph convolutional network; multi-modal integration; phishing webpage detection; transformer network
Citation
Electronics (Basel), v.13, no.16
Indexed
SCIE
SCOPUS
Journal Title
Electronics (Basel)
Volume
13
Number
16
URI
https://scholarworks.gnu.ac.kr/handle/sw.gnu/73988
DOI
10.3390/electronics13163344
ISSN
2079-9292
2079-9292
Abstract
Detecting phishing webpages is a critical task in the field of cybersecurity, with significant implications for online safety and data protection. Traditional methods have primarily relied on analyzing URL features, which can be limited in capturing the full context of phishing attacks. In this study, we propose an innovative approach that integrates HTML DOM graph modeling with URL feature analysis using advanced deep learning techniques. The proposed method leverages Graph Convolutional Networks (GCNs) to model the structure of HTML DOM graphs, combined with Convolutional Neural Networks (CNNs) and Transformer Networks to capture the character and word sequence features of URLs, respectively. These multi-modal features are then integrated using a Transformer network, which is adept at selectively capturing the interdependencies and complementary relationships between different feature sets. We evaluated our approach on a real-world dataset comprising URL and HTML DOM graph data collected from 2012 to 2024. This dataset includes over 80 million nodes and edges, providing a robust foundation for testing. Our method demonstrated a significant improvement in performance, achieving a 7.03 percentage point increase in classification accuracy compared to state-of-the-art techniques. Additionally, we conducted ablation tests to further validate the effectiveness of individual features in our model. The results validate the efficacy of integrating HTML DOM structure and URL features using deep learning. Our framework significantly enhances phishing detection capabilities, providing a more accurate and comprehensive solution to identifying malicious webpages. © 2024 by the authors.
Files in This Item
There are no files associated with this item.
Appears in
Collections
ETC > Journal Articles

qrcode

Items in ScholarWorks are protected by copyright, with all rights reserved, unless otherwise indicated.

Related Researcher

Researcher Seok-Jun, Buu photo

Seok-Jun, Buu
IT공과대학 (컴퓨터공학부)
Read more

Altmetrics

Total Views & Downloads

BROWSE