A Multimodal Voice Phishing Detection System Integrating Text and Audio Analysis
  • Kim, Jiwon
  • Gu, Seuli
  • Kim, Youngbeom
  • Lee, Sukwon
  • Kang, Changgu
Abstract

Voice phishing has emerged as a critical security threat, exploiting both linguistic manipulation and advances in synthetic speech technologies. Traditional keyword-based approaches often fail to capture contextual patterns or detect forged audio, limiting their effectiveness in real-world scenarios. To address this gap, we propose a multimodal voice phishing detection system that integrates text and audio analysis. The text module employs a KoBERT-based transformer classifier with self-attention interpretation, while the audio module uses MFCC features and a CNN-BiLSTM classifier to identify synthetic speech. A fusion mechanism combines the outputs of both modalities. Experiments were conducted on real-world call transcripts, phishing datasets, and synthetic voice corpora. The results demonstrate that the proposed system consistently achieves high accuracy, precision, recall, and F1-score on validation data while maintaining robust performance in noisy and diverse real-call scenarios. Furthermore, attention-based interpretability enhances trustworthiness by revealing cross-token and discourse-level interaction patterns specific to phishing contexts. These findings highlight the potential of the proposed system as a reliable, explainable, and deployable solution for preventing the financial and social damage caused by voice phishing. Unlike prior studies limited to a single modality or shallow fusion, our work presents a fully integrated text-audio detection pipeline optimized for Korean real-world datasets and robust to noisy, multi-speaker conditions.
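The abstract does not specify how the fusion mechanism combines the two modalities. As a minimal sketch only, the Python below assumes a weighted late fusion of two per-call probabilities, and shows how precision, recall, and F1-score are computed from binary predictions. The names `ModalityScores`, `fuse`, `text_weight`, and `threshold`, as well as the default values, are hypothetical illustrations and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class ModalityScores:
    """Per-call outputs of the two upstream modules (hypothetical interface)."""
    text_phishing_prob: float    # assumed output of the text classifier, in [0, 1]
    audio_synthetic_prob: float  # assumed output of the audio classifier, in [0, 1]


def fuse(scores, text_weight=0.6, threshold=0.5):
    """Weighted late fusion: convex combination of both scores, then a threshold.

    The weight and threshold are illustrative assumptions, not reported values.
    """
    if not 0.0 <= text_weight <= 1.0:
        raise ValueError("text_weight must lie in [0, 1]")
    fused = (text_weight * scores.text_phishing_prob
             + (1.0 - text_weight) * scores.audio_synthetic_prob)
    return fused, fused >= threshold


def precision_recall_f1(y_true, y_pred):
    """Compute precision, recall, and F1 for binary labels (1 = phishing)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t and p)
    fp = sum(1 for t, p in zip(y_true, y_pred) if not t and p)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t and not p)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1
```

With this kind of late fusion, each module can be trained and evaluated separately, and the weight can be tuned on validation data. Whether the paper's system fuses at the score level or at the feature level is not stated in the abstract.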

Keywords

  • voice phishing detection
  • multimodal learning
  • audio forgery analysis
  • transformer-based text classification
DOI
10.3390/app152011170
Publication date
2025-10
Type
Article
Journal
Applied Sciences-Basel
Volume
15
Issue
20