A Multimodal Voice Phishing Detection System Integrating Text and Audio Analysis
| DC Field | Value | Language |
|---|---|---|
| dc.contributor.author | Kim, Jiwon | - |
| dc.contributor.author | Gu, Seuli | - |
| dc.contributor.author | Kim, Youngbeom | - |
| dc.contributor.author | Lee, Sukwon | - |
| dc.contributor.author | Kang, Changgu | - |
| dc.date.accessioned | 2025-11-13T08:30:16Z | - |
| dc.date.available | 2025-11-13T08:30:16Z | - |
| dc.date.issued | 2025-10 | - |
| dc.identifier.issn | 2076-3417 | - |
| dc.identifier.uri | https://scholarworks.gnu.ac.kr/handle/sw.gnu/80791 | - |
| dc.description.abstract | Voice phishing has emerged as a critical security threat, exploiting both linguistic manipulation and advances in synthetic speech technologies. Traditional keyword-based approaches often fail to capture contextual patterns or detect forged audio, limiting their effectiveness in real-world scenarios. To address this gap, we propose a multimodal voice phishing detection system that integrates text and audio analysis. The text module employs a KoBERT-based transformer classifier with self-attention interpretation, while the audio module leverages MFCC features and a CNN-BiLSTM classifier to identify synthetic speech. A fusion mechanism combines the outputs of both modalities, with experiments conducted on real-world call transcripts, phishing datasets, and synthetic voice corpora. The results demonstrate that the proposed system consistently achieves high accuracy, precision, recall, and F1-score on validation data while maintaining robust performance in noisy and diverse real-call scenarios. Furthermore, attention-based interpretability enhances trustworthiness by revealing cross-token and discourse-level interaction patterns specific to phishing contexts. These findings highlight the potential of the proposed system as a reliable, explainable, and deployable solution for preventing the financial and social damage caused by voice phishing. Unlike prior studies limited to single-modality or shallow fusion, our work presents a fully integrated text-audio detection pipeline optimized for Korean real-world datasets and robust to noisy, multi-speaker conditions. | - |
| dc.language | English | - |
| dc.language.iso | ENG | - |
| dc.publisher | MDPI | - |
| dc.title | A Multimodal Voice Phishing Detection System Integrating Text and Audio Analysis | - |
| dc.type | Article | - |
| dc.publisher.location | Switzerland | - |
| dc.identifier.doi | 10.3390/app152011170 | - |
| dc.identifier.scopusid | 2-s2.0-105020236570 | - |
| dc.identifier.wosid | 001602491900001 | - |
| dc.identifier.bibliographicCitation | Applied Sciences-Basel, v.15, no.20 | - |
| dc.citation.title | Applied Sciences-Basel | - |
| dc.citation.volume | 15 | - |
| dc.citation.number | 20 | - |
| dc.type.docType | Article | - |
| dc.description.isOpenAccess | Y | - |
| dc.description.journalRegisteredClass | scie | - |
| dc.description.journalRegisteredClass | scopus | - |
| dc.relation.journalResearchArea | Chemistry | - |
| dc.relation.journalResearchArea | Engineering | - |
| dc.relation.journalResearchArea | Materials Science | - |
| dc.relation.journalResearchArea | Physics | - |
| dc.relation.journalWebOfScienceCategory | Chemistry, Multidisciplinary | - |
| dc.relation.journalWebOfScienceCategory | Engineering, Multidisciplinary | - |
| dc.relation.journalWebOfScienceCategory | Materials Science, Multidisciplinary | - |
| dc.relation.journalWebOfScienceCategory | Physics, Applied | - |
| dc.subject.keywordAuthor | voice phishing detection | - |
| dc.subject.keywordAuthor | multimodal learning | - |
| dc.subject.keywordAuthor | audio forgery analysis | - |
| dc.subject.keywordAuthor | transformer-based text classification | - |
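The abstract describes a fusion mechanism that combines the outputs of the text module (KoBERT-based classifier) and the audio module (MFCC features with a CNN-BiLSTM). The paper's exact fusion rule is not given in this record; the following is a minimal score-level (late) fusion sketch, where the weights, threshold, and function names are illustrative assumptions rather than the authors' implementation.

```python
# Hypothetical late fusion of per-modality phishing probabilities.
# Each module is assumed to emit a probability in [0, 1]; the fused
# score is a weighted average compared against a decision threshold.
# Weight and threshold values here are illustrative, not from the paper.

def fuse_scores(text_prob: float, audio_prob: float,
                w_text: float = 0.6, w_audio: float = 0.4,
                threshold: float = 0.5) -> bool:
    """Return True if the fused score flags the call as phishing."""
    fused = w_text * text_prob + w_audio * audio_prob
    return fused >= threshold

# Suspicious transcript but inconclusive audio still triggers a flag:
print(fuse_scores(0.9, 0.3))   # 0.66 >= 0.5 -> True
# Benign call on both modalities does not:
print(fuse_scores(0.1, 0.2))   # 0.14 <  0.5 -> False
```

A weighted average is only one option; probabilistic product rules or a learned fusion layer over concatenated features would be alternative designs consistent with the abstract's description.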
Items in ScholarWorks are protected by copyright, with all rights reserved, unless otherwise indicated.