Detailed Information

Deep Generative Replay with Denoising Diffusion Probabilistic Models for Continual Learning in Audio Classification (open access)

Authors
Lee, Hyeon-Ju; Buu, Seok-Jun
Issue Date
Sep-2024
Publisher
Institute of Electrical and Electronics Engineers Inc.
Keywords
Audio classification; Continual learning; Denoising diffusion probabilistic model; Generative replay; Triplet network
Citation
IEEE Access, v.12, pp 134714 - 134727
Pages
14
Indexed
SCIE
SCOPUS
Journal Title
IEEE Access
Volume
12
Start Page
134714
End Page
134727
URI
https://scholarworks.gnu.ac.kr/handle/sw.gnu/74333
DOI
10.1109/ACCESS.2024.3459954
ISSN
2169-3536
Abstract
Accurate classification of audio data is essential in various fields such as speech recognition, safety management, healthcare, security, and surveillance. However, existing deep learning classifiers typically require extensive pre-collected data and struggle to adapt to the emergence of new audio classes over time. To address these challenges, this paper proposes a continual learning method utilizing Diffusion-driven Generative Replay (DDGR). The proposed DDGR method continuously updates the model at each training stage with high-quality generated data from Denoising Diffusion Probabilistic Models (DDPM), preserving existing knowledge. Furthermore, by embedding disentangled representations through a triplet network, the model can effectively recognize new classes as they emerge. This approach overcomes the problem of catastrophic forgetting and effectively resolves the issue of data scalability in a continual learning setup. The proposed method achieved the highest AIA values of 95.45% and 72.99% on the Audio MNIST and ESC-50 datasets, respectively, compared to existing continual learning methods. Additionally, for Audio MNIST, it showed an IM of -0.01, FWT of 0.27, FM of 0.06, and BWT of -0.06, indicating that it best preserves prior knowledge while learning new data most effectively. For ESC-50, it demonstrated an IM of -0.12, FWT of 0.09, FM of 0.17, and BWT of -0.17. These results validate the efficacy of the DDGR method in maintaining prior knowledge while integrating new information and highlight the complementary role of the triplet network in enhancing feature representation. © 2013 IEEE.
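The abstract reports AIA, FM, and BWT without defining them. As a rough illustration only, the sketch below computes these three metrics from an accuracy matrix using definitions that are common in the continual learning literature; the paper's exact formulas may differ. The function names, the `acc[i][j]` layout (accuracy on task `j` after training stage `i`), and the toy numbers are all assumptions made for this example and are not taken from the paper. IM and FWT are omitted because they need reference or baseline accuracies that the abstract does not describe.

```python
# Hypothetical helpers; acc[i][j] = accuracy on task j after training stage i.

def average_incremental_accuracy(acc):
    """Mean, over stages, of the average accuracy on all tasks seen so far."""
    n = len(acc)
    per_stage = [sum(acc[i][: i + 1]) / (i + 1) for i in range(n)]
    return sum(per_stage) / n


def backward_transfer(acc):
    """Average change on earlier tasks between first learning them and the end.

    Negative values mean earlier tasks got worse (forgetting).
    """
    n = len(acc)
    return sum(acc[n - 1][j] - acc[j][j] for j in range(n - 1)) / (n - 1)


def forgetting_measure(acc):
    """Average gap between the best past accuracy on a task and its final accuracy.

    Assumption: the best past accuracy is taken from the stage where the task
    was learned up to the second-to-last stage.
    """
    n = len(acc)
    drops = []
    for j in range(n - 1):
        best = max(acc[i][j] for i in range(j, n - 1))
        drops.append(best - acc[n - 1][j])
    return sum(drops) / (n - 1)


# Toy 3-stage accuracy matrix (made-up values, not results from the paper).
acc = [
    [0.90, 0.00, 0.00],
    [0.80, 0.90, 0.00],
    [0.70, 0.80, 0.90],
]

print(f"AIA = {average_incremental_accuracy(acc):.2f}")  # AIA = 0.85
print(f"BWT = {backward_transfer(acc):.2f}")             # BWT = -0.15
print(f"FM  = {forgetting_measure(acc):.2f}")            # FM  = 0.15
```

In this toy matrix the accuracy on each old task only decreases over time, so FM equals the negative of BWT. The values reported in the abstract show the same pattern (FM 0.06 with BWT -0.06, and FM 0.17 with BWT -0.17).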
Files in This Item
There are no files associated with this item.
Appears in Collections
ETC > Journal Articles

Items in ScholarWorks are protected by copyright, with all rights reserved, unless otherwise indicated.

Related Researcher

Seok-Jun, Buu
College of IT Engineering (Division of Computer Engineering)