Detailed Information

Cited 0 times in Web of Science · Cited 1 time in Scopus

Deep Generative Replay with Denoising Diffusion Probabilistic Models for Continual Learning in Audio Classification

Full metadata record
dc.contributor.author: Lee, Hyeon-Ju
dc.contributor.author: Buu, Seok-Jun
dc.date.accessioned: 2024-12-03T05:30:40Z
dc.date.available: 2024-12-03T05:30:40Z
dc.date.issued: 2024-09
dc.identifier.issn: 2169-3536
dc.identifier.uri: https://scholarworks.gnu.ac.kr/handle/sw.gnu/74333
dc.description.abstract: Accurate classification of audio data is essential in various fields such as speech recognition, safety management, healthcare, security, and surveillance. However, existing deep learning classifiers typically require extensive pre-collected data and struggle to adapt to the emergence of new audio classes over time. To address these challenges, this paper proposes a continual learning method utilizing Diffusion-driven Generative Replay (DDGR). The proposed DDGR method continuously updates the model at each training stage with high-quality generated data from Denoising Diffusion Probabilistic Models (DDPM), preserving existing knowledge. Furthermore, by embedding disentangled representations through a triplet network, the model can effectively recognize new classes as they emerge. This approach overcomes the problem of catastrophic forgetting and effectively resolves the issue of data scalability in a continual learning setup. The proposed method achieved the highest AIA values of 95.45% and 72.99% on the Audio MNIST and ESC-50 datasets, respectively, compared to existing continual learning methods. Additionally, for Audio MNIST, it showed an IM of -0.01, an FWT of 0.27, an FM of 0.06, and a BWT of -0.06, indicating that it best preserves prior knowledge while learning new data most effectively. For ESC-50, it demonstrated an IM of -0.12, an FWT of 0.09, an FM of 0.17, and a BWT of -0.17. These results validate the efficacy of the DDGR method in maintaining prior knowledge while integrating new information and highlight the complementary role of the triplet network in enhancing feature representation. © 2013 IEEE.
dc.format.extent: 14
dc.language: English
dc.language.iso: ENG
dc.publisher: Institute of Electrical and Electronics Engineers Inc.
dc.title: Deep Generative Replay with Denoising Diffusion Probabilistic Models for Continual Learning in Audio Classification
dc.type: Article
dc.publisher.location: United States
dc.identifier.doi: 10.1109/ACCESS.2024.3459954
dc.identifier.scopusid: 2-s2.0-85204088440
dc.identifier.wosid: 001327334100001
dc.identifier.bibliographicCitation: IEEE Access, v.12, pp. 134714-134727
dc.citation.title: IEEE Access
dc.citation.volume: 12
dc.citation.startPage: 134714
dc.citation.endPage: 134727
dc.type.docType: Article
dc.description.isOpenAccess: Y
dc.description.journalRegisteredClass: scie
dc.description.journalRegisteredClass: scopus
dc.relation.journalResearchArea: Computer Science
dc.relation.journalResearchArea: Engineering
dc.relation.journalResearchArea: Telecommunications
dc.relation.journalWebOfScienceCategory: Computer Science, Information Systems
dc.relation.journalWebOfScienceCategory: Engineering, Electrical & Electronic
dc.relation.journalWebOfScienceCategory: Telecommunications
dc.subject.keywordAuthor: Audio classification
dc.subject.keywordAuthor: Continual learning
dc.subject.keywordAuthor: Denoising diffusion probabilistic model
dc.subject.keywordAuthor: Generative replay
dc.subject.keywordAuthor: Triplet network
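
The abstract above outlines DDGR's two main ingredients: replaying synthetic samples drawn from a DDPM trained on earlier classes, and shaping the classifier's embedding space with a triplet loss. The fragment below is a minimal, illustrative sketch of that training pattern, assuming PyTorch; EmbeddingNet, replay_generator, make_triplets, and all shapes and hyperparameters are hypothetical placeholders rather than the authors' implementation, and the DDPM's reverse-diffusion sampling is stubbed out with random tensors so the sketch stays self-contained.

```python
# Minimal sketch of diffusion-driven generative replay for class-incremental
# audio classification (hypothetical; not the paper's code).
import torch
import torch.nn as nn
import torch.nn.functional as F


class EmbeddingNet(nn.Module):
    """Toy embedding network over 2-D (e.g., log-mel) inputs; shapes are illustrative."""
    def __init__(self, embed_dim=64, num_classes=10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(4), nn.Flatten(),
            nn.Linear(16 * 4 * 4, embed_dim),
        )
        self.classifier = nn.Linear(embed_dim, num_classes)

    def forward(self, x):
        z = self.features(x)            # disentangled embedding used by the triplet loss
        return z, self.classifier(z)    # plus class logits for cross-entropy


def replay_generator(num_samples, old_classes, shape=(1, 32, 32)):
    """Hypothetical stand-in for sampling from a DDPM trained on earlier tasks.
    A real implementation would run the reverse diffusion chain; here random
    tensors keep the sketch runnable without a trained generator."""
    x = torch.randn(num_samples, *shape)
    y = torch.tensor(old_classes)[torch.randint(len(old_classes), (num_samples,))]
    return x, y


def make_triplets(z, y):
    """Naive in-batch triplet mining (illustrative, not the paper's scheme)."""
    anchors, positives, negatives = [], [], []
    for i in range(len(y)):
        same = (y == y[i]).nonzero().flatten()
        diff = (y != y[i]).nonzero().flatten()
        same = same[same != i]
        if len(same) and len(diff):
            anchors.append(z[i]); positives.append(z[same[0]]); negatives.append(z[diff[0]])
    if not anchors:
        return None, None, None
    return torch.stack(anchors), torch.stack(positives), torch.stack(negatives)


def train_task(model, new_loader, old_classes, epochs=1, replay_ratio=1.0):
    """One continual-learning stage: mix generated replay data for old classes
    into every batch of new-task data, then optimize cross-entropy plus a
    triplet loss on the embeddings."""
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)
    triplet = nn.TripletMarginLoss(margin=1.0)
    for _ in range(epochs):
        for x_new, y_new in new_loader:
            if old_classes:
                n_replay = int(replay_ratio * x_new.size(0))
                x_old, y_old = replay_generator(n_replay, old_classes, x_new.shape[1:])
                x = torch.cat([x_new, x_old]); y = torch.cat([y_new, y_old])
            else:
                x, y = x_new, y_new
            z, logits = model(x)
            loss = F.cross_entropy(logits, y)
            anc, pos, neg = make_triplets(z, y)
            if anc is not None:
                loss = loss + triplet(anc, pos, neg)
            opt.zero_grad(); loss.backward(); opt.step()
```

The point of the sketch is the data flow of generative replay: at each stage, generated samples standing in for previously learned classes are mixed into every batch of new-class data, so the classifier is never optimized on new classes alone and prior knowledge is rehearsed without storing the original audio.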
Files in This Item
There are no files associated with this item.
Appears in Collections: ETC > Journal Articles


Items in ScholarWorks are protected by copyright, with all rights reserved, unless otherwise indicated.

Related Researcher

Seok-Jun, Buu
College of IT Engineering (Department of Computer Engineering)
