Variational U-Net-Based Reconstruction of Distorted Audio for Improved Speech Recognition

이진희; 부석준

Full metadata record

DC Field	Value	Language
dc.contributor.author	이진희	-
dc.contributor.author	부석준	-
dc.date.accessioned	2024-12-03T07:30:50Z	-
dc.date.available	2024-12-03T07:30:50Z	-
dc.date.issued	2024-11	-
dc.identifier.issn	2383-6318	-
dc.identifier.issn	2383-6326	-
dc.identifier.uri	https://scholarworks.gnu.ac.kr/handle/sw.gnu/74705	-
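dc.description.abstract	Distortion of audio signals caused by poor recording conditions and equipment hinders accurate speech recognition and high-quality audio analysis. Restoring distorted audio requires a sophisticated method that can handle the diversity, complexity, and unpredictability of the distortions. This paper proposes a Variational U-Net that combines the probabilistic modeling capability of variational inference with the Wave U-Net's strength in reconstruction that preserves spatial and temporal information. Through variational inference, the model efficiently captures the distribution of clean audio signals, which facilitates probabilistic generation of audio affected by diverse distortions. At the same time, the Wave U-Net structure is applied to the model to preserve meaningful fine-grained audio features, improving the quality of the restored signal. The proposed method was rigorously evaluated on the Audio MNIST dataset, which can reflect real-world audio distortion scenarios in a controlled setting. Its performance was compared fairly against four state-of-the-art deep learning algorithms, and the proposed method showed a significant improvement of +4.0 dB in SI-SDR over the other methods.	-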
dc.format.extent	9	-
dc.language	Korean	-
dc.language.iso	KOR	-
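dc.publisher	Korean Institute of Information Scientists and Engineers (한국정보과학회)	-
dc.title	Variational U-Net-Based Reconstruction of Distorted Audio for Improved Speech Recognition	-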
dc.title.alternative	Reconstructing Distorted Audio Signal based on Variational U-Net for Speech Enhancement	-
dc.type	Article	-
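dc.publisher.location	Republic of Korea	-
dc.identifier.bibliographicCitation	KIISE Transactions on Computing Practices (정보과학회 컴퓨팅의 실제 논문지), v.30, no.11, pp. 562-570	-
dc.citation.title	KIISE Transactions on Computing Practices (정보과학회 컴퓨팅의 실제 논문지)	-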
dc.citation.volume	30	-
dc.citation.number	11	-
dc.citation.startPage	562	-
dc.citation.endPage	570	-
dc.identifier.kciid	ART003136976	-
dc.description.isOpenAccess	N	-
dc.description.journalRegisteredClass	kci	-
dc.subject.keywordAuthor	wave U-Net	-
dc.subject.keywordAuthor	autoencoder	-
dc.subject.keywordAuthor	deep learning	-
dc.subject.keywordAuthor	speech enhancement	-
dc.subject.keywordAuthor	distortion restoration	-
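
The abstract reports its headline result in SI-SDR (scale-invariant signal-to-distortion ratio). The record does not say how the authors computed it, so the following is only a minimal sketch of the commonly used definition: the estimate is projected onto the clean reference, and the energy of that projection is compared with the energy of the residual. The function name `si_sdr`, the `eps` stabilizer, and the synthetic test signals are illustrative assumptions, not taken from the paper.

```python
import math
from typing import Sequence


def si_sdr(estimate: Sequence[float], reference: Sequence[float], eps: float = 1e-12) -> float:
    """Scale-invariant signal-to-distortion ratio in dB (higher is better).

    Hypothetical helper for illustration; not the authors' evaluation code.
    """
    if len(estimate) != len(reference) or len(reference) == 0:
        raise ValueError("signals must be non-empty and of equal length")
    # Remove the mean of each signal so a DC offset does not affect the score.
    mean_est = sum(estimate) / len(estimate)
    mean_ref = sum(reference) / len(reference)
    est = [x - mean_est for x in estimate]
    ref = [x - mean_ref for x in reference]
    # Project the estimate onto the reference to get the optimally scaled target.
    dot = sum(e * r for e, r in zip(est, ref))
    ref_energy = sum(r * r for r in ref)
    alpha = dot / (ref_energy + eps)
    target = [alpha * r for r in ref]
    # Whatever the scaled target does not explain is counted as distortion.
    residual = [e - t for e, t in zip(est, target)]
    target_energy = sum(t * t for t in target)
    residual_energy = sum(n * n for n in residual)
    return 10.0 * math.log10((target_energy + eps) / (residual_energy + eps))


# Synthetic example: a sine "clean" signal plus an orthogonal tone at 1/10 amplitude.
reference = [math.sin(2 * math.pi * 5 * n / 100) for n in range(100)]
estimate = [r + 0.1 * math.cos(2 * math.pi * 13 * n / 100) for n, r in enumerate(reference)]
print(round(si_sdr(estimate, reference), 2))  # 20.0
```

Because the target is rescaled before the comparison, multiplying the estimate by a constant gain leaves the score unchanged, which is why the metric suits restoration models whose output level may differ from the reference.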

Files in This Item: There are no files associated with this item.

Appears in Collections: ETC > Journal Articles

Related Researcher

Seok-Jun, Buu: College of IT Engineering (Division of Computer Engineering)

Gyeongsang National University Central Library, 501, Jinju-daero, Jinju-si, Gyeongsangnam-do, 52828, Republic of Korea, +82-55-772-0534
