Weighted Loss Function Utilizing SNR Information for Monaural Phase-Aware Speech Enhancement
- Author(s)
- Jungwon Park
- Type
- Thesis
- Degree
- Master
- Department
- Graduate School, AI Graduate School
- Advisor
- Shin, Jong Won
- Abstract
- Speech enhancement is a task that suppresses background noise to improve speech quality and clarity for robust automatic speech recognition (ASR). Deep learning-based approaches, which train a model from large amounts of data, have been extensively investigated in recent studies. To avoid the difficulty of accurately estimating the phase, recent studies have proposed estimating clean speech in the complex and time domains. Phase-aware speech enhancement techniques in the complex domain can be divided into two types: mapping-based [5, 8] and masking-based methods. In this study, we use two convolutional recurrent network (CRN)-based [6-8] models for monaural speech enhancement and propose a method that modifies the conventional loss function. In addition, we demonstrate that the proposed method, which assigns different weights based on the signal-to-noise ratio (SNR) value, performs better than the existing method.
- URI
- https://scholar.gist.ac.kr/handle/local/19895
- Fulltext
- http://gist.dcollection.net/common/orgView/200000883606
Items in Repository are protected by copyright, with all rights reserved, unless otherwise indicated.