Soft-masking based denoising and dereverberation for binaural speech separation in reverberant environments
- Abstract
- This paper proposes a soft-masking based denoising and dereverberation method for binaural speech separation, with the aim of improving speech recognition performance under reverberant conditions. For each time-frequency bin, the interaural time difference (ITD) is computed first, and the signal-to-noise ratio (SNR), defined as the ratio of target speech power to noise power, is then estimated from the ITD. A denoising mask is derived from the estimated SNR. A dereverberation mask is then obtained from an estimate of the direct-to-reverberant energy ratio (DRR). To estimate the DRR of the current frame, the reverberant power is computed by summing the exponentially down-weighted powers of previous frames. In reverberant and noisy environments, a binaural speech separation system using the proposed denoising and dereverberation masks outperforms a system using a conventional spatial and temporal mask (STM) in terms of speech recognition performance. © 2013 ICIC International.
- Author(s)
- Park, J.H.; Kim, Hong Kook
- Issued Date
- 2013-03
- Type
- Article
- URI
- https://scholar.gist.ac.kr/handle/local/15631
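The dereverberation step described in the abstract can be illustrated with a minimal sketch for a single frequency bin. The abstract only states that the reverberant power is a sum of exponentially down-weighted powers of previous frames and that a mask follows from the DRR estimate. Everything else here is an assumption made for illustration and is not taken from the paper: the function names, the `decay` value, the direct-power estimate (observed power minus reverberant power, floored at zero), and the Wiener-style mapping `ratio / (1 + ratio)` from an energy ratio to a soft mask.

```python
def reverberant_power(powers, decay=0.8):
    """Reverberant power per frame for one frequency bin.

    For frame t this returns sum_{k>=1} decay**k * powers[t-k], i.e. the
    exponentially down-weighted sum of the powers of previous frames.
    `decay` is a hypothetical parameter, not a value from the paper.
    """
    estimates = []
    acc = 0.0
    for p in powers:
        estimates.append(acc)      # uses previous frames only
        acc = decay * (acc + p)    # age the history by one frame
    return estimates


def soft_mask(ratio):
    """Assumed Wiener-style mapping from an energy ratio to a gain in [0, 1)."""
    return ratio / (1.0 + ratio)


def dereverb_mask(powers, decay=0.8, eps=1e-12):
    """Soft dereverberation mask per frame from an assumed DRR estimate."""
    masks = []
    for p, r in zip(powers, reverberant_power(powers, decay)):
        direct = max(p - r, 0.0)   # assumed direct-path power estimate
        drr = direct / (r + eps)   # direct-to-reverberant energy ratio
        masks.append(soft_mask(drr))
    return masks


if __name__ == "__main__":
    frame_powers = [1.0, 0.5, 0.0]
    print(reverberant_power(frame_powers, decay=0.5))
    print(dereverb_mask(frame_powers, decay=0.5))
```

Under the same assumption, `soft_mask` could also be applied to the ITD-based SNR estimate to obtain a denoising mask. How the paper maps the ITD to an SNR and how it combines the two masks is not stated in the abstract, so neither is sketched here.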
Items in Repository are protected by copyright, with all rights reserved, unless otherwise indicated.