
Abstract: In this paper, we propose a soft-masking based denoising and dereverberation method for binaural speech separation in order to improve the performance of speech recognition under reverberant conditions. For each time-frequency bin, the interaural time difference (ITD) is first computed, and then the signal-to-noise ratio (SNR) is estimated as the ratio of the powers of the target speech and noise signals from the ITD. Next, a denoising mask is estimated from the estimated SNR. Subsequently, a dereverberation mask is also obtained according to an estimate of the direct-to-reverberant energy ratio (DRR). In particular, to estimate the DRR of the current frame, the reverberant power is computed by summing the exponentially down-weighted powers of previous frames. It is demonstrated here that a binaural speech separation system with the proposed denoising and dereverberation masks outperforms a system with a conventional spatial and temporal mask (STM) in reverberant and noisy environments, in terms of speech recognition performance. © 2013 ICIC International.
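The reverberant-power step described in the abstract can be illustrated with a minimal sketch for a single frequency bin. The abstract only states that the reverberant power is a sum of exponentially down-weighted powers of previous frames. Everything else here is an assumption made for illustration and is not taken from the paper: the function names, the `decay` value, the direct-power estimate (frame power minus reverberant power, floored at zero), and the Wiener-like mapping from DRR to a soft mask.

```python
def reverberant_power(frame_powers, decay=0.5):
    """Estimate reverberant power per frame for one frequency bin.

    R[t] = sum over k >= 1 of decay**k * P[t - k], computed recursively as
    R[t] = decay * (R[t - 1] + P[t - 1]). The decay value is an assumption.
    """
    reverb = []
    running = 0.0
    previous = None
    for power in frame_powers:
        if previous is not None:
            running = decay * (running + previous)
        reverb.append(running)
        previous = power
    return reverb


def dereverberation_mask(frame_powers, decay=0.5, eps=1e-12):
    """Map an estimated DRR to a soft mask in [0, 1).

    Assumptions (not from the paper): direct power is the frame power
    minus the reverberant estimate, floored at zero, and the mask is
    the Wiener-like gain DRR / (1 + DRR).
    """
    masks = []
    for power, reverb in zip(frame_powers, reverberant_power(frame_powers, decay)):
        direct = max(power - reverb, 0.0)
        drr = direct / (reverb + eps)  # eps avoids division by zero
        masks.append(drr / (1.0 + drr))
    return masks


if __name__ == "__main__":
    powers = [4.0, 4.0, 1.0]
    print(reverberant_power(powers))
    print(dereverberation_mask(powers))
```

Under these assumptions, a frame that follows high-energy frames receives a small mask value, since most of its power is attributed to reverberation. A denoising mask could be derived from the ITD-based SNR estimate with the same ratio-to-gain mapping, though the paper's exact mask functions may differ.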

Appears in Collections: Department of Electrical Engineering and Computer Science > 1. Journal Articles

