Target exaggeration for deep learning-based speech enhancement

Author(s): Kim, Hansol; Shin, Jong Won

Type: Article

Citation: DIGITAL SIGNAL PROCESSING, v.116

Issued Date: 2021-09

Abstract: Deep learning has been actively utilized for speech enhancement. However, deep learning-based speech enhancement usually produces over-smoothed speech, resulting in speech distortion and degraded intelligibility. In this paper, we propose the exaggeration of the training target so that the dynamic range of the enhanced speech becomes more similar to that of the clean speech. Target exaggeration can be implemented in two ways. The first approach is to exaggerate the target feature in the cost function of a deep learning-based speech enhancement system. This method can be implemented without additional parameters or computation, but can only be applied to schemes working in the time-frequency domain with the mean-square error cost function. The second approach is to introduce an additional deep neural network (DNN) that estimates the residual error in the output of a deep learning-based speech enhancement system. This requires more computation, but can be applied even to time-domain approaches. To evaluate the performance of the proposed target exaggeration, it is applied to feed-forward DNN- and long short-term memory (LSTM)-based speech enhancement schemes in the time-frequency domain, and to the convolutional time-domain audio separation network (Conv-TasNet)-based speech enhancement scheme in the time domain. Experimental results showed that the proposed method improved the quality of speech produced by the deep learning-based speech enhancement system in terms of the perceptual evaluation of speech quality (PESQ) scores and outperformed other approaches to alleviating the over-smoothing problem, including global variance equalization and a perceptually optimized speech denoising autoencoder. (C) 2021 Elsevier Inc. All rights reserved.

Publisher: Elsevier Inc.

ISSN: 1051-2004

DOI: 10.1016/j.dsp.2021.103109

URI: https://scholar.gist.ac.kr/handle/local/11327

Appears in Collections: Department of Electrical Engineering and Computer Science > 1. Journal Articles
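
The first approach named in the abstract, exaggerating the target inside a mean-square error cost, can be illustrated with a minimal sketch. The abstract does not give the exaggeration formula, so everything specific here is an assumption made for illustration: the function names, the factor `alpha`, and the choice to stretch each target value away from the mean of the target sequence. It shows only the general idea of widening the target's dynamic range before computing the error, and should not be read as the authors' method.

```python
from statistics import fmean


def exaggerate_target(target, alpha=1.2):
    """Hypothetical exaggeration rule (an assumption, not from the paper):
    stretch each target value away from the sequence mean by a factor alpha.
    alpha = 1.0 leaves the target unchanged; alpha > 1.0 widens its range."""
    center = fmean(target)
    return [center + alpha * (t - center) for t in target]


def mse(estimate, target):
    """Plain mean-square error between two equal-length sequences."""
    return fmean((e - t) ** 2 for e, t in zip(estimate, target))


def exaggerated_mse(estimate, target, alpha=1.2):
    """MSE computed against the exaggerated target instead of the clean one.
    Only the target changes, so the network itself needs no extra parameters."""
    return mse(estimate, exaggerate_target(target, alpha))


# Toy example: one clean feature trajectory (e.g. a log-spectral bin over time).
clean = [0.0, 2.0, 4.0]
print(exaggerate_target(clean, alpha=1.5))  # [-1.0, 2.0, 5.0]
```

With `alpha` set to 1.0 the cost reduces to the ordinary mean-square error, which makes the sketch easy to compare against an unmodified baseline.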

Access: Open