Generalized Eigenvalue Beamforming Using Covariance Matrices Estimated by Complex Multi-Task U-Net
- Author(s)
- Tae Woo Kim
- Type
- Thesis
- Degree
- Master
- Department
- Graduate School, School of Electrical Engineering and Computer Science
- Advisor
- Kim, Hong Kook
- Abstract
- Acoustic beamforming is a signal processing technique that focuses on a sound source using a multi-channel microphone array, and it is effective in suppressing background noise, interference, and reverberation. However, beamforming systems based on the direction of arrival (DoA) have the disadvantage that their design parameters can only be used with a known microphone array configuration. As an alternative, generalized eigenvalue (GEV) beamforming has been used, which requires only the spatial covariance matrices of speech and noise and no spatial information. Recently, many studies have separated speech and noise by estimating masks with a deep neural network. However, since these studies investigated only real-valued masks, errors may occur in the estimated spatial covariance matrices.
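As a rough illustration of how GEV beamforming can be built from nothing but speech and noise spatial covariance matrices, the following minimal sketch estimates mask-weighted covariances and takes the principal generalized eigenvector per frequency bin. The function names, array layouts, real-valued masks, and diagonal loading constant are assumptions made for this example and are not taken from the thesis.

```python
import numpy as np
from scipy.linalg import eigh


def spatial_covariance(stft, mask):
    """Mask-weighted spatial covariance for each frequency bin.

    stft: complex array, shape (channels, freqs, frames)
    mask: real array in [0, 1], shape (freqs, frames)  (assumed layout)
    returns: complex array, shape (freqs, channels, channels)
    """
    weighted = stft * mask[None, :, :]
    # Sum over frames of mask(f, t) * x(f, t) x(f, t)^H
    cov = np.einsum("cft,dft->fcd", weighted, stft.conj())
    norm = np.maximum(mask.sum(axis=-1), 1e-10)
    return cov / norm[:, None, None]


def gev_weights(cov_speech, cov_noise, loading=1e-6):
    """Principal generalized eigenvector of (cov_speech, cov_noise) per bin."""
    n_freqs, n_ch, _ = cov_speech.shape
    weights = np.empty((n_freqs, n_ch), dtype=complex)
    for f in range(n_freqs):
        # Small diagonal loading (an assumption) keeps the noise matrix
        # positive definite, as the generalized solver requires.
        scale = np.trace(cov_noise[f]).real / n_ch
        noise = cov_noise[f] + loading * scale * np.eye(n_ch)
        # eigh returns eigenvalues in ascending order, so the last column
        # maximizes the ratio w^H cov_speech w / w^H noise w.
        _, vecs = eigh(cov_speech[f], noise)
        weights[f] = vecs[:, -1]
    return weights


def apply_beamformer(weights, stft):
    """Beamformer output y(f, t) = w(f)^H x(f, t)."""
    return np.einsum("fc,cft->ft", weights.conj(), stft)
```

With a speech mask `m` and a noise mask `1 - m`, one possible call sequence is `apply_beamformer(gev_weights(spatial_covariance(x, m), spatial_covariance(x, 1 - m)), x)`. No microphone geometry enters the computation, which is the property the paragraph above relies on.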
In this dissertation, GEV beamforming using covariance matrices estimated by a complex multi-task U-Net (CMTU-Net) is proposed. The CMTU-Net is a phase-aware speech separation technique that extends MTU-Net, a magnitude separation technique. In addition, we present an attempt to reduce the parameter estimation error using a convolutional neural network (CNN).
In the performance evaluation, the proposed methods are evaluated on two datasets and compared with conventional mask-based and MTU-Net-based GEV beamforming. In simulated environments, the CMTU-Net-based GEV beamforming outperformed the conventional one on average in terms of scale-invariant source-to-distortion ratio (SI-SDR), perceptual evaluation of speech quality (PESQ), cepstral distance (CD), and frequency-weighted segmental signal-to-noise ratio (fwSegSNR).
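For readers unfamiliar with the first metric listed above, the following minimal sketch computes SI-SDR for a single-channel time-domain signal using the commonly used projection-based definition. The function name `si_sdr`, the mean removal step, and the `eps` constant are assumptions for this example; the exact evaluation setup of the thesis is not described in this record.

```python
import numpy as np


def si_sdr(estimate, reference, eps=1e-8):
    """Scale-invariant SDR in dB between two 1-D signals of equal length."""
    estimate = np.asarray(estimate, dtype=float)
    reference = np.asarray(reference, dtype=float)
    # Zero-mean both signals (a common convention, assumed here).
    estimate = estimate - estimate.mean()
    reference = reference - reference.mean()
    # Project the estimate onto the reference to get the scaled target.
    alpha = np.dot(estimate, reference) / (np.dot(reference, reference) + eps)
    target = alpha * reference
    residual = estimate - target
    ratio = (np.dot(target, target) + eps) / (np.dot(residual, residual) + eps)
    return 10.0 * np.log10(ratio)
```

Because the reference is rescaled by the optimal factor `alpha` before the comparison, multiplying the estimate by any nonzero gain leaves the score essentially unchanged, which is what makes the metric scale-invariant.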
- URI
- https://scholar.gist.ac.kr/handle/local/32889
- Fulltext
- http://gist.dcollection.net/common/orgView/200000908608
Items in Repository are protected by copyright, with all rights reserved, unless otherwise indicated.