OAK

HMM-Based mask estimation for a speech recognition front-end using computational auditory scene analysis

Abstract
In this paper, we propose a new mask estimation method for the computational auditory scene analysis (CASA) of speech using two microphones. The proposed method is based on a hidden Markov model (HMM) so that it can exploit the observation that mask information is correlated over contiguous analysis frames. Specifically, an HMM is used to estimate the mask information, represented as the interaural time difference (ITD) and the interaural level difference (ILD) of the two-channel signals, and the estimated mask information is then used to separate the desired speech from noisy speech. To show the effectiveness of the proposed mask estimation, we compare the speech recognition performance of the proposed method with that of a Gaussian kernel-based estimation method. The proposed HMM-based mask estimation method provided an average word error rate reduction of 61.4% compared with the Gaussian kernel-based mask estimation method.
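The idea that mask decisions should be correlated across contiguous frames can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give the model topology, the emission model, or any parameter values. The sketch assumes a two-state HMM for one frequency channel (state 0 = noise-dominant, state 1 = target-dominant), assumes the per-frame log-likelihoods of the ITD/ILD observations are already available, and uses a hypothetical function name `viterbi_mask` and a hypothetical self-transition probability `stay_prob`.

```python
import math


def viterbi_mask(log_lik, stay_prob=0.9):
    """Decode a binary mask sequence for one frequency channel.

    log_lik[t][s] is the log-likelihood of the frame-t observation under
    state s (0 = noise-dominant, 1 = target-dominant). These values are
    assumed inputs; the paper's actual ITD/ILD emission model is not
    reproduced here. stay_prob is a hypothetical self-transition
    probability that encodes the frame-to-frame correlation.
    """
    states = range(2)
    log_trans = [
        [math.log(stay_prob if i == j else 1.0 - stay_prob) for j in states]
        for i in states
    ]
    # Uniform prior over the two states (an assumption).
    delta = [math.log(0.5) + log_lik[0][s] for s in states]
    back = []  # back[k][j]: best state at frame k given state j at frame k+1

    for frame in log_lik[1:]:
        prev = delta
        pointers, delta = [], []
        for j in states:
            best_i = max(states, key=lambda i: prev[i] + log_trans[i][j])
            pointers.append(best_i)
            delta.append(prev[best_i] + log_trans[best_i][j] + frame[j])
        back.append(pointers)

    # Backtrace from the most likely final state.
    state = max(states, key=lambda s: delta[s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path


# Toy example: frame 2 weakly favors "noise" while its neighbors favor "target".
probs = [(0.2, 0.8), (0.2, 0.8), (0.55, 0.45), (0.2, 0.8), (0.2, 0.8)]
log_lik = [[math.log(p) for p in frame] for frame in probs]

framewise = [max(range(2), key=lambda s: f[s]) for f in log_lik]
smoothed = viterbi_mask(log_lik)
print(framewise)
print(smoothed)
```

In this toy input, an independent per-frame decision labels the middle frame as noise-dominant, whereas the Viterbi path keeps it target-dominant because switching states twice costs more than the weak evidence gains.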
Author(s)
Park, Ji Hun; Yoon, Jae Sam; Kim, Hong Kook
Issued Date
2008-09
Type
Article
DOI
10.1093/ietisy/e91-d.9.2360
URI
https://scholar.gist.ac.kr/handle/local/17285
Publisher
IEICE-INST ELECTRONICS INFORMATION COMMUNICATIONS ENG
Citation
IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, v.E91D, no.9, pp.2360 - 2364
ISSN
0916-8532
Appears in Collections:
Department of Electrical Engineering and Computer Science > 1. Journal Articles
Access and License
  • Access type: Open
File List
  • No related files exist.
