GMM-based dual-channel voice activity detection with a correction scheme

Abstract
This paper proposes a voice activity detection (VAD) method for a dual-channel environment, based on Gaussian mixture models (GMMs) built from spatial cues and the logarithmic root-mean-squared energy. Each GMM is constructed according to the direction of arrival, so that speech intervals can be detected under the assumption that the target speech is located in front of the dual-channel microphones. In addition, to reduce VAD errors, especially in unvoiced intervals of the target speech, a VAD correction scheme is incorporated that uses the ratio between the energies of the high- and low-frequency bands (HILO). To evaluate the proposed method, false rejection rates and false alarm rates are measured by comparing its VAD results with those of manual segmentation. The results show that the proposed GMM-based VAD method with HILO-based correction outperforms both a Gaussian kernel density-based VAD method and a GMM-based VAD method without VAD correction. © 2012 ICIC International.
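The evaluation described in the abstract, comparing VAD decisions against manual segmentation, can be illustrated with a minimal sketch. It assumes frame-level binary labels (1 = speech, 0 = non-speech) and the common definitions of the two rates: false rejection rate as the fraction of reference speech frames labelled non-speech, and false alarm rate as the fraction of reference non-speech frames labelled speech. The abstract does not say whether the paper counts errors per frame or per segment, so the function name `vad_error_rates`, the label encoding, and the toy data are all assumptions made for illustration.

```python
def vad_error_rates(reference, predicted):
    """Return (false_rejection_rate, false_alarm_rate) from per-frame labels.

    reference: manual segmentation labels, 1 = speech, 0 = non-speech
    predicted: VAD decisions for the same frames
    Both the frame-level granularity and the label encoding are assumptions.
    """
    if len(reference) != len(predicted):
        raise ValueError("label sequences must have the same length")

    speech = sum(1 for r in reference if r)
    nonspeech = len(reference) - speech

    # Speech frames that the detector labelled as non-speech.
    missed = sum(1 for r, p in zip(reference, predicted) if r and not p)
    # Non-speech frames that the detector labelled as speech.
    false_alarms = sum(1 for r, p in zip(reference, predicted) if not r and p)

    # Guard against an empty class to avoid division by zero.
    frr = missed / speech if speech else 0.0
    far = false_alarms / nonspeech if nonspeech else 0.0
    return frr, far


# Toy example (hypothetical labels, not data from the paper).
reference = [0, 0, 1, 1, 1, 1, 0, 0]
predicted = [0, 1, 1, 1, 0, 1, 0, 0]
frr, far = vad_error_rates(reference, predicted)
print(f"FRR = {frr:.2f}, FAR = {far:.2f}")  # FRR = 0.25, FAR = 0.25
```

In the toy data, one of four speech frames is missed and one of four non-speech frames is falsely accepted, so both rates are 0.25. A correction scheme aimed at unvoiced intervals would be expected to act mainly on the first number, since unvoiced speech frames are the ones most likely to be rejected.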
Author(s)
Park, J.H.; Kim, Hong Kook
Issued Date
2012-02
Type
Article
URI
https://scholar.gist.ac.kr/handle/local/16038
Publisher
ICIC Express Letters Office
Citation
ICIC Express Letters, v.6, no.2, pp.371 - 376
ISSN
1881-803X
Appears in Collections:
Department of Electrical Engineering and Computer Science > 1. Journal Articles
Access and License
  • Access type: Open
File List
  • No related files exist.

Items in Repository are protected by copyright, with all rights reserved, unless otherwise indicated.