Citation: 8th International Conference on Spoken Language Processing (ICSLP 2004), pp. 1645-1648

Abstract: The performance of large-vocabulary speech recognizers often varies with the input speech and the quality of the trained models. The particular attributes that cause recognition errors have not been well studied. This paper addresses the issue from a robustness perspective, using a large amount of field data collected from natural language dialog services. In particular, we present a method for tracking time-varying, or nonstationary, extraneous events such as music and background noise. We show that the resulting measure is a better predictor of recognition errors than a standard measure of stationary signal-to-noise ratio (SNR). Combining the two measures provides a data selection algorithm for detecting problematic speech.
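
The abstract does not describe how either measure is computed, so the following is only an illustrative sketch of the general idea, not the paper's method. It assumes a percentile-based SNR over frame energies as the "stationary" measure and the range of a windowed noise floor as a stand-in "nonstationary" measure. All function names, frame sizes, percentiles, and thresholds are hypothetical, and the test signals are synthetic.

```python
import math

def frame_energies_db(samples, frame_len=160):
    """Mean power of each non-overlapping frame, in dB."""
    energies = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        power = sum(x * x for x in frame) / frame_len
        energies.append(10.0 * math.log10(power + 1e-12))
    return energies

def percentile(values, q):
    """Nearest-rank percentile of a non-empty list, q in [0, 100]."""
    ordered = sorted(values)
    idx = int(round(q / 100.0 * (len(ordered) - 1)))
    return ordered[min(len(ordered) - 1, max(0, idx))]

def stationary_snr_db(energies_db):
    """One SNR value per utterance: loud frames minus quiet frames."""
    return percentile(energies_db, 95) - percentile(energies_db, 10)

def noise_floor_spread_db(energies_db, window=50):
    """Range of the local noise floor across consecutive windows.

    A large value means the background level changes over time
    (assumed proxy for nonstationary events such as music).
    """
    floors = [percentile(energies_db[i:i + window], 10)
              for i in range(0, len(energies_db) - window + 1, window)]
    return max(floors) - min(floors) if floors else 0.0

def is_problematic(samples, min_snr_db=15.0, max_spread_db=6.0):
    """Flag an utterance if either measure crosses its (assumed) threshold."""
    e = frame_energies_db(samples)
    return (stationary_snr_db(e) < min_snr_db
            or noise_floor_spread_db(e) > max_spread_db)

def make_signal(late_background_amp):
    """Synthetic utterance: 200 frames, every 5th frame is loud 'speech'.

    Frames in the second half use late_background_amp as background level.
    """
    samples = []
    for f in range(200):
        if f % 5 == 0:
            amp = 1.0
        elif f >= 100:
            amp = late_background_amp
        else:
            amp = 0.01
        samples.extend(amp if n % 2 == 0 else -amp for n in range(160))
    return samples

clean = make_signal(0.01)   # quiet background throughout
music = make_signal(0.3)    # background rises halfway through

for name, sig in (("clean", clean), ("music", music)):
    e = frame_energies_db(sig)
    print(name, round(stationary_snr_db(e), 1),
          round(noise_floor_spread_db(e), 1), is_problematic(sig))
```

On these two synthetic signals the percentile-based SNR is the same, because the quietest frames are identical, while the windowed noise floor differs sharply. That is the kind of case in which a single stationary SNR value could miss a time-varying background event.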

Appears in Collections: Department of Electrical Engineering and Computer Science > 2. Conference Papers
