Optimizing Multiple Pronunciation Dictionary Based on a Confusability Measure for Non-native Speech Recognition
- Original Title
- 타언어권 화자 음성인식을 위한 혼잡도에 기반한 다중발음사전의 최적화 기법
- Abstract
- This paper proposes a method for optimizing the multiple pronunciation dictionary used to model pronunciation variations in non-native speech. The method removes confusable pronunciation variants from the dictionary, which reduces both the dictionary size and the decoding time of automatic speech recognition (ASR). A confusability measure is first defined from the Levenshtein distance between two different pronunciation variants. The number of phonemes in each pronunciation variant is then incorporated into the measure to compensate for ASR errors caused by shorter words. The effect of the method on ASR performance is evaluated with Korean as the target language and Korean utterances spoken by native Chinese speakers as the non-native speech. In the experiments, an ASR system using the optimized multiple pronunciation dictionary achieves an average relative word error rate reduction of 6.25% and 11.67% less decoding time than a system using the multiple pronunciation dictionary without optimization.
- Author(s)
- 김민아; 이성로; 김홍국; 조성의; 이연우; 오유리
- Issued Date
- 2008-03
- Type
- Article
- URI
- https://scholar.gist.ac.kr/handle/local/17419
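The abstract describes a confusability measure built from the Levenshtein distance between pronunciation variants and adjusted by phoneme count, but it does not give the formula. The following is only a minimal sketch under stated assumptions: the length normalization (dividing the distance by the longer phoneme count), the rule that the first variant of each word is canonical and always kept, the threshold value, the function names, and the toy phoneme strings are all hypothetical and are not taken from the paper.

```python
from typing import Dict, List, Sequence, Tuple

Pron = Tuple[str, ...]  # one pronunciation variant as a phoneme sequence


def levenshtein(a: Sequence[str], b: Sequence[str]) -> int:
    """Edit distance between two phoneme sequences with unit costs."""
    prev = list(range(len(b) + 1))
    for i, pa in enumerate(a, start=1):
        curr = [i]
        for j, pb in enumerate(b, start=1):
            cost = 0 if pa == pb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def confusability(a: Sequence[str], b: Sequence[str]) -> float:
    """Assumed length-normalized score in [0, 1]; 1.0 means identical."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein(a, b) / longest


def prune_variants(lexicon: Dict[str, List[Pron]],
                   threshold: float) -> Dict[str, List[Pron]]:
    """Drop non-canonical variants that are too close to another word.

    Assumption: the first variant of each word is canonical and is kept.
    """
    pruned: Dict[str, List[Pron]] = {}
    for word, variants in lexicon.items():
        kept = [variants[0]]
        for variant in variants[1:]:
            # Highest confusability against any pronunciation of other words.
            worst = max(
                (confusability(variant, other)
                 for other_word, others in lexicon.items()
                 if other_word != word
                 for other in others),
                default=0.0,
            )
            if worst < threshold:
                kept.append(variant)
        pruned[word] = kept
    return pruned


# Toy lexicon with made-up phoneme strings, for illustration only.
lexicon = {
    "A": [("k", "a", "m"), ("k", "a", "n"), ("g", "a", "m")],
    "B": [("k", "a", "n")],
}
pruned = prune_variants(lexicon, threshold=0.8)
```

In this toy run, the variant of word A that coincides with the pronunciation of word B is removed, while the variant that differs from it in two of three phonemes is kept. A real system would tune the threshold against word error rate and decoding time.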