
타언어권 화자 음성인식을 위한 혼잡도에 기반한 다중발음사전의 최적화 기법

Alternative Title
Optimizing Multiple Pronunciation Dictionary Based on a Confusability Measure for Non-native Speech Recognition
Abstract
In this paper, we propose a method for optimizing a multiple pronunciation dictionary used to model the pronunciation variations of non-native speech. The proposed method removes confusable pronunciation variants from the dictionary, which reduces the dictionary size and shortens the decoding time for automatic speech recognition (ASR). To this end, a confusability measure is first defined based on the Levenshtein distance between two different pronunciation variants. The number of phonemes in each pronunciation variant is then incorporated into the confusability measure to compensate for ASR errors caused by shorter words. We investigate the effect of the proposed method on ASR performance, with Korean as the target language and Korean utterances spoken by native Chinese speakers as the non-native speech. The experiments show that an ASR system using the multiple pronunciation dictionary optimized by the proposed method achieves a relative average word error rate reduction of 6.25% and 11.67% less ASR decoding time, compared with a system using a multiple pronunciation dictionary without the optimization.
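The abstract does not give the exact form of the confusability measure, so the following is only a minimal sketch of the general idea under stated assumptions: the Levenshtein distance between two phoneme sequences is divided by the phoneme count of the longer one (an assumed normalization), and a non-canonical variant is dropped when it falls within a threshold of a variant belonging to a different word (an assumed pruning rule). The names `normalized_distance`, `prune_dictionary`, the `threshold` parameter, and the toy lexicon are hypothetical and do not come from the paper.

```python
def levenshtein(a, b):
    """Edit distance between two sequences (insert, delete, substitute cost 1)."""
    prev = list(range(len(b) + 1))
    for i, pa in enumerate(a, start=1):
        curr = [i]
        for j, pb in enumerate(b, start=1):
            cost = 0 if pa == pb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def normalized_distance(a, b):
    """Assumed normalization: divide by the longer variant's phoneme count,
    so a one-phoneme difference weighs more for short words."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 0.0
    return levenshtein(a, b) / longest


def prune_dictionary(lexicon, threshold):
    """Hypothetical pruning rule, not the paper's exact procedure.

    lexicon maps each word to a list of phoneme tuples. The first entry is
    treated as the canonical pronunciation and is always kept. Any other
    variant is dropped if it lies closer than `threshold` to a variant of
    a different word.
    """
    pruned = {}
    for word, variants in lexicon.items():
        kept = [variants[0]]
        for variant in variants[1:]:
            confusable = any(
                normalized_distance(variant, other) < threshold
                for other_word, other_variants in lexicon.items()
                if other_word != word
                for other in other_variants
            )
            if not confusable:
                kept.append(variant)
        pruned[word] = kept
    return pruned


# Toy lexicon (illustrative only): the second variant of "gam" is
# identical to the canonical pronunciation of "gan".
toy_lexicon = {
    "gam": [("k", "a", "m"), ("k", "a", "n")],
    "gan": [("k", "a", "n")],
}
pruned_lexicon = prune_dictionary(toy_lexicon, threshold=0.2)
```

In this toy case the variant `("k", "a", "n")` of "gam" is removed because it coincides with the pronunciation of "gan", while both canonical entries remain. A smaller dictionary of this kind is what the abstract credits for the shorter decoding time.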
Author(s)
김민아; 이성로; 김홍국; 조성의; 이연우; 오유리
Issued Date
2008-03
Type
Article
URI
https://scholar.gist.ac.kr/handle/local/17419
Publisher
대한음성학회
Citation
말소리, v.1, no.65, pp.93 - 103
ISSN
1226-1173
Appears in Collections:
Department of Electrical Engineering and Computer Science > 1. Journal Articles
Access and License
  • Access type: Open
File List
  • No related files exist.
