Confusability measure based lexicon optimization for fast LVCSR decoding
- Abstract
- In this paper, we propose a lexicon optimization method based on a
confusability measure (CM) to reduce the decoding time of a large vocabulary
continuous speech recognition (LVCSR) system. When a lexicon is built
or expanded for unseen words by using grapheme-to-phoneme (G2P) conversion,
the lexicon size increases because G2P is generally realized by 1-to-N-best
mapping. The proposed method therefore prunes confusable words from the lexicon
using a CM defined as a linguistic distance between two phonemic sequences.
LVCSR experiments demonstrate that the proposed lexicon
optimization method achieves a relative real-time factor reduction of 23.13% on
a Wall Street Journal task, compared to the 1-to-4-best G2P-converted
lexicon approach.
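The pruning idea in the abstract can be illustrated with a minimal sketch. The abstract does not specify how the CM is computed, so the code below assumes a plain Levenshtein distance over phoneme symbols as a stand-in. The names `phoneme_edit_distance`, `prune_lexicon`, and `min_distance`, the toy lexicon, and the rule of always keeping the 1-best variant are hypothetical choices made for illustration, not the authors' method.

```python
from typing import Dict, List, Sequence, Tuple

Pron = Tuple[str, ...]  # one pronunciation, as a sequence of phoneme symbols


def phoneme_edit_distance(a: Sequence[str], b: Sequence[str]) -> int:
    """Levenshtein distance over phoneme symbols.

    Assumption: used here only as a stand-in for the paper's CM.
    """
    prev = list(range(len(b) + 1))
    for i, pa in enumerate(a, start=1):
        curr = [i]
        for j, pb in enumerate(b, start=1):
            cost = 0 if pa == pb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def prune_lexicon(lexicon: Dict[str, List[Pron]],
                  min_distance: int = 1) -> Dict[str, List[Pron]]:
    """Prune N-best G2P variants that are confusable with other words.

    Hypothetical rule: the 1-best variant of every word is always kept;
    a lower-ranked variant is dropped if it lies closer than
    `min_distance` to a pronunciation already kept for another word.
    """
    kept = {word: [variants[0]]
            for word, variants in lexicon.items() if variants}
    for word, variants in lexicon.items():
        for cand in variants[1:]:
            confusable = any(
                phoneme_edit_distance(cand, other) < min_distance
                for other_word, others in kept.items()
                if other_word != word
                for other in others
            )
            if not confusable:
                kept[word].append(cand)
    return kept


# Toy 1-to-2-best lexicon (invented entries, ordered by assumed G2P rank).
toy_lexicon = {
    "read": [("r", "iy", "d"), ("r", "eh", "d")],
    "red": [("r", "eh", "d")],
    "lead": [("l", "iy", "d"), ("l", "eh", "d")],
}
pruned = prune_lexicon(toy_lexicon, min_distance=1)
```

With `min_distance=1`, the second variant of "read" is dropped because it is identical to the kept pronunciation of "red", while both variants of "lead" survive. Raising the threshold prunes more aggressively, which shrinks the lexicon further at the risk of removing valid pronunciations.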
- Author(s)
- Nam Kyun Kim; Woo Kyung Seong; Hong Kook Kim
- Issued Date
- 2014-08
- Type
- Article
- DOI
- 10.14257/astl.2014.58.22
- URI
- https://scholar.gist.ac.kr/handle/local/15056
Items in Repository are protected by copyright, with all rights reserved, unless otherwise indicated.