
Dysarthric speech recognition error correction using weighted finite state transducers based on context-dependent pronunciation variation

Abstract
In this paper, we propose a dysarthric speech recognition error correction method based on weighted finite state transducers (WFSTs). First, the proposed method constructs a context-dependent (CD) confusion matrix by aligning each recognized word sequence with the corresponding reference sequence at the phoneme level. However, because the dysarthric speech database is too small to cover all combinations of context-dependent phonemes, the CD confusion matrix can be underestimated. To mitigate this underestimation problem, the CD confusion matrix is interpolated with a context-independent (CI) confusion matrix. Finally, WFSTs based on the interpolated CD confusion matrix are built and integrated with dictionary and language model transducers in order to correct speech recognition errors. The effectiveness of the proposed method is demonstrated through speech recognition experiments using the proposed error correction method with the CD confusion matrix. The experiments show that the average word error rate (WER) of a speech recognition system employing the proposed error correction method with the CD confusion matrix is relatively reduced by 13.68% and 5.93% compared to the baseline speech recognition system and to the error correction method with the CI confusion matrix, respectively. © 2012 Springer-Verlag.
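
As a rough illustration of the smoothing step described in the abstract, the Python sketch below builds confusion counts from phoneme-level alignments and interpolates the context-dependent estimates with context-independent ones. This is a minimal sketch, not the authors' implementation; the alignment tuple format, the function names, and the default interpolation weight lam are assumptions made for this example.

    from collections import defaultdict

    def build_confusion_counts(aligned_phones):
        # aligned_phones: iterable of (context, ref_phone, hyp_phone) tuples,
        # where context is a (left, right) phoneme pair for the CD case or
        # None for the CI case. This tuple layout is an assumption of the sketch.
        counts = defaultdict(lambda: defaultdict(int))
        for context, ref, hyp in aligned_phones:
            counts[(context, ref)][hyp] += 1
        return counts

    def to_probabilities(counts):
        # Normalize each row into conditional probabilities P(hyp | ref, context).
        probs = {}
        for key, row in counts.items():
            total = sum(row.values())
            probs[key] = {hyp: n / total for hyp, n in row.items()}
        return probs

    def interpolate_cd_with_ci(p_cd, p_ci, lam=0.7):
        # Smooth the sparse CD matrix with the CI matrix:
        #   P(hyp | ref, context) = lam * P_cd(hyp | ref, context) + (1 - lam) * P_ci(hyp | ref)
        # lam is a hypothetical interpolation weight; its value is not given here.
        smoothed = {}
        for (context, ref), row_cd in p_cd.items():
            row_ci = p_ci.get((None, ref), {})
            support = set(row_cd) | set(row_ci)
            smoothed[(context, ref)] = {
                hyp: lam * row_cd.get(hyp, 0.0) + (1 - lam) * row_ci.get(hyp, 0.0)
                for hyp in support
            }
        return smoothed

In a full system of this kind, the interpolated probabilities would typically become arc weights (for example, negative log probabilities) of a confusion transducer that is then composed with the dictionary and language model transducers, for instance using a WFST toolkit such as OpenFst.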
Author(s)
Seong, Woo Kyeong; Park, Ji Hun; Kim, Hong Kook
Issued Date
2012-07
Type
Conference Paper
DOI
10.1007/978-3-642-31534-3_70
URI
https://scholar.gist.ac.kr/handle/local/23707
Publisher
-
Citation
13th International Conference on Computers Helping People with Special Needs, ICCHP 2012, pp.475 - 482
Conference Place
Austria (AT)
Appears in Collections:
Department of Electrical Engineering and Computer Science > 2. Conference Papers
Access and License
  • Access type: Open
File List
  • No related files are available.
