Attention-based Language Model Fusion for End-to-End Speech Recognition

Author(s): Hyunju Park

Type: Thesis

Degree: Master

Department: Graduate School, AI Graduate School

Advisor: Kim, Hong Kook

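Abstract (translated from Korean): This thesis proposes an attention-based language model (LM) fusion method for deep neural network automatic speech recognition (ASR), aimed at domain adaptation. The proposed method consists of an attention layer that takes LM and ASR features as input, and a self-attention layer. The features of the attention module, the self-attention layer, the LM, and the ASR are concatenated and fed into a gating mechanism (the interpolation layer). The outputs of the interpolation layer and the ASR are then merged, followed by a projection layer and a softmax.
An objective evaluation is carried out to assess the proposed model. The objective metric used for ASR is word error rate (WER), the common metric for ASR tasks. WER is computed from substitutions, deletions, insertions, and corrections. Our model achieved a WER 1.7% lower than the existing method. This shows that the proposed attention module facilitates domain adaptation.
Abstract (English): In this paper, we propose a deep neural network attention-based language model (LM) fusion for automatic speech recognition (ASR). The proposed method consists of an attention layer that takes LM and ASR features as inputs and a self-attention layer of the language model. The features of the attention module, LM, self-attention, and ASR are then concatenated as input to a gating mechanism (the interpolation layer). Lastly, the outputs of the interpolation layer and the ASR are merged, followed by a projection layer with a softmax for decoding.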
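As a rough illustration of the data flow described in the paragraph above, the following NumPy sketch runs a single decoding step. Only the order of operations comes from the abstract. Everything else is an assumption made for illustration: the names `attend`, `fuse_step`, `W_g`, and `W_o`, the use of scaled dot-product attention, the sigmoid gate, the way the gate interpolates between the two attention contexts, merging by concatenation, and the random weights. The thesis itself may define these layers differently.

```python
import numpy as np


def softmax(x):
    """Numerically stable softmax over a 1-D vector."""
    e = np.exp(x - np.max(x))
    return e / np.sum(e)


def attend(query, states):
    """Scaled dot-product attention of one query (d,) over rows of states (T, d)."""
    scores = states @ query / np.sqrt(query.shape[0])
    return softmax(scores) @ states


def fuse_step(h_asr, lm_states, params):
    """One hypothetical fusion step: returns a distribution over the vocabulary."""
    h_lm = lm_states[-1]                  # latest LM hidden state
    c_cross = attend(h_asr, lm_states)    # attention layer: ASR state queries LM states
    c_self = attend(h_lm, lm_states)      # self-attention over the LM states
    # Gating mechanism (interpolation layer) fed with all four features.
    gate_in = np.concatenate([c_cross, h_lm, c_self, h_asr])
    gate = 1.0 / (1.0 + np.exp(-(params["W_g"] @ gate_in + params["b_g"])))
    # Assumed form of the interpolation: a gated mix of the two contexts.
    interp = gate * c_cross + (1.0 - gate) * c_self
    # Merge with the ASR feature, then project and apply softmax.
    merged = np.concatenate([interp, h_asr])
    return softmax(params["W_o"] @ merged + params["b_o"])


# Toy dimensions and random weights, for shape checking only.
rng = np.random.default_rng(0)
d, T, vocab = 8, 5, 20
params = {
    "W_g": rng.normal(scale=0.1, size=(d, 4 * d)),
    "b_g": np.zeros(d),
    "W_o": rng.normal(scale=0.1, size=(vocab, 2 * d)),
    "b_o": np.zeros(vocab),
}
probs = fuse_step(rng.normal(size=d), rng.normal(size=(T, d)), params)
print(probs.shape)
```

In a trained system the weights would be learned jointly with the ASR decoder, and `probs` would drive beam search or greedy decoding.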
To evaluate the performance of the proposed model, an objective evaluation is performed. The objective metric used for ASR is word error rate (WER), the common metric for ASR tasks. WER is calculated from Substitution (S), Deletion (D), Insertion (I), and Correction (C) counts. Our proposed model achieves a WER 0.5% and 0.4% lower than that of the conventional fusion method in the out-of-domain and in-domain scenarios, respectively. This proves that our proposed attention-based module facilitates
performance of mode
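The WER computation mentioned in the abstract can be sketched as follows. This is a minimal illustration, not the scoring tool used in the thesis. It assumes the standard definition WER = (S + D + I) / (S + D + C), reads C as the number of correctly recognized words, and splits on whitespace without any text normalization. The function names `wer_counts` and `wer` are hypothetical.

```python
def wer_counts(reference, hypothesis):
    """Return (S, D, I, C) for a minimum-edit word alignment of two strings."""
    ref, hyp = reference.split(), hypothesis.split()
    # dp[i][j] = (cost, S, D, I, C) for aligning ref[:i] with hyp[:j]
    dp = [[None] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    dp[0][0] = (0, 0, 0, 0, 0)
    for i in range(1, len(ref) + 1):
        dp[i][0] = (i, 0, i, 0, 0)   # only deletions
    for j in range(1, len(hyp) + 1):
        dp[0][j] = (j, 0, 0, j, 0)   # only insertions
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost, s, d, ins, c = dp[i - 1][j - 1]
            if ref[i - 1] == hyp[j - 1]:
                diag = (cost, s, d, ins, c + 1)       # correct word
            else:
                diag = (cost + 1, s + 1, d, ins, c)   # substitution
            cost, s, d, ins, c = dp[i - 1][j]
            delete = (cost + 1, s, d + 1, ins, c)
            cost, s, d, ins, c = dp[i][j - 1]
            insert = (cost + 1, s, d, ins + 1, c)
            dp[i][j] = min(diag, delete, insert, key=lambda t: t[0])
    return dp[-1][-1][1:]


def wer(reference, hypothesis):
    """Word error rate: (S + D + I) / (S + D + C)."""
    s, d, ins, c = wer_counts(reference, hypothesis)
    if s + d + c == 0:
        raise ValueError("reference must contain at least one word")
    return (s + d + ins) / (s + d + c)


print(wer_counts("a b c d", "a x c"))  # (1, 1, 0, 2)
print(wer("a b c d", "a x c"))         # 0.5
```

When several alignments share the minimum cost, the individual S, D, and I counts can differ between them, but the WER value does not, because S + D + C always equals the number of reference words.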

URI: https://scholar.gist.ac.kr/handle/local/18951

Fulltext: http://gist.dcollection.net/common/orgView/200000883570

Alternative Author(s): 박현주

Appears in Collections: Department of AI Convergence > 3. Theses(Master)

Access and License

Access: Open
