Improving Back-Translation with Denoising Auto-Encoding
- Abstract
- The shift from recurrent neural network models to transformer models in neural machine translation has significantly boosted translation quality. However, most neural machine translation models require a large amount of parallel corpus data, which is difficult to acquire. To enhance translation quality, researchers have extensively explored data augmentation methods that leverage easier-to-obtain monolingual corpus data, including dual learning and back-translation. Back-translation, a widely used data augmentation technique in neural machine translation, creates synthetic parallel data by translating target-language monolingual data back into the source language. Neural machine translation models employing back-translation are typically trained on three types of data: (1) original data, (2) reference translations, and (3) translated data. Reference translations (generated by humans) and translated data (generated by a translation model) usually exhibit similar characteristics to each other but differ from original data. As a result, back-translation primarily improves translations for inputs that resemble reference translations or translated data; when the input is original data, its impact may be limited or even negative. To address this limitation, this dissertation aims to enhance the performance of back-translation on original-data inputs. The proposal is to combine back-translation with denoising auto-encoding so that the characteristics of the synthetic data resemble those of original data, improving the effectiveness of back-translation.
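As an illustration of the pipeline the abstract describes, the following minimal Python sketch pairs back-translation with a denoising step applied to the synthetic source side. The `reverse_model` interface, the `add_noise` word-drop/local-shuffle function, and the demo sentences are assumptions introduced here for illustration; they follow common denoising auto-encoding practice and are not necessarily the dissertation's exact formulation.

```python
# Sketch: back-translation with a denoising step on the synthetic source.
# `reverse_model` is a hypothetical target->source translation function;
# any trained NMT checkpoint could stand in for it.
import random

def add_noise(tokens, drop_prob=0.1, shuffle_window=3):
    """Corrupt a token sequence: randomly drop words, then locally shuffle.

    This word-drop + local-shuffle scheme is a common denoising
    auto-encoding corruption, assumed here for illustration.
    """
    kept = [t for t in tokens if random.random() > drop_prob] or tokens
    # Local shuffle: each token moves at most ~shuffle_window positions.
    keys = [i + random.uniform(0, shuffle_window) for i in range(len(kept))]
    return [t for _, t in sorted(zip(keys, kept))]

def back_translate(target_sentences, reverse_model):
    """Build synthetic (source, target) pairs from target-side monolingual data."""
    pairs = []
    for tgt in target_sentences:
        synthetic_src = reverse_model(tgt)            # target -> source
        noisy_src = " ".join(add_noise(synthetic_src.split()))
        pairs.append((noisy_src, tgt))                # train the forward model on these
    return pairs

if __name__ == "__main__":
    # Stub reverse model for demonstration only; a real system would
    # decode with a trained target->source translation model.
    demo = back_translate(["ein kleines Beispiel"],
                          reverse_model=lambda s: "a small example")
    print(demo)
```

The key design point the sketch makes concrete: the noise is applied only to the machine-generated source side, so the target side stays clean while the synthetic source distribution is pushed away from fluent model output and toward the noisier character of original data.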
- Author(s)
- Seokhyun Oh
- Issued Date
- 2024
- Type
- Thesis
- URI
- https://scholar.gist.ac.kr/handle/local/19397
- Access & License
Items in the repository are protected by copyright, with all rights reserved, unless otherwise indicated.