Primitive Basis Learning for Alignment on Universal Representation in Unsupervised Machine Translation

Abstract
Neural machine translation depends on large-scale parallel corpora, which are costly to build because they require specialized expertise and are often nonexistent for low-resource languages. Unsupervised neural machine translation addresses this issue by leveraging monolingual corpora and aligning representations across languages. However, these methods still require an anchor to align the two languages. In this paper, we first propose the concept of a Primitive Basis Vector, based on the universal grammar assumption, and structure it to learn a universal representation that improves alignment between two languages during back-translation. Specifically, we propose the Primitive Basis Learning (PBL) framework, which consists of a Vector Quantization Primitive Dictionary (VQPD) and a Universal Structure Module. This framework encourages the model to learn language-agnostic representations, thereby reducing the semantic discrepancy between cross-lingual sentences. We empirically show that our proposed method outperforms existing approaches by improving cross-lingual alignment through structured universal representations during back-translation.
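The abstract's Vector Quantization Primitive Dictionary suggests a VQ-style codebook lookup, in which each encoder vector is snapped to its nearest learned "primitive" entry. The sketch below illustrates only that generic nearest-neighbor quantization step; the function name `vq_lookup` and all shapes are illustrative assumptions, not the thesis's actual VQPD implementation.

```python
import numpy as np

def vq_lookup(z, codebook):
    """Quantize encoder outputs to their nearest codebook entries.

    z:        (n, d) array of encoder vectors.
    codebook: (k, d) array of learned primitive basis vectors.
    Returns the quantized vectors and the chosen codebook indices.
    Illustrative sketch only; the thesis's VQPD details are not shown here.
    """
    # Squared Euclidean distance from every vector to every codebook entry,
    # computed via broadcasting: result has shape (n, k).
    dists = ((z[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=-1)
    idx = dists.argmin(axis=1)        # nearest primitive per input vector
    return codebook[idx], idx
```

In VQ-based models the discrete indices act as a shared, language-agnostic inventory, which is the property the abstract relies on for cross-lingual alignment.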
Author(s)
Hyunyoung Bae
Issued Date
2024
Type
Thesis
URI
https://scholar.gist.ac.kr/handle/local/19610
Alternative Author(s)
배현영
Department
AI Graduate School
Advisor
Kim, Kangil
Degree
Master
Appears in Collections:
Department of AI Convergence > 3. Theses(Master)
Access and License
  • Access type: Open
File List
  • No related files exist.

Items in Repository are protected by copyright, with all rights reserved, unless otherwise indicated.