
Improving LSTM CRFs using character-based compositions for Korean named entity recognition

Abstract
Standard approaches to named entity recognition (NER) are based on sequential labeling methods, such as conditional random fields (CRFs), which label each word in a sentence and extract entities from them that correspond to named entities. With the extensive deployment of deep learning methods for sequential labeling tasks, state-of-the-art NER performance has been achieved on long short-term memory (LSTM) architectures using only basic features. In this paper, we address Korean NER tasks and propose an extension of a bidirectional LSTM CRF by investigating character-based representation. Our extension involves deploying a hybrid representation using ConvNet and LSTM for the sequential modeling of characters, namely a character-based LSTM-ConvNet hybrid representation. Using morphemes as processing units for bidirectional LSTM, we apply a proposed hybrid representation composed of morpheme vectors. Experimental results showed that the proposed LSTM-ConvNet hybrid representation yielded improvements over each single representation on standard Korean NER tasks. (C) 2018 Elsevier Ltd. All rights reserved.
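The sequential labeling step described in the abstract (label each unit, then extract the entities) can be illustrated with a minimal sketch. It assumes a BIO tagging scheme, which this record does not specify; the function name `extract_entities` and the sample sentence are hypothetical and are not taken from the paper. The tags would normally come from a tagger such as a bidirectional LSTM CRF; here they are written by hand.

```python
def extract_entities(tokens, tags):
    """Collect (type, start, end, text) spans from BIO tags; end is exclusive.

    Assumption: tags look like "B-PER", "I-PER", or "O". A stray "I-" tag
    with no open span of the same type is treated leniently as a new span.
    """
    entities = []
    start, etype = None, None
    # A sentinel "O" is appended so that a span ending the sentence is flushed.
    for i, tag in enumerate(list(tags) + ["O"]):
        continues_span = tag.startswith("I-") and etype == tag[2:]
        if start is not None and not continues_span:
            entities.append((etype, start, i, " ".join(tokens[start:i])))
            start, etype = None, None
        if tag.startswith("B-") or (tag.startswith("I-") and start is None):
            start, etype = i, tag[2:]
    return entities


# Hand-written example input (hypothetical, for illustration only).
tokens = ["Seung-Hoon", "Na", "works", "in", "Gwangju"]
tags = ["B-PER", "I-PER", "O", "O", "B-LOC"]
entities = extract_entities(tokens, tags)
```

For Korean, the units passed in as `tokens` would be morphemes rather than whitespace-separated words, in line with the abstract's use of morphemes as processing units.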
Author(s)
Na, Seung-Hoon; Kim, Hyun; Min, Jinwoo; Kim, Kangil
Issued Date
2019-03
Type
Article
DOI
10.1016/j.csl.2018.09.005
URI
https://scholar.gist.ac.kr/handle/local/8900
Publisher
Academic Press
Citation
Computer Speech and Language, v.54, pp.106 - 121
ISSN
0885-2308
Appears in Collections:
Department of AI Convergence > 1. Journal Articles
Access and License
  • Access type: Open
File List
  • No related files exist.

Items in Repository are protected by copyright, with all rights reserved, unless otherwise indicated.