Improving LSTM CRFs using character-based compositions for Korean named entity recognition
- Abstract
- Standard approaches to named entity recognition (NER) rely on sequential labeling methods, such as conditional random fields (CRFs), which label each word in a sentence and then extract the labeled spans that correspond to named entities. With the widespread adoption of deep learning for sequential labeling tasks, state-of-the-art NER performance has been achieved with long short-term memory (LSTM) architectures using only basic features. In this paper, we address Korean NER and propose an extension of the bidirectional LSTM CRF that investigates character-based representations. Our extension uses both a ConvNet and an LSTM for the sequential modeling of characters, yielding a character-based LSTM-ConvNet hybrid representation. Using morphemes as the processing units of the bidirectional LSTM, we apply the proposed hybrid representation composed of morpheme vectors. Experimental results show that the proposed LSTM-ConvNet hybrid representation improves over each single representation on standard Korean NER tasks. © 2018 Elsevier Ltd. All rights reserved.
- Author(s)
- Na, Seung-Hoon; Kim, Hyun; Min, Jinwoo; Kim, Kangil
- Issued Date
- 2019-03
- Type
- Article
- DOI
- 10.1016/j.csl.2018.09.005
- URI
- https://scholar.gist.ac.kr/handle/local/8900