Contrastive representation learning of inorganic materials to overcome lack of training datasets

Abstract
Data representation defines the feature space in which the data are distributed, and it is one of the key factors determining the prediction accuracy of machine learning (ML). In particular, data representation is crucial for handling small and biased training datasets, which are the main challenge of ML in chemical applications. In this paper, we propose a data-agnostic representation method that automatically and universally generates a vector-shaped, target-specified representation of crystal structures. With the materials representation produced by the proposed method, the prediction capabilities of ML algorithms improved markedly on small training datasets and in transfer learning tasks. Moreover, the prediction accuracies of ML algorithms improved by 28.89-30.87% in extrapolation problems, where the physical properties of materials in unknown material groups are predicted. The source code of EMRL is publicly available at https://github.com/ngs00/emrl/tree/master/EMRL.
Author(s)
Na, Gyoung S.; Kim, Hyun Woo
Issued Date
2022-06
Type
Article
DOI
10.1039/d2cc01764d
URI
https://scholar.gist.ac.kr/handle/local/10781
Publisher
NLM (Medline)
Citation
Chemical communications (Cambridge, England), v.58, no.47, pp.6729 - 6732
ISSN
1359-7345
Appears in Collections:
Department of Chemistry > 1. Journal Articles
Access and License
  • Access type: Open
File List
  • No related files exist.