Nondominated Policy-Guided Learning in Multi-Objective Reinforcement Learning

Abstract
Control is a typical field in which there are trade-offs between target objectives, and researchers in this field have long sought artificial intelligence that achieves those objectives. Multi-objective deep reinforcement learning has emerged to satisfy this need; in particular, multi-objective deep reinforcement learning methods based on policy optimization are leading the optimization of control intelligence. However, multi-objective reinforcement learning has difficulty finding diverse Pareto-optimal solutions because of the greedy nature of reinforcement learning. We propose a policy-assimilation method to address this problem and apply it to MO-V-MPO, a preference-based multi-objective reinforcement learning algorithm, to increase diversity. The performance of the method is verified through experiments in a continuous control environment. © 2022 by the authors. Licensee MDPI, Basel, Switzerland.
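As background for the "nondominated" policies named in the title: a policy's reward vector is Pareto-nondominated if no other vector is at least as good on every objective and strictly better on at least one. The following is a minimal illustrative sketch of such a filter over objective vectors (maximization assumed); it is generic background, not the authors' policy-assimilation method or the MO-V-MPO algorithm.

```python
def nondominated(points):
    """Return the Pareto-nondominated subset of objective vectors (maximization).

    A point is kept unless some other point is at least as good on every
    objective and strictly better on at least one.
    """
    def dominates(a, b):
        # a dominates b: a >= b everywhere and a > b somewhere
        return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))

    return [p for p in points if not any(dominates(q, p) for q in points if q != p)]


# Example: three points trade off the two objectives; (1.5, 1.5) is
# dominated by (2.0, 2.0) and is filtered out.
front = nondominated([(1.0, 3.0), (2.0, 2.0), (3.0, 1.0), (1.5, 1.5)])
# → [(1.0, 3.0), (2.0, 2.0), (3.0, 1.0)]
```

The set returned by this filter approximates the Pareto front; the greediness problem the abstract mentions is that standard policy optimization tends to collapse onto a few such points rather than covering the front diversely.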
Author(s)
Kim, M.-J.; Park, H.; Ahn, Chang Wook
Issued Date
2022-04
Type
Article
DOI
10.3390/electronics11071069
URI
https://scholar.gist.ac.kr/handle/local/10858
Publisher
MDPI
Citation
Electronics (Switzerland), v.11, no.7
ISSN
2079-9292
Appears in Collections:
Department of AI Convergence > 1. Journal Articles
Access and License
  • Access type: Open
File List

Items in Repository are protected by copyright, with all rights reserved, unless otherwise indicated.