A Swapping Target Q-Value Technique for Data Augmentation in Offline Reinforcement Learning

Author(s)
Joo, Ho-Taek; Baek, In-Chang; Kim, Kyung-Joong
Type
Article
Citation
IEEE ACCESS, v.10, pp.57369 - 57382
Issued Date
2022-05
Abstract
Offline reinforcement learning (RL) is applied to fixed datasets of logged interactions in real-world domains such as healthcare, autonomous vehicles, and robotics. In such limited, fixed-dataset settings, data augmentation can help in developing better policies, and several online RL methods have recently used it to improve sample efficiency and generalization. Here, a novel, simple data-augmentation technique referred to as Swapping Target Q-Value (SQV) is introduced to enhance offline RL algorithms and enable robust pixel-based learning without auxiliary losses. Our method matches the current Q-value of a transformed image to the target Q-value of the next original image, while the current Q-value of the original image is matched to the target Q-value of the next transformed image. The proposed method treats similar states as the same state and pushes different states further apart, and it ties unseen states (absent from the dataset) to similar states that are present in the data. After training, these effects were observed to improve the performance of the offline RL algorithm. The method was tested on 23 games in the Atari 2600 domain and improved performance in 18 of the 23 games, with an average improvement of 144% over batch-constrained deep Q-learning (BCQ), a recent offline RL method. The implementation can be found at https://github.com/hotaekjoo/SQV.
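
To make the swapping step concrete, the following is a minimal PyTorch-style sketch of the idea as described in the abstract: the augmented view's current Q-value is regressed toward a target bootstrapped from the original next observation, and vice versa. The function name `sqv_loss`, its argument names, and the plain MSE temporal-difference loss are illustrative assumptions, not the paper's exact implementation (which builds on BCQ's constrained action selection); see the official repository for the authors' code.

```python
import torch
import torch.nn.functional as F

def sqv_loss(q_net, target_net, obs, aug_obs, action,
             reward, next_obs, next_aug_obs, done, gamma=0.99):
    # Hypothetical sketch of the swapped-target Q-value idea (not the paper's exact loss).
    # obs / aug_obs: original and augmented current observations, shape [B, ...]
    # next_obs / next_aug_obs: original and augmented next observations
    # action: long tensor [B]; reward, done: float tensors [B]

    # Current Q-values of the taken actions for both views of the current state.
    q_orig = q_net(obs).gather(1, action.unsqueeze(1)).squeeze(1)
    q_aug = q_net(aug_obs).gather(1, action.unsqueeze(1)).squeeze(1)

    with torch.no_grad():
        # Bootstrapped TD targets computed from each view of the next state
        # (a plain max-Q backup is assumed here for simplicity).
        target_from_orig = reward + gamma * (1.0 - done) * target_net(next_obs).max(1).values
        target_from_aug = reward + gamma * (1.0 - done) * target_net(next_aug_obs).max(1).values

    # Swap the pairings: augmented current <- original next target,
    # original current <- augmented next target.
    return F.mse_loss(q_aug, target_from_orig) + F.mse_loss(q_orig, target_from_aug)
```

In this sketch, the cross-pairing is what encourages the Q-function to assign similar values to an observation and its augmented counterpart, which is the effect the abstract attributes to SQV.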
Publisher
IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
ISSN
2169-3536
DOI
10.1109/ACCESS.2022.3178194
URI
https://scholar.gist.ac.kr/handle/local/10823
Access & License
  • Access type: Open
File List

Items in Repository are protected by copyright, with all rights reserved, unless otherwise indicated.