Method of Learning an Accurate State Transition Dynamics Model by Fitting both a Function and Its Derivative Simultaneously

Author(s): Youngho Kim

Type: Thesis

Degree: Doctor

Department: Graduate School, School of Integrated Technology (Intelligent Robotics Program)

Advisor: Ryu, Jeha

Abstract: An accurate state transition dynamics model is an essential component of model-based controllers such as model-based reinforcement learning (MBRL) and model predictive control (MPC). With an accurate state transition dynamics model, simulation can be performed without real interaction, and important business decisions can be made by predicting future states. For example, in a smart factory, the entire manufacturing process can be simulated realistically to design the process sequence and processing times at low cost, without time and space constraints. However, if the model is not accurate, it can lead to wrong decisions.
Traditionally, an accurate model is obtained through analytic modeling, but this is difficult because complex non-commercial robots have very complex dynamic models. Recently, learning a state transition dynamics model from collected data, such as control inputs and the resulting trajectories, has become widely used. However, it is difficult to learn an accurate state transition dynamics model in the real world. For example, if a robot moves randomly in a real environment, it may reach singularities or joint limits, causing unexpected outcomes such as failure, wear, and collisions. In such cases, an efficient data acquisition method and a data-efficient learning method are necessary.
In the field of function approximation, derivative learning, which uses function values and their derivatives simultaneously, has been studied as a way to improve model accuracy; however, it has never been applied to state transition dynamics model learning. Previous state transition dynamics model learning methods predict the next states given the current states and actions, and show poor prediction accuracy when the dataset is small. Therefore, this thesis proposes a novel MBRL method that uses derivative information such as velocity and acceleration as ground-truth derivatives to improve sample efficiency and prediction accuracy. The proposed method reduced the prediction error of the existing method by about 85% in the virtual environment and by about 34% in the real environment. Moreover, the more accurate state transition dynamics model also performs better in goal-reaching task experiments with obstacle avoidance or manipulability maximization.
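The idea of fitting a function and its derivative at the same time can be illustrated with a small sketch. This is not the thesis's implementation: it assumes a cubic polynomial model in place of a neural network, a hypothetical one-dimensional target, and illustrative names (`features`, `target`, `fit`, `predict`, `deriv_weight`). The loss is the squared error on function values plus a weighted squared error on derivatives, minimized by plain gradient descent.

```python
def features(x):
    # Cubic polynomial basis and its derivative with respect to x.
    phi = [1.0, x, x * x, x ** 3]
    dphi = [0.0, 1.0, 2.0 * x, 3.0 * x * x]
    return phi, dphi


def target(x):
    # Hypothetical ground truth (an assumption for illustration):
    # returns the function value and its derivative at x.
    return 0.5 - x + 2.0 * x ** 3, -1.0 + 6.0 * x * x


def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))


def predict(w, x):
    phi, _ = features(x)
    return dot(w, phi)


def fit(xs, deriv_weight, steps=5000, lr=0.02):
    # Gradient descent on
    #   0.5 * sum_i [(f(x_i) - y_i)^2 + deriv_weight * (f'(x_i) - y'_i)^2]
    # starting from zero weights.
    w = [0.0] * 4
    for _ in range(steps):
        grad = [0.0] * 4
        for x in xs:
            phi, dphi = features(x)
            y, dy = target(x)
            err_f = dot(w, phi) - y    # error on the function value
            err_d = dot(w, dphi) - dy  # error on the derivative
            for k in range(4):
                grad[k] += err_f * phi[k] + deriv_weight * err_d * dphi[k]
        w = [wk - lr * gk for wk, gk in zip(w, grad)]
    return w


samples = [-1.0, 1.0]  # deliberately tiny dataset
w_values_only = fit(samples, deriv_weight=0.0)
w_with_derivs = fit(samples, deriv_weight=1.0)

x_test = 0.5
y_test, _ = target(x_test)
print("values only :", abs(predict(w_values_only, x_test) - y_test))
print("with derivs :", abs(predict(w_with_derivs, x_test) - y_test))
```

With only two samples, the value-only fit is underdetermined: many cubics pass through both points, and gradient descent settles on one that misses the held-out point. Adding the two derivative targets gives four constraints for four coefficients, so in this toy setting the same two samples are enough to recover the target.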
Finally, the type of activation function plays an important role in the derivative learning process in terms of accuracy and convergence. However, the effects of various activation functions on derivative learning have never been investigated. Therefore, the derivative characteristics that make an activation function well suited to derivative learning are analyzed, and several experiments are conducted to compare performance across activation functions. In these experiments, the swish activation function shows the best performance.
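The abstract does not list which derivative characteristics were analyzed, so the following sketch only shows the standard definition of swish, swish(x) = x * sigmoid(x), with its closed-form derivative, next to the ReLU derivative for contrast. The helper names (`swish_grad`, `relu_grad`, `central_diff`) are illustrative assumptions, and the comparison is not a reproduction of the thesis's experiments.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def swish(x):
    # swish(x) = x * sigmoid(x)
    return x * sigmoid(x)


def swish_grad(x):
    # d/dx [x * s(x)] = s(x) + x * s(x) * (1 - s(x)), with s = sigmoid
    s = sigmoid(x)
    return s + x * s * (1.0 - s)


def relu_grad(x):
    # Derivative of ReLU: a step function (value at 0 set to 0 by convention).
    return 1.0 if x > 0 else 0.0


def central_diff(f, x, h=1e-5):
    # Numerical derivative, used only to sanity-check the closed form.
    return (f(x + h) - f(x - h)) / (2.0 * h)


for x in (-2.0, -0.5, 0.0, 1.0, 3.0):
    print(x, swish_grad(x), central_diff(swish, x), relu_grad(x))
```

The swish derivative varies smoothly with the input, whereas the ReLU derivative is piecewise constant. When a network's input derivative is itself a training target, a piecewise-constant derivative gives the derivative loss less to work with, which is one plausible reason a smooth activation could help.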

URI: https://scholar.gist.ac.kr/handle/local/19488

Fulltext: http://gist.dcollection.net/common/orgView/200000883101

Alternative Author(s): 김영호

Appears in Collections: Department of AI Convergence > 4. Theses(Ph.D)

Access: Open
