OAK

GIST Library Login

GIST Scholar College of Information and Computing Department of AI Convergence 3. Theses(Master)

State-wise Safety in Autonomous Driving via Lagrangian-based Constrained Reinforcement Learning

Metadata Downloads

Author(s): Minseok Seo

Type: Thesis

Degree: Master

Department: 대학원 AI대학원

Advisor: Kim, Uehwan

Abstract: For the practical deployment of autonomous driving systems, high levels of safety and adaptability are essential. Accordingly, Deep Reinforcement Learning (DRL), which learns and improves driving strategies through trial and error, has gained attention. However, the reward-driven nature of reinforcement learning may still lead to unsafe or abnormal behavior even after training. To address this limitation, Constrained Reinforcement Learning (CRL) has been proposed to balance safety and performance. While CRL typically defines constraints as expected cumulative costs, this formulation does not consider whether constraints are satisfied at each state, making it difficult to ensure state-wise safety. In this paper, we extend a Lagrangian-based CRL approach by estimating state-wise Lagrangian multipliers, allowing the policy to account for state-level safety. We evaluate the proposed method in OpenAI's Safety Gym environment and compare its performance with existing Lagrangian-based methods.|자율주행 시스템의 실제 적용을 위해서는 높은 안정성과 적응성이 요구된다. 이에 따라, 시행착오를 통해 주행 전략을 학습하며 발전시키는 심층 강화 학습(Deep Reinforcement Learning, DRL)이 주목받고 있다. 하지만 강화 학습은 본질적으로 보상을 극대화하는 방향으로 정책을 학습하기 때문에, 학습 후에도 안전하지 않거나 비정상적인 행동을 할 가능성을 완전히 배제하기 어렵다. 이러한 한계를 해결하기 위해, 정책 학습 시 안정성과 성능 간의 균형을 도모하는 제약 강화 학습(Constrained Reinforcement Learning, CRL)이 제안되었다. 제약 강화 학습은 기댓값 기반 누적 비용 형태의 제약 조건을 만족하도록 정책을 학습하지만, 각 상태에서의 제약 조건 충족 여부를 고려하지 않아 상태별 안정성을 보장하기 어렵다. 본 논문에서는 제약 강화 학습의 한 방식인 라그랑지안 기반의 방법을 확장하여, 상태별 라그랑주 승수를 추정함으로써 정책이 상태별 안정성을 고려하도록 한다. 또한 제안한 방법을 OpenAI의 시뮬레이션 환경인 Safety Gym을 통해 기존 라그랑지안 기반의 방법들과 비교하여 검증하였다.

URI: https://scholar.gist.ac.kr/handle/local/31955

Fulltext: http://gist.dcollection.net/common/orgView/200000901562

Alternative Author(s): 서민석

Appears in Collections:: Department of AI Convergence > 3. Theses(Master)

메타데이터 간략히 보기메타데이터 전체 보기

공개 및 라이선스

공개 구분공개

qrcode

트윗하기

OAK GIST Scholar는 국립중앙도서관 OAK Repository 보급사업으로 구축되었습니다.