A New Loss-Aware Proximal Policy Optimization Based on KL-Guided Multi-Updates

Ban, Tae Won

Full metadata record

DC Field	Value	Language
dc.contributor.author	Ban, Tae Won	-
dc.date.accessioned	2025-08-06T07:30:09Z	-
dc.date.available	2025-08-06T07:30:09Z	-
dc.date.issued	2025-07	-
dc.identifier.issn	2234-4772	-
dc.identifier.issn	2288-4165	-
dc.identifier.uri	https://scholarworks.gnu.ac.kr/handle/sw.gnu/79639	-
dc.description.abstract	This paper proposes LA-PPO-KL (Loss-Aware Proximal Policy Optimization with KL-guided Retrying), a novel reinforcement learning algorithm that improves the stability and efficiency of PPO. Unlike traditional PPO, which relies on static clipping or KL thresholds, LA-PPO-KL monitors both the policy loss and the value loss to decide when to retry a policy update. It also halts retries when the KL divergence exceeds a predefined limit, preventing excessive policy shifts. Experiments in the BipedalWalker-v3 environment demonstrate that LA-PPO-KL outperforms baseline PPO by 15-20% in average return, with faster convergence and more robust learning. These results highlight the potential of adaptive retry mechanisms for improving policy optimization in complex and uncertain environments.	-
dc.format.extent	4	-
dc.language	Korean	-
dc.language.iso	KOR	-
dc.publisher	Korea Institute of Information and Communication Engineering	-
dc.title	KL 유도 다중 업데이트 기반의 새로운 손실 인지 근접 정책 최적화 기법	-
dc.title.alternative	A New Loss-Aware Proximal Policy Optimization Based On KL-Guided Multi-Updates	-
dc.type	Article	-
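dc.publisher.location	Republic of Korea	-
dc.identifier.bibliographicCitation	Journal of the Korea Institute of Information and Communication Engineering, v.29, no.7, pp. 960-963	-
dc.citation.title	Journal of the Korea Institute of Information and Communication Engineering	-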
dc.citation.volume	29	-
dc.citation.number	7	-
dc.citation.startPage	960	-
dc.citation.endPage	963	-
dc.type.docType	Y	-
dc.identifier.kciid	ART003228585	-
dc.description.isOpenAccess	N	-
dc.description.journalRegisteredClass	kci	-
dc.subject.keywordAuthor	Proximal Policy Optimization	-
dc.subject.keywordAuthor	Adaptive Policy Update	-
dc.subject.keywordAuthor	Reinforcement Learning Stability	-
dc.subject.keywordAuthor	KL Divergence Control	-
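
The retry mechanism summarized in the abstract can be illustrated with a short control-flow sketch. This is not the paper's implementation: the record does not give the retry criterion, so the rule used here (retry while the sum of policy loss and value loss stays above `loss_threshold`) is an assumption, as are the names `kl_guided_retries`, `update_step`, `loss_threshold`, `kl_limit`, and `max_retries`. The only parts taken from the abstract are that both losses are monitored and that retries stop once the KL divergence exceeds a predefined limit.

```python
from typing import Callable, Iterable, Tuple

# One update returns (policy_loss, value_loss, kl_to_old_policy).
StepResult = Tuple[float, float, float]


def kl_guided_retries(
    update_step: Callable[[], StepResult],
    loss_threshold: float,
    kl_limit: float,
    max_retries: int,
) -> int:
    """Apply update_step at least once and return how many updates ran.

    Assumed rule (hypothetical, not from the paper): retry while
    policy_loss + value_loss > loss_threshold, but stop as soon as the
    KL divergence from the old policy exceeds kl_limit.
    """
    updates = 0
    for _ in range(1 + max_retries):
        policy_loss, value_loss, kl = update_step()
        updates += 1
        if kl > kl_limit:
            break  # policy moved too far; halt retries
        if policy_loss + value_loss <= loss_threshold:
            break  # losses are low enough; no retry needed
    return updates


def scripted_step(results: Iterable[StepResult]) -> Callable[[], StepResult]:
    """Stand-in for a real PPO update: replays pre-recorded results."""
    it = iter(results)
    return lambda: next(it)


if __name__ == "__main__":
    step = scripted_step([(0.5, 0.8, 0.001), (0.4, 0.6, 0.004), (0.1, 0.2, 0.006)])
    print(kl_guided_retries(step, loss_threshold=0.5, kl_limit=0.02, max_retries=5))
```

In a real training loop, `update_step` would run one PPO gradient update on the current batch and report the two losses together with the KL divergence between the updated policy and the policy that collected the data.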

Files in This Item: There are no files associated with this item.

Appears in Collections: ETC > Journal Articles

Related Researcher

Ban, Tae Won: College of IT Engineering (Department of AI Information Engineering)
