RL

an archive of posts with this tag

Apr 12, 2025	DAPO Instead of GRPO: Boosting LLM Reasoning Ability with RL Optimization
Jan 30, 2025	DeepSeek-R1, the Chinese Reasoning Model That Beats o1

© Copyright 2026 Charlie Hwang. Powered by Jekyll with al-folio theme. Hosted by GitHub Pages. Photos from Unsplash. Last updated: April 10, 2026.