Yuan Pu

puyuan1996

AI & ML interests

None yet

Recent Activity

upvoted a paper 3 days ago

MetaphorStar: Image Metaphor Understanding and Reasoning with End-to-End Visual Reinforcement Learning

upvoted a paper about 1 month ago

One Model for All Tasks: Leveraging Efficient World Models in Multi-Task Planning

published a dataset 4 months ago

puyuan1996/knowledge_orm_qwen25vl_72B_20251017

Organizations

upvoted a paper 3 days ago

MetaphorStar: Image Metaphor Understanding and Reasoning with End-to-End Visual Reinforcement Learning

Paper • 2602.10575 • Published 6 days ago • 4

upvoted a paper about 1 month ago

One Model for All Tasks: Leveraging Efficient World Models in Multi-Task Planning

Paper • 2509.07945 • Published Sep 9, 2025 • 1

published 4 datasets 4 months ago

updated a dataset 4 months ago

puyuan1996/7b_ppo_jericho_1p8k_his4_iter_0_45_90_180

Preview • Updated Oct 16, 2025 • 1

published a dataset 4 months ago

puyuan1996/7b_ppo_jericho_1p8k_his4_iter_0_45_90_180

Preview • Updated Oct 16, 2025 • 1

updated a dataset 5 months ago

puyuan1996/orz_0p5b_ppo_150step

Preview • Updated Sep 29, 2025 • 7

published a dataset 5 months ago

puyuan1996/orz_0p5b_ppo_150step

Preview • Updated Sep 29, 2025 • 7

updated a dataset 6 months ago

puyuan1996/junyu_backup20250903

Updated Sep 3, 2025 • 1

published a dataset 6 months ago

puyuan1996/junyu_backup20250903

Updated Sep 3, 2025 • 1

updated 3 datasets 12 months ago

puyuan1996/unizero_mt_moco_dmc8_concat_task_embed_nlayer8_20250221

Preview • Updated Feb 21, 2025 • 6

puyuan1996/unizero_mt_dmc18_concat_task_embed_nlayer8_20250221

Preview • Updated Feb 21, 2025

puyuan1996/unizero_mt_atari8_concat_task_embed_nlayer8_20250221

Preview • Updated Feb 21, 2025

published 3 datasets 12 months ago

puyuan1996/unizero_mt_moco_dmc8_concat_task_embed_nlayer8_20250221

Preview • Updated Feb 21, 2025 • 6

puyuan1996/unizero_mt_dmc18_concat_task_embed_nlayer8_20250221

Preview • Updated Feb 21, 2025

puyuan1996/unizero_mt_atari8_concat_task_embed_nlayer8_20250221

Preview • Updated Feb 21, 2025

upvoted an article over 1 year ago

Illustrating Reinforcement Learning from Human Feedback (RLHF)

Article • Dec 9, 2022 • 401

liked a Space almost 2 years ago

ZeroPal

📖

Ask questions about LightZero and get detailed answers
