
Jin Zhu

mamba413
https://mamba413.github.io/
  • Mamba413

AI & ML interests

reinforcement learning

Recent Activity

authored a paper 21 days ago
Robust Reinforcement Learning from Human Feedback for Large Language Models Fine-Tuning
upvoted a paper 21 days ago
Robust Reinforcement Learning from Human Feedback for Large Language Models Fine-Tuning
liked a dataset about 2 months ago
bookcorpus/bookcorpus

Organizations

Stats-powered AI

mamba413's models (10)

mamba413/Qwen2.5-1.5B-PPO-DR-HH-Seed1

2B • Updated Mar 21 • 5

mamba413/Qwen2.5-1.5B-PPO-BENCH-HH-Seed1

2B • Updated Mar 21 • 6

mamba413/Qwen2.5-1.5B-Instruct-Reward-BENCH-HH-Seed1

2B • Updated Mar 21 • 8

mamba413/Qwen2.5-1.5B-Instruct-Reward-BENCH-HH-Seed0

Updated Mar 20

mamba413/Qwen2.5-1.5B-Instruct-Reward-DR-HH-Seed0

Updated Mar 20

mamba413/Qwen2-0.5B-Reward-DR-HH-Seed0

Text Classification • 0.5B • Updated Mar 19 • 10

mamba413/Qwen2.5-1.5B-Reward-DR-IMDB-Seed0

Updated Mar 18

mamba413/Qwen2.5-1.5B-Reward-DR-SIMU-Seed0

Updated Mar 18

mamba413/Qwen2-0.5B-Reward-DR-SIMU-Seed0

Text Classification • 0.5B • Updated Mar 16 • 9

mamba413/Qwen2-0.5B-Reward-DR-SIMU

Text Classification • 0.5B • Updated Mar 15 • 13