Andrew Zhao

andrewzh

https://andrewzh112.github.io/

AI & ML interests

Reinforcement Learning, Agents

Recent Activity

upvoted a paper 3 months ago

Vision-Zero: Scalable VLM Self-Improvement via Strategic Gamified Self-Play

upvoted a paper 3 months ago

GEM: A Gym for Agentic LLMs

Organizations

None yet

upvoted a paper 23 days ago

Native Parallel Reasoner: Reasoning in Parallelism via Self-Distilled Reinforcement Learning

Paper • 2512.07461 • Published 27 days ago • 74

upvoted 3 papers 3 months ago

upvoted a paper 4 months ago

A Survey of Reinforcement Learning for Large Reasoning Models

Paper • 2509.08827 • Published Sep 10, 2025 • 190

upvoted a paper 5 months ago

A Survey of Self-Evolving Agents: On Path to Artificial Super Intelligence

Paper • 2507.21046 • Published Jul 28, 2025 • 82

upvoted a paper 6 months ago

SPIRAL: Self-Play on Zero-Sum Games Incentivizes Reasoning via Multi-Agent Multi-Turn Reinforcement Learning

Paper • 2506.24119 • Published Jun 30, 2025 • 50

upvoted a paper 7 months ago

Beyond the 80/20 Rule: High-Entropy Minority Tokens Drive Effective Reinforcement Learning for LLM Reasoning

Paper • 2506.01939 • Published Jun 2, 2025 • 187

upvoted 2 papers 8 months ago

Seek in the Dark: Reasoning via Test-Time Instance-Level Policy Gradient in Latent Space

Paper • 2505.13308 • Published May 19, 2025 • 27

Absolute Zero: Reinforced Self-play Reasoning with Zero Data

Paper • 2505.03335 • Published May 6, 2025 • 188

upvoted a collection 8 months ago

Absolute Zero Reasoner

Collection

6 items • Updated May 9, 2025 • 56

upvoted 2 papers 9 months ago

CheXWorld: Exploring Image World Modeling for Radiograph Representation Learning

Paper • 2504.13820 • Published Apr 18, 2025 • 16

Does Reinforcement Learning Really Incentivize Reasoning Capacity in LLMs Beyond the Base Model?

Paper • 2504.13837 • Published Apr 18, 2025 • 139

upvoted a paper 10 months ago

ART: Anonymous Region Transformer for Variable Multi-Layer Transparent Image Generation

Paper • 2502.18364 • Published Feb 25, 2025 • 36

upvoted a paper 11 months ago

Building A Proof-Oriented Programmer That Is 64% Better Than GPT-4o Under Data Scarsity

Paper • 2502.11901 • Published Feb 17, 2025 • 6

upvoted 3 papers about 1 year ago

DeeR-VLA: Dynamic Inference of Multimodal Large Language Models for Efficient Robot Execution

Paper • 2411.02359 • Published Nov 4, 2024 • 13

How Far is Video Generation from World Model: A Physical Law Perspective

Paper • 2411.02385 • Published Nov 4, 2024 • 34

LLM-based Optimization of Compound AI Systems: A Survey

Paper • 2410.16392 • Published Oct 21, 2024 • 16

upvoted 2 papers over 1 year ago

Model Surgery: Modulating LLM's Behavior Via Simple Parameter Editing

Paper • 2407.08770 • Published Jul 11, 2024 • 21

Multimodal Needle in a Haystack: Benchmarking Long-Context Capability of Multimodal Large Language Models

Paper • 2406.11230 • Published Jun 17, 2024 • 33
