3 28 4

minghao

Liam-Liu

AI & ML interests

LLM, AD

Recent Activity

upvoted a paper about 2 months ago

Reasoning with Sampling: Your Base Model is Smarter Than You Think

upvoted a paper 2 months ago

ACADREASON: Exploring the Limits of Reasoning Models with Academic Research Problems

authored a paper 2 months ago

OAgents: An Empirical Study of Building Effective Agents

View all activity

Organizations

upvoted a paper about 2 months ago

Reasoning with Sampling: Your Base Model is Smarter Than You Think

Paper • 2510.14901 • Published Oct 16, 2025 • 47

upvoted 2 papers 2 months ago

ACADREASON: Exploring the Limits of Reasoning Models with Academic Research Problems

Paper • 2510.11652 • Published Oct 13, 2025 • 29

COIG-Writer: A High-Quality Dataset for Chinese Creative Writing with Thought Processes

Paper • 2510.14763 • Published Oct 16, 2025 • 13

upvoted 3 papers 3 months ago

SimKO: Simple Pass@K Policy Optimization

Paper • 2510.14807 • Published Oct 16, 2025 • 10

EditReward: A Human-Aligned Reward Model for Instruction-Guided Image Editing

Paper • 2509.26346 • Published Sep 30, 2025 • 18

DeepSearch: Overcome the Bottleneck of Reinforcement Learning with Verifiable Rewards via Monte Carlo Tree Search

Paper • 2509.25454 • Published Sep 29, 2025 • 141

upvoted 8 papers 4 months ago

Inverse IFEval: Can LLMs Unlearn Stubborn Training Conventions to Follow Real Instructions?

Paper • 2509.04292 • Published Sep 4, 2025 • 57

UI-TARS-2 Technical Report: Advancing GUI Agent with Multi-Turn Reinforcement Learning

Paper • 2509.02544 • Published Sep 2, 2025 • 124

SimpleTIR: End-to-End Reinforcement Learning for Multi-Turn Tool-Integrated Reasoning

Paper • 2509.02479 • Published Sep 2, 2025 • 83

The Landscape of Agentic Reinforcement Learning for LLMs: A Survey

Paper • 2509.02547 • Published Sep 2, 2025 • 228

A.S.E: A Repository-Level Benchmark for Evaluating Security in AI-Generated Code

Paper • 2508.18106 • Published Aug 25, 2025 • 347

upvoted 4 papers 5 months ago

Chain-of-Agents: End-to-End Agent Foundation Models via Multi-Agent Distillation and Agentic RL

Paper • 2508.13167 • Published Aug 6, 2025 • 129

GLM-4.5: Agentic, Reasoning, and Coding (ARC) Foundation Models

Paper • 2508.06471 • Published Aug 8, 2025 • 195

Efficient Agents: Building Effective Agents While Reducing Cost

Paper • 2508.02694 • Published Jul 24, 2025 • 86

VeriGUI: Verifiable Long-Chain GUI Dataset

Paper • 2508.04026 • Published Aug 6, 2025 • 161

upvoted 2 papers 6 months ago

A Survey on Latent Reasoning

Paper • 2507.06203 • Published Jul 8, 2025 • 93

CriticLean: Critic-Guided Reinforcement Learning for Mathematical Formalization

Paper • 2507.06181 • Published Jul 8, 2025 • 44

minghao

AI & ML interests

Recent Activity

Organizations

Liam-Liu's activity