Daixuan Cheng

daixuancheng

https://cdxeve.github.io

DaixuanC45443

AI & ML interests

I study LLMs, from pre-training to agents.

Recent Activity

upvoted a collection 3 days ago

liked a model 4 days ago

daixuancheng/Qwen3-4B-Instruct-2507-LLM-in-Sandbox-RL

liked a dataset 4 days ago

daixuancheng/llm-in-sandbox-rl

Organizations

None yet

upvoted a collection 3 days ago

LLM-in-Sandbox

Data and models for the paper: LLM-in-Sandbox Elicits General Agentic Intelligence. Feel free to open an issue if you have any questions or problems! • 3 items • Updated 3 days ago • 1

upvoted 2 papers 23 days ago

SWE-Master: Unleashing the Potential of Software Engineering Agents via Post-Training

Paper • 2602.03411 • Published 24 days ago • 37

SWE-World: Building Software Engineering Agents in Docker-Free Environments

Paper • 2602.03419 • Published 24 days ago • 40

upvoted a paper 24 days ago

Adaptive Ability Decomposing for Unlocking Large Reasoning Model Effective Reinforcement Learning

Paper • 2602.00759 • Published 27 days ago • 5

upvoted 11 collections about 1 month ago

Agentic

14 items • Updated Jan 24 • 2

Agents

12 items • Updated Jan 25 • 1

AI-papers

5 items • Updated Jan 24 • 1

Ai-general

50 items • Updated 29 days ago • 3

Agent

98 items • Updated about 12 hours ago • 11

2026

174 items • Updated 18 days ago • 3

Coding

3 items • Updated 22 days ago • 1

Agents

13 items • Updated 17 days ago • 4

Training-Free

3 items • Updated 28 days ago • 1

LLM

12 items • Updated 17 days ago • 3

Agent

48 items • Updated about 19 hours ago • 3

upvoted a paper about 1 month ago

LLM-in-Sandbox Elicits General Agentic Intelligence

Paper • 2601.16206 • Published Jan 22 • 84

upvoted 2 papers 4 months ago

Agentic Entropy-Balanced Policy Optimization

Paper • 2510.14545 • Published Oct 16, 2025 • 106

BitNet Distillation

Paper • 2510.13998 • Published Oct 15, 2025 • 59

upvoted a paper 5 months ago

FlowRL: Matching Reward Distributions for LLM Reasoning

Paper • 2509.15207 • Published Sep 18, 2025 • 116

upvoted a paper 7 months ago

Agentic Reinforced Policy Optimization

Paper • 2507.19849 • Published Jul 26, 2025 • 158