CelesteChen's Collections
reasoning
updated
Large Language Models Can Self-Improve in Long-context Reasoning
Paper
•
2411.08147
•
Published
•
65
Reverse Thinking Makes LLMs Stronger Reasoners
Paper
•
2411.19865
•
Published
•
23
Training Large Language Models to Reason in a Continuous Latent Space
Paper
•
2412.06769
•
Published
•
92
HuatuoGPT-o1, Towards Medical Complex Reasoning with LLMs
Paper
•
2412.18925
•
Published
•
106
ChemAgent: Self-updating Library in Large Language Models Improves Chemical Reasoning
Paper
•
2501.06590
•
Published
•
11
O1-Pruner: Length-Harmonizing Fine-Tuning for O1-Like Reasoning Pruning
Paper
•
2501.12570
•
Published
•
28
Pairwise RM: Perform Best-of-N Sampling with Knockout Tournament
Paper
•
2501.13007
•
Published
•
19
Agent-R: Training Language Model Agents to Reflect via Iterative Self-Training
Paper
•
2501.11425
•
Published
•
109
Step-KTO: Optimizing Mathematical Reasoning through Stepwise Binary Feedback
Paper
•
2501.10799
•
Published
•
15
Process Reinforcement through Implicit Rewards
Paper
•
2502.01456
•
Published
•
61
One Example Shown, Many Concepts Known! Counterexample-Driven Conceptual Reasoning in Mathematical LLMs
Paper
•
2502.10454
•
Published
•
7
Large Language Models and Mathematical Reasoning Failures
Paper
•
2502.11574
•
Published
•
3
PhysReason: A Comprehensive Benchmark towards Physics-Based Reasoning
Paper
•
2502.12054
•
Published
•
7
LightThinker: Thinking Step-by-Step Compression
Paper
•
2502.15589
•
Published
•
31
OpenCodeReasoning: Advancing Data Distillation for Competitive Coding
Paper
•
2504.01943
•
Published
•
15
MolmoAct: Action Reasoning Models that can Reason in Space
Paper
•
2508.07917
•
Published
•
44
StepWiser: Stepwise Generative Judges for Wiser Reasoning
Paper
•
2508.19229
•
Published
•
20
Beyond Pass@1: Self-Play with Variational Problem Synthesis Sustains RLVR
Paper
•
2508.14029
•
Published
•
118