StereoAdapter-2: Globally Structure-Consistent Underwater Stereo Depth Estimation • Paper • 2602.16915 • Published 9 days ago
MoRL: Reinforced Reasoning for Unified Motion Understanding and Generation • Paper • 2602.14534 • Published 12 days ago • 3
Light4D: Training-Free Extreme Viewpoint 4D Video Relighting • Paper • 2602.11769 • Published 16 days ago • 2
Code2Worlds: Empowering Coding LLMs for 4D World Generation • Paper • 2602.11757 • Published 16 days ago • 4
GeneralVLA: Generalizable Vision-Language-Action Models with Knowledge-Guided Trajectory Planning • Paper • 2602.04315 • Published 24 days ago • 1
PixelGen: Pixel Diffusion Beats Latent Diffusion with Perceptual Loss • Paper • 2602.02493 • Published 25 days ago • 42
CoF-T2I: Video Models as Pure Visual Reasoners for Text-to-Image Generation • Paper • 2601.10061 • Published Jan 15 • 31
DocDancer: Towards Agentic Document-Grounded Information Seeking • Paper • 2601.05163 • Published Jan 8 • 5
MDAgent2: Large Language Model for Code Generation and Knowledge Q&A in Molecular Dynamics • Paper • 2601.02075 • Published Jan 5 • 8
Learning from Next-Frame Prediction: Autoregressive Video Modeling Encodes Effective Representations • Paper • 2512.21004 • Published Dec 24, 2025 • 13
VABench: A Comprehensive Benchmark for Audio-Video Generation • Paper • 2512.09299 • Published Dec 10, 2025 • 8
Scone: Bridging Composition and Distinction in Subject-Driven Image Generation via Unified Understanding-Generation Modeling • Paper • 2512.12675 • Published Dec 14, 2025 • 41
From Next-Token to Next-Block: A Principled Adaptation Path for Diffusion LLMs • Paper • 2512.06776 • Published Dec 7, 2025 • 26