Nested Learning: The Illusion of Deep Learning Architectures Paper • 2512.24695 • Published 8 days ago • 30
DreamID-V: Bridging the Image-to-Video Gap for High-Fidelity Face Swapping via Diffusion Transformer Paper • 2601.01425 • Published 4 days ago • 37
Masking Teacher and Reinforcing Student for Distilling Vision-Language Models Paper • 2512.22238 • Published 15 days ago • 18
Quantile Rendering: Efficiently Embedding High-dimensional Feature on 3D Gaussian Splatting Paper • 2512.20927 • Published 15 days ago • 6
Yume-1.5: A Text-Controlled Interactive World Generation Model Paper • 2512.22096 • Published 12 days ago • 57
Coupling Experts and Routers in Mixture-of-Experts via an Auxiliary Loss Paper • 2512.23447 • Published 9 days ago • 93
LiveTalk: Real-Time Multimodal Interactive Video Diffusion via Improved On-Policy Distillation Paper • 2512.23576 • Published 9 days ago • 64
Emergent temporal abstractions in autoregressive models enable hierarchical reinforcement learning Paper • 2512.20605 • Published 15 days ago • 60
Spatia: Video Generation with Updatable Spatial Memory Paper • 2512.15716 • Published 21 days ago • 29
Infinite-Homography as Robust Conditioning for Camera-Controlled Video Generation Paper • 2512.17040 • Published 20 days ago • 27
4D-RGPT: Toward Region-level 4D Understanding via Perceptual Distillation Paper • 2512.17012 • Published 20 days ago • 42
Insight Miner: A Time Series Analysis Dataset for Cross-Domain Alignment with Natural Language Paper • 2512.11251 • Published 27 days ago • 6
StereoPilot: Learning Unified and Efficient Stereo Conversion via Generative Priors Paper • 2512.16915 • Published 20 days ago • 37
Depth Any Panoramas: A Foundation Model for Panoramic Depth Estimation Paper • 2512.16913 • Published 20 days ago • 33
LLaDA2.0: Scaling Up Diffusion Language Models to 100B Paper • 2512.15745 • Published 28 days ago • 78