InfiniDepth: Arbitrary-Resolution and Fine-Grained Depth Estimation with Neural Implicit Fields • Paper • 2601.03252 • Published 4 days ago • 93
GDPO: Group reward-Decoupled Normalization Policy Optimization for Multi-reward RL Optimization • Paper • 2601.05242 • Published 2 days ago • 118
Nested Learning: The Illusion of Deep Learning Architectures • Paper • 2512.24695 • Published 11 days ago • 33
A unified framework for detecting point and collective anomalies in operating system logs via collaborative transformers • Paper • 2512.23380 • Published 13 days ago • 43
UltraShape 1.0: High-Fidelity 3D Shape Generation via Scalable Geometric Refinement • Paper • 2512.21185 • Published 17 days ago • 28
SmartSnap: Proactive Evidence Seeking for Self-Verifying Agents • Paper • 2512.22322 • Published 15 days ago • 38
Stream-DiffVSR: Low-Latency Streamable Video Super-Resolution via Auto-Regressive Diffusion • Paper • 2512.23709 • Published 12 days ago • 48
Let It Flow: Agentic Crafting on Rock and Roll, Building the ROME Model within an Open Agentic Learning Ecosystem • Paper • 2512.24873 • Published 10 days ago • 94
SpotEdit: Selective Region Editing in Diffusion Transformers • Paper • 2512.22323 • Published 15 days ago • 37
Dream-VL & Dream-VLA: Open Vision-Language and Vision-Language-Action Models with Diffusion Language Model Backbone • Paper • 2512.22615 • Published 14 days ago • 43
Yume-1.5: A Text-Controlled Interactive World Generation Model • Paper • 2512.22096 • Published 15 days ago • 57
Youtu-LLM: Unlocking the Native Agentic Potential for Lightweight Large Language Models • Paper • 2512.24618 • Published 11 days ago • 126
LiveTalk: Real-Time Multimodal Interactive Video Diffusion via Improved On-Policy Distillation • Paper • 2512.23576 • Published 12 days ago • 64
Coupling Experts and Routers in Mixture-of-Experts via an Auxiliary Loss • Paper • 2512.23447 • Published 12 days ago • 93
InSight-o3: Empowering Multimodal Foundation Models with Generalized Visual Search • Paper • 2512.18745 • Published 20 days ago • 11