DreamVideo-Omni: Omni-Motion Controlled Multi-Subject Video Customization with Latent Identity Reinforcement Learning Paper • 2603.12257 • Published 2 days ago • 24
Penguin-VL: Exploring the Efficiency Limits of VLM with LLM-based Vision Encoders Paper • 2603.06569 • Published 8 days ago • 104
ShowTable: Unlocking Creative Table Visualization with Collaborative Reflection and Refinement Paper • 2512.13303 • Published Dec 15, 2025 • 17
Routing Matters in MoE: Scaling Diffusion Transformers with Explicit Routing Guidance Paper • 2510.24711 • Published Oct 28, 2025 • 20
VideoLLaMA 3: Frontier Multimodal Foundation Models for Image and Video Understanding Paper • 2501.13106 • Published Jan 22, 2025 • 90
Lyra: An Efficient and Speech-Centric Framework for Omni-Cognition Paper • 2412.09501 • Published Dec 12, 2024 • 48