We Got Claude to Build CUDA Kernels and teach open models! Article • 6 days ago • 108
LTX-2: Efficient Joint Audio-Visual Foundation Model Paper • 2601.03233 • Published 27 days ago • 145
OpenVoxel: Training-Free Grouping and Captioning Voxels for Open-Vocabulary 3D Scene Understanding Paper • 2601.09575 • Published 19 days ago • 25
GDPO: Group reward-Decoupled Normalization Policy Optimization for Multi-reward RL Optimization Paper • 2601.05242 • Published 25 days ago • 215
Thinking with Map: Reinforced Parallel Map-Augmented Agent for Geolocalization Paper • 2601.05432 • Published 25 days ago • 165
InfiniDepth: Arbitrary-Resolution and Fine-Grained Depth Estimation with Neural Implicit Fields Paper • 2601.03252 • Published 27 days ago • 101
EgoX: Egocentric Video Generation from a Single Exocentric Video Paper • 2512.08269 • Published Dec 9, 2025 • 119
MoCapAnything: Unified 3D Motion Capture for Arbitrary Skeletons from Monocular Videos Paper • 2512.10881 • Published Dec 11, 2025 • 30
Agentic Context Engineering: Evolving Contexts for Self-Improving Language Models Paper • 2510.04618 • Published Oct 6, 2025 • 129
Composing Concepts from Images and Videos via Concept-prompt Binding Paper • 2512.09824 • Published Dec 10, 2025 • 28
OmniPSD: Layered PSD Generation with Diffusion Transformer Paper • 2512.09247 • Published Dec 10, 2025 • 47
UnityVideo: Unified Multi-Modal Multi-Task Learning for Enhancing World-Aware Video Generation Paper • 2512.07831 • Published Dec 8, 2025 • 17
Preserving Source Video Realism: High-Fidelity Face Swapping for Cinematic Quality Paper • 2512.07951 • Published Dec 8, 2025 • 50
Wan-Move: Motion-controllable Video Generation via Latent Trajectory Guidance Paper • 2512.08765 • Published Dec 9, 2025 • 132
RealGen: Photorealistic Text-to-Image Generation via Detector-Guided Rewards Paper • 2512.00473 • Published Nov 29, 2025 • 26