Fine-Grained Perturbation Guidance via Attention Head Selection Paper • 2506.10978 • Published Jun 12, 2025 • 25
From Statics to Dynamics: Physics-Aware Image Editing with Latent Transition Priors Paper • 2602.21778 • Published 7 days ago • 12
WikiAutoGen: Towards Multi-Modal Wikipedia-Style Article Generation Paper • 2503.19065 • Published Mar 24, 2025 • 11
From Reflection to Perfection: Scaling Inference-Time Optimization for Text-to-Image Diffusion Models via Reflection Tuning Paper • 2504.16080 • Published Apr 22, 2025 • 15
From Statics to Dynamics: Physics-Aware Image Editing with Latent Transition Priors Paper • 2602.21778 • Published 7 days ago • 12
From Statics to Dynamics: Physics-Aware Image Editing with Latent Transition Priors Paper • 2602.21778 • Published 7 days ago • 12
TAROT: Test-driven and Capability-adaptive Curriculum Reinforcement Fine-tuning for Code Generation with Large Language Models Paper • 2602.15449 • Published 15 days ago • 6
Lumina-OmniLV: A Unified Multimodal Framework for General Low-Level Vision Paper • 2504.04903 • Published Apr 7, 2025
Factuality Matters: When Image Generation and Editing Meet Structured Visuals Paper • 2510.05091 • Published Oct 6, 2025 • 20
Lumina-DiMOO: An Omni Diffusion Large Language Model for Multi-Modal Generation and Understanding Paper • 2510.06308 • Published Oct 7, 2025 • 55
PICABench: How Far Are We from Physically Realistic Image Editing? Paper • 2510.17681 • Published Oct 20, 2025 • 64
Factuality Matters: When Image Generation and Editing Meet Structured Visuals Paper • 2510.05091 • Published Oct 6, 2025 • 20
Skywork UniPic 2.0: Building Kontext Model with Online RL for Unified Multimodal Model Paper • 2509.04548 • Published Sep 4, 2025 • 5
Inference-Time Alignment Control for Diffusion Models with Reinforcement Learning Guidance Paper • 2508.21016 • Published Aug 28, 2025
PixWizard: Versatile Image-to-Image Visual Assistant with Open-Language Instructions Paper • 2409.15278 • Published Sep 23, 2024 • 24