HanSaem Kim

kensaem

AI & ML interests

None yet

Organizations

None yet

upvoted 3 papers about 7 hours ago

HyperAlign: Hypernetwork for Efficient Test-Time Alignment of Diffusion Models

Paper • 2601.15968 • Published 7 days ago • 4

AR-Omni: A Unified Autoregressive Model for Any-to-Any Generation

Paper • 2601.17761 • Published 4 days ago • 10

iFSQ: Improving FSQ for Image Generation with 1 Line of Code

Paper • 2601.17124 • Published 6 days ago • 30

upvoted 2 papers 2 days ago

SkyReels-V3 Technique Report

Paper • 2601.17323 • Published 5 days ago • 7

Self-Refining Video Sampling

Paper • 2601.18577 • Published 3 days ago • 21

upvoted 3 papers 3 days ago

Memory-V2V: Augmenting Video-to-Video Diffusion Models with Memory

Paper • 2601.16296 • Published 7 days ago • 25

LongCat-Flash-Thinking-2601 Technical Report

Paper • 2601.16725 • Published 6 days ago • 158

HERMES: KV Cache as Hierarchical Memory for Efficient Streaming Video Understanding

Paper • 2601.14724 • Published 8 days ago • 72

upvoted 5 papers 6 days ago

SAMTok: Representing Any Mask with Two Words

Paper • 2601.16093 • Published 7 days ago • 40

Qwen3-TTS Technical Report

Paper • 2601.15621 • Published 7 days ago • 49

OpenVision 3: A Family of Unified Visual Encoder for Both Understanding and Generation

Paper • 2601.15369 • Published 8 days ago • 18

Scaling Text-to-Image Diffusion Transformers with Representation Autoencoders

Paper • 2601.16208 • Published 7 days ago • 51

LightOnOCR: A 1B End-to-End Multilingual Vision-Language Model for State-of-the-Art OCR

Paper • 2601.14251 • Published 9 days ago • 23

upvoted a paper 7 days ago

OmniTransfer: All-in-one Framework for Spatio-temporal Video Transfer

Paper • 2601.14250 • Published 9 days ago • 44

upvoted a paper 9 days ago

CoDance: An Unbind-Rebind Paradigm for Robust Multi-Subject Animation

Paper • 2601.11096 • Published 13 days ago • 8

upvoted 4 papers 10 days ago

Action100M: A Large-scale Video Action Dataset

Paper • 2601.10592 • Published 14 days ago • 27

Molmo2: Open Weights and Data for Vision-Language Models with Video Understanding and Grounding

Paper • 2601.10611 • Published 14 days ago • 26

Transition Matching Distillation for Fast Video Generation

Paper • 2601.09881 • Published 15 days ago • 32

STEP3-VL-10B Technical Report

Paper • 2601.09668 • Published 15 days ago • 189

upvoted a collection 10 days ago

FLUX.2

Our second generation of FLUX • 17 items • Updated 10 days ago • 115