SE-Bench: Benchmarking Self-Evolution with Knowledge Internalization Paper • 2602.04811 • Published 17 days ago • 2
UM-Text: A Unified Multimodal Model for Image Understanding Paper • 2601.08321 • Published Jan 13 • 10
From RAG to Agentic RAG for Faithful Islamic Question Answering Paper • 2601.07528 • Published Jan 12 • 1
Prototypicality Bias Reveals Blindspots in Multimodal Evaluation Metrics Paper • 2601.04946 • Published Jan 8
ResTok: Learning Hierarchical Residuals in 1D Visual Tokenizers for Autoregressive Image Generation Paper • 2601.03955 • Published Jan 7 • 3
FlowBlending: Stage-Aware Multi-Model Sampling for Fast and High-Fidelity Video Generation Paper • 2512.24724 • Published Dec 31, 2025 • 7
Dream2Flow: Bridging Video Generation and Open-World Manipulation with 3D Object Flow Paper • 2512.24766 • Published Dec 31, 2025 • 9
What matters for Representation Alignment: Global Information or Spatial Structure? Paper • 2512.10794 • Published Dec 11, 2025 • 9
ThreadWeaver: Adaptive Threading for Efficient Parallel Reasoning in Language Models Paper • 2512.07843 • Published Nov 24, 2025 • 22
Open ASR Leaderboard: Towards Reproducible and Transparent Multilingual and Long-Form Speech Recognition Evaluation Paper • 2510.06961 • Published Oct 8, 2025 • 11
BigCodeArena: Unveiling More Reliable Human Preferences in Code Generation via Execution Paper • 2510.08697 • Published Oct 9, 2025 • 39
Distributional Semantics Tracing: A Framework for Explaining Hallucinations in Large Language Models Paper • 2510.06107 • Published Oct 7, 2025 • 3
Post 2286 Gradio 6.0 is launching this year! We're revamping the core to give you performance improvements and unprecedented customization. Build better, faster. Check out the GitHub milestone to learn what's planned under the hood! https://github.com/gradio-app/gradio/issues?q=is:issue%20state:open%20milestone:%22Gradio%206%22
Post 4107 The new multimodalart/self-forcing model and demo are truly impressive!
Leveraging Vision-Language Pre-training for Human Activity Recognition in Still Images Paper • 2506.13458 • Published Jun 16, 2025