Papers
Making Dialogue Grounding Data Rich: A Three-Tier Data Synthesis Framework for Generalized Referring Expression Comprehension
Seeing the Forest and the Trees: Query-Aware Tokenizer for Long-Video Multimodal Language Models
Organization Card
Hey 👋! Welcome to our team's corner at HuggingFace! We're a bunch of enthusiastic folks who are totally into the exciting world of Multimodal Large Language Models.
Our research explores new ways to connect language with images, video, and audio, aiming to advance AI's ability to understand and generate multimodal content.
We're a curious bunch, always on the lookout for cool ways to make AI systems understand and generate human-like language.
Official models and datasets for the μ²Tokenizer paper (https://arxiv.org/abs/2507.00316).
models 12
AlpachinoNLP/u2Qwen3-4B-Thinking • Image-to-Text • 40
AlpachinoNLP/QTSplus-LLaVA-Video-7B-Qwen2 • Image-Text-to-Text
AlpachinoNLP/QTSplus-Qwen2.5-VL-7B • Image-Text-to-Text
AlpachinoNLP/QTSplus-InternVL2.5-8B • Image-Text-to-Text • 6
AlpachinoNLP/u2Qwen3-4B-Instruct • Image-to-Text • 4 • 1
AlpachinoNLP/u2Qwen3-1.7B-Instruct • Image-to-Text • 544 • 1
AlpachinoNLP/LongCLIP-ViT-B-32 • Zero-Shot Image Classification • 0.2B • 325 • 1
AlpachinoNLP/QTSplus-3B • Image-Text-to-Text • 1
AlpachinoNLP/QTSplus-7B • Image-Text-to-Text • 2 • 1
AlpachinoNLP/QTSplus-3B-FT • Image-Text-to-Text • 1 • 1
datasets 5
AlpachinoNLP/CC_SBU_High_Quality_Single_Choice
AlpachinoNLP/CC_SBU_High_Quality_Caption • 6.46M • 9
AlpachinoNLP/QTSplus-Dataset • 522 • 1
AlpachinoNLP/CT-RATE-Chinese • 50.2k • 8
AlpachinoNLP/CT-RATE-Mini • 77 • 1