Papers
Seeing the Forest and the Trees: Query-Aware Tokenizer for Long-Video Multimodal Language Models
$μ^2$Tokenizer: Differentiable Multi-Scale Multi-Modal Tokenizer for Radiology Report Generation
Organization Card
Hey there! Welcome to our team's corner at Hugging Face! We're a bunch of enthusiastic folks who are totally into the exciting world of Multimodal Large Language Models.
Our research explores innovative ways to enhance interactions between language and image, video, and audio, aiming to advance the capabilities of AI in understanding and generating multimodal content.
We're a curious bunch, always on the lookout for cool ways to make AI systems understand and generate human-like language.
Official models and datasets for the paper μ²Tokenizer (https://arxiv.org/abs/2507.00316)
Models (8)
AlpachinoNLP/u2Qwen3-1.7B-Instruct • Image-to-Text • 21
AlpachinoNLP/u2Qwen3-4B-Instruct • Image-to-Text • 9 • 1
AlpachinoNLP/LongCLIP-ViT-B-32 • Zero-Shot Image Classification • 0.2B • 20
AlpachinoNLP/QTSplus-3B • Image-Text-to-Text • 1 • 1
AlpachinoNLP/QTSplus-7B • Image-Text-to-Text • 2 • 1
AlpachinoNLP/QTSplus-3B-FT • Image-Text-to-Text • 3 • 1
AlpachinoNLP/Baichuan-7B-Instruction • Text Generation • 7B • 6 • 2
AlpachinoNLP/Baichuan-13B-Instruction • Text Generation • 11 • 6