Health AI Developer Foundations (HAI-DEF) Collection Groups models released for use in health AI by Google. Read more about HAI-DEF at http://goo.gle/hai-def • 22 items • Updated Jan 12 • 201
InfiniteVGGT: Visual Geometry Grounded Transformer for Endless Streams Paper • 2601.02281 • Published Jan 5 • 33
LTX-2: Efficient Joint Audio-Visual Foundation Model Paper • 2601.03233 • Published Jan 6 • 157
XVLA Collection X-VLA is a soft-prompted Transformer for cross-embodiment robot learning • 6 items • Updated Dec 4, 2025 • 12
Treble10 Collection Treble Technologies and Hugging Face have entered into a long-term collaboration. In celebration, we are releasing the Treble10 dataset. • 3 items • Updated Oct 28, 2025 • 5
Persian Models Collection This is the largest collection of Persian models available on Hugging Face • 739 items • Updated 5 days ago • 17
Persian Datasets Collection This is the largest collection of Persian datasets available on Hugging Face • 130 items • Updated Dec 27, 2025 • 15
NaturalVoices - Voice Conversion Datasets Collection This is a collaborative work of JHU Smile Lab and CMU MSP Lab. Please cite https://arxiv.org/abs/2511.00256 • 5 items • Updated Nov 10, 2025 • 4
Evolving Diagnostic Agents in a Virtual Clinical Environment Paper • 2510.24654 • Published Oct 28, 2025 • 12
POWSM: A Phonetic Open Whisper-Style Speech Foundation Model Paper • 2510.24992 • Published Oct 28, 2025 • 4
OmniX: From Unified Panoramic Generation and Perception to Graphics-Ready 3D Scenes Paper • 2510.26800 • Published Oct 30, 2025 • 22
Kimi Linear: An Expressive, Efficient Attention Architecture Paper • 2510.26692 • Published Oct 30, 2025 • 127
Emu3.5: Native Multimodal Models are World Learners Paper • 2510.26583 • Published Oct 30, 2025 • 112
Emu3.5 Collection Native Multimodal Models are World Learners • 4 items • Updated Feb 4 • 74
The End of Manual Decoding: Towards Truly End-to-End Language Models Paper • 2510.26697 • Published Oct 30, 2025 • 117
ThinkMorph: Emergent Properties in Multimodal Interleaved Chain-of-Thought Reasoning Paper • 2510.27492 • Published Oct 30, 2025 • 86