Reza Sayar PRO

Reza2kn

AI & ML interests

None yet

Recent Activity

liked a model 1 day ago

inspatio/worldfm

liked a dataset 1 day ago

Journey9ni/vstibench

liked a dataset 1 day ago

Journey9ni/SpatialStackData

upvoted a collection about 2 months ago

Health AI Developer Foundations (HAI-DEF)

Groups models released for use in health AI by Google. Read more about HAI-DEF at http://goo.gle/hai-def • 22 items • Updated Jan 12 • 201

upvoted 2 papers about 2 months ago

InfiniteVGGT: Visual Geometry Grounded Transformer for Endless Streams

Paper • 2601.02281 • Published Jan 5 • 33

LTX-2: Efficient Joint Audio-Visual Foundation Model

Paper • 2601.03233 • Published Jan 6 • 157

upvoted 2 collections 3 months ago

sam-audio

9 items • Updated 5 days ago • 129

XVLA

X-VLA is a soft-prompted Transformer for cross-embodiment robot learning • 6 items • Updated Dec 4, 2025 • 12

upvoted 2 articles 3 months ago

Make and publish your Reachy Mini App

Article • Dec 3, 2025 • 40

Curating datasets directly on the Hub

Article • Nov 27, 2025 • 22

upvoted a collection 3 months ago

Treble10

Treble Technologies and Hugging Face have entered into a long-term collaboration. In celebration, we are releasing the Treble10 dataset. • 3 items • Updated Oct 28, 2025 • 5

upvoted a paper 3 months ago

SAM 3D: 3Dfy Anything in Images

Paper • 2511.16624 • Published Nov 20, 2025 • 113

upvoted 3 collections 4 months ago

Persian Models

This is the largest collection of Persian models available on Hugging Face • 739 items • Updated 5 days ago • 17

Persian Datasets

This is the largest collection of Persian datasets available on Hugging Face • 130 items • Updated Dec 27, 2025 • 15

NaturalVoices - Voice Conversion Datasets

This is a collaborative work of JHU Smile Lab and CMU MSP Lab. Please cite https://arxiv.org/abs/2511.00256 • 5 items • Updated Nov 10, 2025 • 4

upvoted 5 papers 4 months ago

Evolving Diagnostic Agents in a Virtual Clinical Environment

Paper • 2510.24654 • Published Oct 28, 2025 • 12

POWSM: A Phonetic Open Whisper-Style Speech Foundation Model

Paper • 2510.24992 • Published Oct 28, 2025 • 4

OmniX: From Unified Panoramic Generation and Perception to Graphics-Ready 3D Scenes

Paper • 2510.26800 • Published Oct 30, 2025 • 22

Kimi Linear: An Expressive, Efficient Attention Architecture

Paper • 2510.26692 • Published Oct 30, 2025 • 127

Emu3.5: Native Multimodal Models are World Learners

Paper • 2510.26583 • Published Oct 30, 2025 • 112

upvoted a collection 4 months ago

Emu3.5

Native Multimodal Models are World Learners 🌍 • 4 items • Updated Feb 4 • 74

upvoted 2 papers 4 months ago

The End of Manual Decoding: Towards Truly End-to-End Language Models

Paper • 2510.26697 • Published Oct 30, 2025 • 117

ThinkMorph: Emergent Properties in Multimodal Interleaved Chain-of-Thought Reasoning

Paper • 2510.27492 • Published Oct 30, 2025 • 86