Saeedreza Zouashkiani

saeedzou

AI & ML interests

Speech Processing (ASR, TTS, VC), Reinforcement Learning (Wireless Systems Optimization)

Recent Activity

updated a dataset 29 days ago

saeedzou/youtube-fa-cc-dataset

Updated 29 days ago • 98

commented on LLM based Audio models about 1 month ago

Amazing article, continue the great work!

upvoted an article about 1 month ago

Article

LLM based Audio models

Dec 18, 2025

upvoted a paper 3 months ago

MSR-Codec: A Low-Bitrate Multi-Stream Residual Codec for High-Fidelity Speech Generation with Information Disentanglement

Paper • 2509.13068 • Published Sep 16, 2025 • 1

liked 2 models 3 months ago

espnet/powsm

Automatic Speech Recognition • Updated 8 days ago • 111 • 8

nvidia/parakeet-rnnt-110m-da-dk

Automatic Speech Recognition • Updated Oct 19, 2025 • 244 • 15

upvoted an article 4 months ago

Article

From Llasa to Llasagna 🍕: Finetuning LLaSA to generates Italian speech and other languages

Feb 11, 2025

commented on From Llasa to Llasagna 🍕: Finetuning LLaSA to generates Italian speech and other languages 4 months ago

Hi!
This is great work. I saw you mentioned that the fine-tuning used 2.5M samples, around 7k hours in total.
I know that more data might be better, but would it work with around 600 hours in another language?
In addition, can you tell us about the limitations of the fine-tuned and original models?
Do they have hallucination problems like those encountered in other TTS models such as F5-TTS?