JaheimLee

AI & ML interests

None yet

Recent Activity

liked a model about 1 month ago

RESMP-DEV/Qwen3-Next-80B-A3B-Thinking-NVFP4

liked a Space about 2 months ago

HuggingFaceTB/smol-training-playbook

liked a Space 2 months ago

HuggingFaceFW/blogpost-fineweb-v1

liked a model about 1 month ago

RESMP-DEV/Qwen3-Next-80B-A3B-Thinking-NVFP4

Text Generation • Updated Oct 11, 2025 • 569 • 9

liked a Space about 2 months ago

The Smol Training Playbook

📚

2.77k

The secrets to building world-class LLMs

liked 2 Spaces 2 months ago

FineWeb: decanting the web for the finest text data at scale

🍷

1.25k

Generate high-quality text data for LLMs using FineWeb

The Ultra-Scale Playbook

🌌

3.62k

The ultimate guide to training LLM on large GPU Clusters

New activity in DevQuasar/Qwen.Qwen3-Next-80B-A3B-Instruct-FP8 4 months ago

VLLM compatibility?

#1 opened 4 months ago by

aidendle94

liked a model 4 months ago

Qwen/Qwen3-Next-80B-A3B-Instruct

Text Generation • 81B • Updated Sep 17, 2025 • 3.56M • 933

New activity in cyankiwi/GLM-4.5-Air-AWQ-4bit 5 months ago

Does this actually work with VLLM?

#1 opened 5 months ago by

sirus

liked a model 5 months ago

Multiverse4FM/Multiverse-32B

Text Generation • 33B • Updated Jun 13, 2025 • 94 • 10

liked a model 6 months ago

tencent/Hunyuan-A13B-Instruct-GPTQ-Int4

Text Generation • 80B • Updated Jul 11, 2025 • 138 • 49

liked a model 7 months ago

Tongyi-Zhiwen/QwenLong-L1-32B-AWQ

33B • Updated May 29, 2025 • 39 • 10

New activity in Qwen/Qwen3-32B-FP8 8 months ago

Is this a QAT model?

#2 opened 8 months ago by

Downtown-Case

liked a model 8 months ago

RedHatAI/Qwen3-32B-FP8-dynamic

Text Generation • 33B • Updated May 13, 2025 • 5.29k • 15

New activity in Qwen/Qwen3-30B-A3B 8 months ago

Minor changes, big gains -- Huggingface MoE modeling enhancement

👍 👀 5

#13 opened 8 months ago by

xiaowei4ai

liked 4 models 9 months ago

liked a model 10 months ago

tencent/HunyuanVideo-I2V

Image-to-Video • Updated Mar 13, 2025 • 250 • 345

liked a dataset 11 months ago

Congliu/Chinese-DeepSeek-R1-Distill-data-110k-SFT

Viewer • Updated Feb 19, 2025 • 110k • 607 • 215

New activity in neavo/modern_bert_multilingual 11 months ago

NER task

#1 opened 11 months ago by

JaheimLee
