Article: 2. Attention Optimizations: From Standard Attention to FlashAttention
Model: meta-llama/Llama-3.2-11B-Vision (Image-Text-to-Text, 11B parameters, updated Sep 27, 2024)
Paper: Efficient Memory Management for Large Language Model Serving with PagedAttention (arXiv:2309.06180, published Sep 12, 2023)
Paper: SmolDocling: An ultra-compact vision-language model for end-to-end multi-modal document conversion (arXiv:2503.11576, published Mar 14, 2025)
Article: Performant local mixture-of-experts CPU inference with GPU acceleration in llama.cpp
Article: LightOnOCR-2-1B: a lightweight high-performance end-to-end OCR model family
Article: Tokenization in Transformers v5: Simpler, Clearer, and More Modular (published Dec 18, 2025)
Article: Shrinking Giants: The Quantization Mathematics Making LLMs Accessible (published May 3, 2025)
Article: A Gentle Introduction to 8-bit Matrix Multiplication for transformers at scale using transformers, accelerate and bitsandbytes (published Aug 17, 2022)
Space: The Smol Training Playbook 📚 (the secrets to building world-class LLMs)
Space: The Ultra-Scale Playbook 🌌 (the ultimate guide to training LLMs on large GPU clusters)