Small Language Models for Kazakh: models, tokenizers, and datasets for Kazakh language modeling.
Saken Tukenov PRO
stukenov
Recent Activity
liked a model 6 days ago: facebook/mms-tts-kaz
updated a model 9 days ago: stukenov/sozkz-llama-150m-200k-kk-base-v1
liked a dataset 10 days ago: HuggingFaceTB/smollm-corpus
Organizations
spaces 4
pinned
Kaz Offline LLM Arena 🥇 • Runtime error • Kaz Offline Arena
SozKZ Kazakh LLM 💬 • Running on Zero • Generate Kazakh responses to your questions
Kaz LLM Leaderboard 🏆 • Runtime error • Evaluate LLMs using Kazakh MC tasks
Sozkz Paper Small Language Models Kazakh 📈 • Running • Generate Kazakh text with small open‑source language models
models 26
stukenov/sozkz-llama-150m-200k-kk-base-v1 • Text Generation • 0.2B • 19
stukenov/kzcalm-baseline-v1 • 4
stukenov/kzcalm-sp-tokenizer-4k-kk-v1
stukenov/sozkz-core-gpt2-200k-kk-base-v1
stukenov/sozkz-core-llama-150m-kk-instruct-v2 • Text Generation • 0.2B • 87
stukenov/sozkz-core-llama-150m-kk-instruct-v1 • Text Generation • 0.2B • 30
stukenov/sozkz-core-llama-150m-kk-base-v1 • Text Generation • 0.2B • 94
stukenov/sozkz-core-llama-50m-kk-base-v4 • 60.8M • 11
stukenov/sozkz-fix-mt5-50m-kk-morph-v1 • Text Generation • 50.6M • 6
stukenov/sozkz-core-gpt2-60m-kk-base-v1
datasets 31
stukenov/kzcalm-mimi-codes-kk-v1 • 232k • 50
stukenov/sozkz-corpus-tokenized-enkk-200k-v1 • 10.4M • 38
stukenov/kzcalm-mimi-codes-kk-v1-test • 2 • 11
stukenov/kzcalm-tts-kk-v1 • 232k • 294
stukenov/sozkz-corpus-tokenized-kk-multidomain-200k-v1 • 1.18M • 38
stukenov/sozkz-corpus-tokenized-kk-200k-v1 • 422k • 34
stukenov/sozkz-corpus-tokenized-enkk-fineweb-edu-v1 • 9.02M • 26
stukenov/kaz-llm-lb-metainfo • 13 • 27
stukenov/s-openbench-eval • 5 • 205
stukenov/offline-data-results • 46