- TokDrift: When LLM Speaks in Subwords but Code Speaks in Grammar
  Paper • 2510.14972 • Published • 34
- LightMem: Lightweight and Efficient Memory-Augmented Generation
  Paper • 2510.18866 • Published • 110
- Every Attention Matters: An Efficient Hybrid Architecture for Long-Context Reasoning
  Paper • 2510.19338 • Published • 114
- The Smol Training Playbook
  2.73k • The secrets to building world-class LLMs
Jonatan Borkowski PRO
j14i
AI & ML interests
None yet
Recent Activity
reacted
to
sergiopaniego's
post
with β€οΈ
12 minutes ago
This super detailed tutorial by @Paulescu is pure gold 💪: "Fine-tuning a Small Language Model for browser control with GRPO and OpenEnv"
LFM2-350M (@LiquidAI) + BrowserGym (OpenEnv) + GRPO (TRL) for learning browser control 🤖
https://paulabartabajo.substack.com/p/fine-tuning-lfm2-350m-for-browser
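For context, here is a minimal sketch of what that GRPO setup might look like with TRL's GRPOTrainer. The prompts and the reward function below are placeholders I made up: the actual tutorial scores rollouts from a BrowserGym environment served through OpenEnv.

```python
# Minimal GRPO sketch with TRL, assuming the setup described in the tutorial.
from datasets import Dataset
from trl import GRPOConfig, GRPOTrainer

# Toy stand-ins for browser-control tasks (assumption, not the tutorial's data).
train_dataset = Dataset.from_dict(
    {"prompt": ["Click the search button.", "Fill in the login form."]}
)

def reward_looks_like_action(completions, **kwargs):
    # Placeholder reward: +1 if the completion resembles a function-style
    # action call; the tutorial instead uses BrowserGym episode outcomes.
    return [1.0 if "(" in c and ")" in c else 0.0 for c in completions]

trainer = GRPOTrainer(
    model="LiquidAI/LFM2-350M",            # model ID from the post
    reward_funcs=reward_looks_like_action,
    args=GRPOConfig(output_dir="lfm2-browser-grpo", num_generations=4),
    train_dataset=train_dataset,
)
trainer.train()
```

GRPO samples `num_generations` completions per prompt and reinforces those that score above the group average, which is why a scalar reward per completion is all the trainer needs.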
liked a Space • 1 day ago
hysts/daily-papers
reacted to sergiopaniego's post • 1 day ago
Google DeepMind releases FunctionGemma, a 240M model specialized in 🔧 tool calling, built for fine-tuning
TRL has day-0 support. To celebrate, we're sharing 2 new resources:
> Colab guide to fine-tune it for browser control with BrowserGym OpenEnv
> Standalone training script
> Colab notebook: https://colab.research.google.com/github/huggingface/trl/blob/main/examples/notebooks/grpo_functiongemma_browsergym_openenv.ipynb
> Training script: https://github.com/huggingface/trl/blob/main/examples/scripts/openenv/browsergym_llm.py (the command to run it is included inside the script)
> More notebooks in TRL: https://huggingface.co/docs/trl/example_overview#notebooks
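Since the model's selling point is tool calling, here is a hedged sketch of plain tool-calling inference through transformers' chat-template tools support; the model ID below is a guess, so check the post's links for the real checkpoint name.

```python
# Hedged sketch: tool-calling inference via transformers chat templates.
# The model ID is an assumption; use the checkpoint named in the post's links.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "google/functiongemma-240m"  # hypothetical ID

def get_weather(city: str) -> str:
    """Gets the current weather for a city.

    Args:
        city: Name of the city to look up.
    """
    return "sunny"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

messages = [{"role": "user", "content": "What's the weather in Paris?"}]
# transformers turns the function's signature and docstring into a JSON
# schema and renders it into the prompt via the model's chat template.
inputs = tokenizer.apply_chat_template(
    messages, tools=[get_weather], add_generation_prompt=True, return_tensors="pt"
)
output = model.generate(inputs, max_new_tokens=64)
print(tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True))
```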