On Data Engineering for Scaling LLM Terminal Capabilities Paper • 2602.21193 • Published 1 day ago • 75
Nemotron-Terminal Collection We are releasing Nemotron-Terminal models and training datasets. • 7 items • Updated 1 day ago • 18
Revisiting the Platonic Representation Hypothesis: An Aristotelian View Paper • 2602.14486 • Published 10 days ago • 11
Does Socialization Emerge in AI Agent Society? A Case Study of Moltbook Paper • 2602.14299 • Published 11 days ago • 26
Health AI Developer Foundations (HAI-DEF) Collection Groups models released for use in health AI by Google. Read more about HAI-DEF at http://goo.gle/hai-def • 22 items • Updated Jan 12 • 198
Judging What We Cannot Solve: A Consequence-Based Approach for Oracle-Free Evaluation of Research-Level Math Paper • 2602.06291 • Published 20 days ago • 23
Revisiting the Shape Convention of Transformer Language Models Paper • 2602.06471 • Published 20 days ago • 4
Length-Unbiased Sequence Policy Optimization: Revealing and Controlling Response Length Variation in RLVR Paper • 2602.05261 • Published 21 days ago • 49
Horizon-LM: A RAM-Centric Architecture for LLM Training Paper • 2602.04816 • Published 22 days ago • 17
Quartet II: Accurate LLM Pre-Training in NVFP4 by Improved Unbiased Gradient Estimation Paper • 2601.22813 • Published 27 days ago • 57
Linear representations in language models can change dramatically over a conversation Paper • 2601.20834 • Published 29 days ago • 21
Scaling Embeddings Outperforms Scaling Experts in Language Models Paper • 2601.21204 • Published 28 days ago • 100
CGPT: Cluster-Guided Partial Tables with LLM-Generated Supervision for Table Retrieval Paper • 2601.15849 • Published Jan 22 • 14
Teaching Models to Teach Themselves: Reasoning at the Edge of Learnability Paper • 2601.18778 • Published about 1 month ago • 40
Introducing Waypoint-1: Real-time interactive video diffusion from Overworld Article • Jan 20 • 40