Peter Szemraj PRO

pszemraj

https://pszemraj.carrd.co/

AI & ML interests

metallic intuition

Recent Activity

liked a model about 5 hours ago

ubergarm/Qwen3.5-397B-A17B-GGUF

upvoted a paper about 6 hours ago

On Data Engineering for Scaling LLM Terminal Capabilities

Paper • 2602.21193 • Published 1 day ago • 75

upvoted a paper about 11 hours ago

On the "Induction Bias" in Sequence Models

Paper • 2602.18333 • Published 6 days ago • 3

upvoted a collection about 11 hours ago

Nemotron-Terminal

Collection

We are releasing Nemotron-Terminal models and training datasets. • 7 items • Updated 1 day ago • 18

upvoted a paper 1 day ago

Agents of Chaos

Paper • 2602.20021 • Published 3 days ago • 25

upvoted 3 papers 2 days ago

upvoted a collection 14 days ago

Health AI Developer Foundations (HAI-DEF)

Collection

Groups models released for use in health AI by Google. Read more about HAI-DEF at http://goo.gle/hai-def • 22 items • Updated Jan 12 • 198

upvoted a paper 16 days ago

Judging What We Cannot Solve: A Consequence-Based Approach for Oracle-Free Evaluation of Research-Level Math

Paper • 2602.06291 • Published 20 days ago • 23

upvoted a paper 17 days ago

Revisiting the Shape Convention of Transformer Language Models

Paper • 2602.06471 • Published 20 days ago • 4

upvoted 2 papers 20 days ago

Length-Unbiased Sequence Policy Optimization: Revealing and Controlling Response Length Variation in RLVR

Paper • 2602.05261 • Published 21 days ago • 49

Horizon-LM: A RAM-Centric Architecture for LLM Training

Paper • 2602.04816 • Published 22 days ago • 17

upvoted a paper 22 days ago

Quartet II: Accurate LLM Pre-Training in NVFP4 by Improved Unbiased Gradient Estimation

Paper • 2601.22813 • Published 27 days ago • 57

upvoted 2 papers 23 days ago

Linear representations in language models can change dramatically over a conversation

Paper • 2601.20834 • Published 29 days ago • 21

Do Reasoning Models Enhance Embedding Models?

Paper • 2601.21192 • Published 28 days ago • 25

upvoted a paper 27 days ago

Scaling Embeddings Outperforms Scaling Experts in Language Models

Paper • 2601.21204 • Published 28 days ago • 100

upvoted 3 papers 28 days ago

CGPT: Cluster-Guided Partial Tables with LLM-Generated Supervision for Table Retrieval

Paper • 2601.15849 • Published Jan 22 • 14

Teaching Models to Teach Themselves: Reasoning at the Edge of Learnability

Paper • 2601.18778 • Published about 1 month ago • 40

Post-LayerNorm Is Back: Stable, ExpressivE, and Deep

Paper • 2601.19895 • Published 30 days ago • 23

upvoted an article 29 days ago

Article

Introducing Waypoint-1: Real-time interactive video diffusion from Overworld

Jan 20
