TORead
MegaScale: Scaling Large Language Model Training to More Than 10,000 GPUs
Paper
• 2402.15627
• Published
• 36
Rainbow Teaming: Open-Ended Generation of Diverse Adversarial Prompts
Paper
• 2402.16822
• Published
• 17
FuseChat: Knowledge Fusion of Chat Models
Paper
• 2402.16107
• Published
• 39
Multi-LoRA Composition for Image Generation
Paper
• 2402.16843
• Published
• 31
EMO: Emote Portrait Alive - Generating Expressive Portrait Videos with Audio2Video Diffusion Model under Weak Conditions
Paper
• 2402.17485
• Published
• 194
Evaluating Very Long-Term Conversational Memory of LLM Agents
Paper
• 2402.17753
• Published
• 19
The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits
Paper
• 2402.17764
• Published
• 627
BitNet: Scaling 1-bit Transformers for Large Language Models
Paper
• 2310.11453
• Published
• 106
V3D: Video Diffusion Models are Effective 3D Generators
Paper
• 2403.06738
• Published
• 30
Stealing Part of a Production Language Model
Paper
• 2403.06634
• Published
• 91
Algorithmic progress in language models
Paper
• 2403.05812
• Published
• 19
Chronos: Learning the Language of Time Series
Paper
• 2403.07815
• Published
• 48
Motion Mamba: Efficient and Long Sequence Motion Generation with Hierarchical and Bidirectional Selective SSM
Paper
• 2403.07487
• Published
• 16
FDGaussian: Fast Gaussian Splatting from Single Image via Geometric-aware Diffusion Model
Paper
• 2403.10242
• Published
• 11
PERL: Parameter Efficient Reinforcement Learning from Human Feedback
Paper
• 2403.10704
• Published
• 60
E5-V: Universal Embeddings with Multimodal Large Language Models
Paper
• 2407.12580
• Published
• 42
AgentPoison: Red-teaming LLM Agents via Poisoning Memory or Knowledge Bases
Paper
• 2407.12784
• Published
• 51
Spectra: A Comprehensive Study of Ternary, Quantized, and FP16 Language Models
Paper
• 2407.12327
• Published
• 79
PaliGemma: A versatile 3B VLM for transfer
Paper
• 2407.07726
• Published
• 72
LLaVA-NeXT-Interleave: Tackling Multi-image, Video, and 3D in Large Multimodal Models
Paper
• 2407.07895
• Published
• 42
The Mamba in the Llama: Distilling and Accelerating Hybrid Models
Paper
• 2408.15237
• Published
• 42
Diffusion Models Are Real-Time Game Engines
Paper
• 2408.14837
• Published
• 126
Writing in the Margins: Better Inference Pattern for Long Context Retrieval
Paper
• 2408.14906
• Published
• 144
SwiftBrush v2: Make Your One-step Diffusion Model Better Than Its Teacher
Paper
• 2408.14176
• Published
• 62
Building and better understanding vision-language models: insights and future directions
Paper
• 2408.12637
• Published
• 133
LongVILA: Scaling Long-Context Visual Language Models for Long Videos
Paper
• 2408.10188
• Published
• 52
LongWriter: Unleashing 10,000+ Word Generation from Long Context LLMs
Paper
• 2408.07055
• Published
• 68
Gemma Scope: Open Sparse Autoencoders Everywhere All At Once on Gemma 2
Paper
• 2408.05147
• Published
• 41
Transformer Explainer: Interactive Learning of Text-Generative Models
Paper
• 2408.04619
• Published
• 175
LLaVA-OneVision: Easy Visual Task Transfer
Paper
• 2408.03326
• Published
• 61
Language Model Can Listen While Speaking
Paper
• 2408.02622
• Published
• 40
OpenDevin: An Open Platform for AI Software Developers as Generalist Agents
Paper
• 2407.16741
• Published
• 76
WavTokenizer: an Efficient Acoustic Discrete Codec Tokenizer for Audio Language Modeling
Paper
• 2408.16532
• Published
• 50
Law of Vision Representation in MLLMs
Paper
• 2408.16357
• Published
• 95
NVLM: Open Frontier-Class Multimodal LLMs
Paper
• 2409.11402
• Published
• 74