Revisiting Replay and Gradient Alignment for Continual Pre-Training of Large Language Models • Paper • 2508.01908 • Published Aug 3, 2025
Simple and Scalable Strategies to Continually Pre-train Large Language Models • Paper • 2403.08763 • Published Mar 13, 2024