June 2025 - Open works from the Chinese community Collection 30 items • Updated 25 days ago • 7
Native Sparse Attention: Hardware-Aligned and Natively Trainable Sparse Attention Paper • 2502.11089 • Published Feb 16, 2025 • 167
Retentive Network: A Successor to Transformer for Large Language Models Paper • 2307.08621 • Published Jul 17, 2023 • 172