Paper: [BitDistiller: Unleashing the Potential of Sub-4-Bit LLMs via Self-Distillation](https://arxiv.org/abs/2402.10631)
QA results are accuracies in % (± standard error); PPL is the quantized model's perplexity; QA Avg is the mean of the five evaluated QA tasks (mmlu was not evaluated).

| PPL | arc_easy | arc_challenge | piqa | winogrande | hellaswag | mmlu | QA Avg |
|---|---|---|---|---|---|---|---|
| 16.73 | 37.84 ± 1.00 | 21.50 ± 1.20 | 61.43 ± 1.14 | 49.88 ± 1.41 | 33.52 ± 0.47 | - | 40.83 |
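
The task names and ± standard errors match the output format of EleutherAI's lm-evaluation-harness, so the scores were presumably produced with it. A minimal reproduction sketch, assuming the 0.4.x Python API; the model repository id is a placeholder, not this model's actual Hub path:

```python
import lm_eval  # EleutherAI lm-evaluation-harness (0.4.x)

# Placeholder Hub path -- substitute this model's actual repository id.
results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=your-username/tinyllama-v1.1-bitdistiller,dtype=float16",
    tasks=["arc_easy", "arc_challenge", "piqa", "winogrande", "hellaswag"],
)

# Result keys follow the harness convention "<metric>,<filter>".
for task, metrics in results["results"].items():
    print(task, metrics.get("acc,none"))
```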
Training method: quantization-aware training with self-distillation, following the BitDistiller paper; a sketch of its confidence-aware KL objective is given below.

Base model: TinyLlama/TinyLlama_v1.1
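
BitDistiller's distillation objective is a Confidence-Aware KL Divergence (CAKLD) that blends forward and reverse KL between the full-precision teacher and the quantized student; the mixing coefficient gamma is the teacher's average probability on the ground-truth tokens, estimated once before training. A minimal PyTorch sketch under my reading of the paper (function name and shapes are illustrative, not the authors' code):

```python
import torch
import torch.nn.functional as F

def cakld_loss(student_logits: torch.Tensor,
               teacher_logits: torch.Tensor,
               gamma: float) -> torch.Tensor:
    """Confidence-aware KL: gamma * KL(teacher || student)
    + (1 - gamma) * KL(student || teacher), averaged over tokens.

    Logits have shape (batch, seq_len, vocab); gamma in [0, 1] is the
    teacher's precomputed mean confidence on ground-truth tokens.
    """
    s_logp = F.log_softmax(student_logits, dim=-1)
    t_logp = F.log_softmax(teacher_logits, dim=-1)
    # Forward KL (mean-seeking): teacher distribution as reference.
    forward_kl = (t_logp.exp() * (t_logp - s_logp)).sum(-1).mean()
    # Reverse KL (mode-seeking): student distribution as reference.
    reverse_kl = (s_logp.exp() * (s_logp - t_logp)).sum(-1).mean()
    return gamma * forward_kl + (1.0 - gamma) * reverse_kl
```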
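
For completeness, a loading sketch with transformers; the repository id below is a hypothetical placeholder, since the card does not state this model's Hub path:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Hypothetical repo id -- replace with this model's actual Hub path.
model_id = "your-username/tinyllama-v1.1-bitdistiller"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.float16)

prompt = "The capital of France is"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```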