sliding window attention (#879)

Sebastian Raschka
2025-10-12 22:13:20 -05:00
committed by GitHub
parent 21f0617ea3
commit 6eb6adfa33
11 changed files with 1456 additions and 1 deletions

@@ -170,6 +170,7 @@ Several folders contain optional materials as a bonus for interested readers:
   - [KV Cache](ch04/03_kv-cache)
   - [Grouped-Query Attention](ch04/04_gqa)
   - [Multi-Head Latent Attention](ch04/05_mla)
+  - [Sliding Window Attention](ch04/06_swa)
 - **Chapter 5: Pretraining on unlabeled data:**
   - [Alternative Weight Loading Methods](ch05/02_alternative_weight_loading/)
   - [Pretraining GPT on the Project Gutenberg Dataset](ch05/03_bonus_pretraining_on_gutenberg)