Grouped-Query Attention memory (#874)

* GQA memory

* remove redundant code

* update links

* update
Committed: Sebastian Raschka, 2025-10-11 08:44:19 -05:00 (via GitHub)
parent b8e12e1dd1
commit c814814d72
7 changed files with 1114 additions and 0 deletions

@@ -168,6 +168,7 @@ Several folders contain optional materials as a bonus for interested readers:
- **Chapter 4: Implementing a GPT model from scratch**
- [FLOPS Analysis](ch04/02_performance-analysis/flops-analysis.ipynb)
- [KV Cache](ch04/03_kv-cache)
- [Grouped-Query Attention](ch04/04_gqa)
- **Chapter 5: Pretraining on unlabeled data:**
- [Alternative Weight Loading Methods](ch05/02_alternative_weight_loading/)
- [Pretraining GPT on the Project Gutenberg Dataset](ch05/03_bonus_pretraining_on_gutenberg)
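The commit's subject is the KV-cache memory saving of Grouped-Query Attention: several query heads share one key/value head, so the cache shrinks by the ratio of query heads to KV heads. A minimal sketch of that arithmetic is below; the function and parameter names (`kv_cache_bytes`, `n_kv_heads`, etc.) and the example configuration are illustrative, not taken from the repository's code.

```python
def kv_cache_bytes(context_len, n_layers, n_kv_heads, head_dim, bytes_per_elem=2):
    # Each layer caches one K and one V tensor of shape
    # (context_len, n_kv_heads, head_dim); factor 2 covers K and V.
    return 2 * n_layers * context_len * n_kv_heads * head_dim * bytes_per_elem

# Hypothetical config: 32 query heads, head_dim 128, 32 layers,
# 8192-token context, fp16 cache (2 bytes per element).
n_heads, head_dim, n_layers, ctx = 32, 128, 32, 8192

mha = kv_cache_bytes(ctx, n_layers, n_heads, head_dim)  # MHA: KV heads == query heads
gqa = kv_cache_bytes(ctx, n_layers, 8, head_dim)        # GQA: 8 shared KV heads

print(f"MHA KV cache: {mha / 1e9:.2f} GB")
print(f"GQA KV cache: {gqa / 1e9:.2f} GB ({mha // gqa}x smaller)")
# → MHA KV cache: 4.29 GB
# → GQA KV cache: 1.07 GB (4x smaller)
```

With 32 query heads grouped over 8 KV heads, the cache is exactly 4x smaller; the attention math itself is unchanged apart from each KV head being broadcast to its group of query heads.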