Gated DeltaNet write-up (#901)

* Gated DeltaNet write-up

* Add copyright and source information to script

Added copyright notice and source information.

* Remove unused import of Path in plot_memory_estimates

* Fix url
Sebastian Raschka
2025-11-02 21:03:42 -06:00
committed by GitHub
parent d6c3990c57
commit c6b8332a59
5 changed files with 460 additions and 0 deletions

@@ -172,6 +172,7 @@ Several folders contain optional materials as a bonus for interested readers:
 - [Grouped-Query Attention](ch04/04_gqa)
 - [Multi-Head Latent Attention](ch04/05_mla)
 - [Sliding Window Attention](ch04/06_swa)
+- [Gated DeltaNet](ch04/08_deltanet)
 - [Mixture-of-Experts (MoE)](ch04/07_moe)
 - **Chapter 5: Pretraining on unlabeled data:**
 - [Alternative Weight Loading Methods](ch05/02_alternative_weight_loading/)