mirror of https://github.com/rasbt/LLMs-from-scratch.git
synced 2026-04-10 12:33:42 +00:00
Gated DeltaNet write-up (#901)
* Gated DeltaNet write-up
* Add copyright and source information to script
  Added copyright notice and source information.
* Remove unused import of Path in plot_memory_estimates
* Fix url
committed by GitHub
parent d6c3990c57
commit c6b8332a59
@@ -172,6 +172,7 @@ Several folders contain optional materials as a bonus for interested readers:
 - [Grouped-Query Attention](ch04/04_gqa)
 - [Multi-Head Latent Attention](ch04/05_mla)
 - [Sliding Window Attention](ch04/06_swa)
+- [Gated DeltaNet](ch04/08_deltanet)
 - [Mixture-of-Experts (MoE)](ch04/07_moe)
 - **Chapter 5: Pretraining on unlabeled data:**
 - [Alternative Weight Loading Methods](ch05/02_alternative_weight_loading/)