Mirror of https://github.com/rasbt/LLMs-from-scratch.git (synced 2026-04-10 12:33:42 +00:00)
Qwen3 Coder Flash & MoE from Scratch (#760)
* Qwen3 Coder Flash & MoE from Scratch
* update
* refinements
* updates
* update
* update
* update
Committed by GitHub
Parent: 145322ded8
Commit: f92b40e4ab
@@ -158,7 +158,7 @@ Several folders contain optional materials as a bonus for interested readers:
 - [Building a User Interface to Interact With the Pretrained LLM](ch05/06_user_interface)
 - [Converting GPT to Llama](ch05/07_gpt_to_llama)
 - [Llama 3.2 From Scratch](ch05/07_gpt_to_llama/standalone-llama32.ipynb)
-- [Qwen3 From Scratch](ch05/11_qwen3/standalone-qwen3.ipynb)
+- [Qwen3 Dense and Mixture-of-Experts (MoE) From Scratch](ch05/11_qwen3/)
 - [Memory-efficient Model Weight Loading](ch05/08_memory_efficient_weight_loading/memory-efficient-state-dict.ipynb)
 - [Extending the Tiktoken BPE Tokenizer with New Tokens](ch05/09_extending-tokenizers/extend-tiktoken.ipynb)
 - [PyTorch Performance Tips for Faster LLM Training](ch05/10_llm-training-speed)