Mirror of https://github.com/rasbt/LLMs-from-scratch.git (synced 2026-04-10 12:33:42 +00:00)
Reduce Llama 3 RoPE memory requirements (#658)
* Llama3 from scratch improvements
* Fix Llama 3 expensive RoPE memory issue
* updates
* update package
* benchmark
* remove unused rescale_theta
This commit is contained in:
committed by GitHub
parent c278745aff
commit c4cde1c21b
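The commit message mentions fixing an "expensive RoPE memory issue". A common way to reduce RoPE memory in a Llama-style model is to precompute the cos/sin rotation tables once and share them across all transformer blocks instead of storing a copy per layer. The sketch below illustrates that idea only; the function name `precompute_rope_params` and all parameters are hypothetical and not taken from the repository's code.

```python
import numpy as np

def precompute_rope_params(head_dim, theta_base=10_000, context_length=8192):
    # Hypothetical helper: build the RoPE cos/sin tables ONCE.
    # Inverse frequencies for each pair of dimensions.
    inv_freq = 1.0 / (theta_base ** (np.arange(0, head_dim, 2) / head_dim))
    positions = np.arange(context_length)
    angles = np.outer(positions, inv_freq)             # (context_length, head_dim/2)
    angles = np.concatenate([angles, angles], axis=1)  # (context_length, head_dim)
    return np.cos(angles), np.sin(angles)

cos, sin = precompute_rope_params(head_dim=64, context_length=8192)

# Sharing one (cos, sin) pair across all blocks keeps RoPE memory constant
# in depth; registering a separate copy per layer multiplies it by n_layers.
n_layers = 32
shared = [(cos, sin)] * n_layers  # every layer references the same buffers
assert shared[0][0] is shared[n_layers - 1][0]
```

With 32 layers, a `head_dim` of 64, and an 8192-token context, per-layer float32 copies of both tables would cost roughly 32 × 2 × 8192 × 64 × 4 bytes (about 134 MB), versus about 4 MB for the shared pair.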
.gitignore (vendored): 3 additions
@@ -51,6 +51,9 @@ ch05/07_gpt_to_llama/Llama-3.2-3B-Instruct
ch05/10_llm-training-speed/middlemarch.txt
ch05/10_llm-training-speed/loss.pdf
ch05/10_llm-training-speed/model.pth
ch05/07_gpt_to_llama/Untitled.ipynb
ch05/07_gpt_to_llama/llama3.2-1B-instruct.pth
ch05/07_gpt_to_llama/tokenizer.model

ch06/01_main-chapter-code/gpt2
ch06/02_bonus_additional-experiments/gpt2
