Mirror of https://github.com/rasbt/LLMs-from-scratch.git, synced 2026-04-10 12:33:42 +00:00
Add alternative weight loading strategy as backup (#82)
Commit 4582995ced (parent 820d5e3ed1), committed via GitHub.
ch05/01_main-chapter-code/README.md (new file, 7 lines)

@@ -0,0 +1,7 @@
# Chapter 5: Pretraining on Unlabeled Data
- [ch05.ipynb](ch05.ipynb) contains all the code as it appears in the chapter
- [previous_chapters.py](previous_chapters.py) is a Python module that contains the `MultiHeadAttention` module from the previous chapter, which we import in [ch05.ipynb](ch05.ipynb) to pretrain the GPT model
- [train.py](train.py) is a standalone Python script file with the code that we implemented in [ch05.ipynb](ch05.ipynb) to train the GPT model
- [generate.py](generate.py) is a standalone Python script file with the code that we implemented in [ch05.ipynb](ch05.ipynb) to load and use the pretrained model weights from OpenAI
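
The weight loading mentioned in the last bullet (and the alternative strategy this commit adds as a backup) ultimately comes down to copying pretrained checkpoint arrays into the model's own parameters, with a shape check so that architecture mismatches fail loudly. A minimal sketch of such a helper; the name `assign` and the toy data are illustrative assumptions, not the repository's actual implementation:

```python
import torch

def assign(param, array):
    # Copy a checkpoint array into a model parameter, verifying shapes first
    # so that an architecture mismatch raises instead of silently corrupting
    # the model.
    tensor = torch.as_tensor(array, dtype=torch.float32)
    if param.shape != tensor.shape:
        raise ValueError(f"Shape mismatch: {tuple(param.shape)} vs {tuple(tensor.shape)}")
    return torch.nn.Parameter(tensor)

# Toy example: copy a 2x3 "checkpoint" array into an embedding-like parameter
checkpoint_array = [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]]
param = torch.nn.Parameter(torch.zeros(2, 3))
param = assign(param, checkpoint_array)
print(tuple(param.shape))  # (2, 3)
```

The same pattern applies whether the arrays come from OpenAI's original TensorFlow checkpoints or from a backup source: iterate over the checkpoint's tensors and `assign` each one to the matching parameter of the GPT model.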