Readability and code quality improvements (#959)

* Consistent dataset naming

* consistent section headers
Sebastian Raschka
2026-02-17 19:44:56 -05:00
committed by GitHub
parent 7b1f740f74
commit be5e2a3331
48 changed files with 419 additions and 297 deletions

View File

@@ -73,6 +73,7 @@
"id": "53fe99ab-0bcf-4778-a6b5-6db81fb826ef",
"metadata": {},
"source": [
" \n",
"## 4.1 Coding an LLM architecture"
]
},
@@ -323,6 +324,7 @@
"id": "f8332a00-98da-4eb4-b882-922776a89917",
"metadata": {},
"source": [
" \n",
"## 4.2 Normalizing activations with layer normalization"
]
},
@@ -606,6 +608,7 @@
"id": "11190e7d-8c29-4115-824a-e03702f9dd54",
"metadata": {},
"source": [
" \n",
"## 4.3 Implementing a feed forward network with GELU activations"
]
},
@@ -789,6 +792,7 @@
"id": "4ffcb905-53c7-4886-87d2-4464c5fecf89",
"metadata": {},
"source": [
" \n",
"## 4.4 Adding shortcut connections"
]
},
@@ -950,6 +954,7 @@
"id": "cae578ca-e564-42cf-8635-a2267047cdff",
"metadata": {},
"source": [
" \n",
"## 4.5 Connecting attention and linear layers in a transformer block"
]
},
@@ -1068,6 +1073,7 @@
"id": "46618527-15ac-4c32-ad85-6cfea83e006e",
"metadata": {},
"source": [
" \n",
"## 4.6 Coding the GPT model"
]
},
@@ -1332,6 +1338,7 @@
"id": "da5d9bc0-95ab-45d4-9378-417628d86e35",
"metadata": {},
"source": [
" \n",
"## 4.7 Generating text"
]
},
@@ -1519,11 +1526,20 @@
"id": "a35278b6-9e5c-480f-83e5-011a1173648f",
"metadata": {},
"source": [
" \n",
"## Summary and takeaways\n",
"\n",
"- See the [./gpt.py](./gpt.py) script, a self-contained script containing the GPT model we implement in this Jupyter notebook\n",
"- You can find the exercise solutions in [./exercise-solutions.ipynb](./exercise-solutions.ipynb)"
]
},
+{
+"cell_type": "code",
+"execution_count": null,
+"id": "4821ac83-ef84-42c4-a327-32bf2820a8e5",
+"metadata": {},
+"outputs": [],
+"source": []
+}
],
"metadata": {

View File

@@ -53,7 +53,8 @@
"id": "5fea8be3-30a1-4623-a6d7-b095c6c1092e",
"metadata": {},
"source": [
"# Exercise 4.1: Parameters in the feed forward versus attention module"
" \n",
"## Exercise 4.1: Parameters in the feed forward versus attention module"
]
},
{
@@ -182,7 +183,8 @@
"id": "0f7b7c7f-0fa1-4d30-ab44-e499edd55b6d",
"metadata": {},
"source": [
"# Exercise 4.2: Initialize larger GPT models"
" \n",
"## Exercise 4.2: Initialize larger GPT models"
]
},
{
@@ -329,7 +331,8 @@
"id": "f5f2306e-5dc8-498e-92ee-70ae7ec37ac1",
"metadata": {},
"source": [
"# Exercise 4.3: Using separate dropout parameters"
" \n",
"## Exercise 4.3: Using separate dropout parameters"
]
},
{
@@ -451,7 +454,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.16"
"version": "3.13.5"
}
},
"nbformat": 4,