Readability and code quality improvements (#959)

* Consistent dataset naming

* Consistent section headers
Author: Sebastian Raschka
Date: 2026-02-17 19:44:56 -05:00
Committed by: GitHub
parent 7b1f740f74
commit be5e2a3331
48 changed files with 419 additions and 297 deletions

View File

@@ -85,6 +85,7 @@
"id": "ecc4dcee-34ea-4c05-9085-2f8887f70363",
"metadata": {},
"source": [
" \n",
"## 3.1 The problem with modeling long sequences"
]
},
@@ -127,6 +128,7 @@
"id": "3602c585-b87a-41c7-a324-c5e8298849df",
"metadata": {},
"source": [
" \n",
"## 3.2 Capturing data dependencies with attention mechanisms"
]
},
@@ -168,6 +170,7 @@
"id": "5efe05ff-b441-408e-8d66-cde4eb3397e3",
"metadata": {},
"source": [
" \n",
"## 3.3 Attending to different parts of the input with self-attention"
]
},
@@ -176,6 +179,7 @@
"id": "6d9af516-7c37-4400-ab53-34936d5495a9",
"metadata": {},
"source": [
" \n",
"### 3.3.1 A simple self-attention mechanism without trainable weights"
]
},
@@ -216,7 +220,7 @@
"id": "ff856c58-8382-44c7-827f-798040e6e697",
"metadata": {},
"source": [
"- By convention, the unnormalized attention weights are referred to as **\"attention scores\"** whereas the normalized attention scores, which sum to 1, are referred to as **\"attention weights\"**\n"
"- By convention, the unnormalized attention weights are referred to as **\"attention scores\"** whereas the normalized attention scores, which sum to 1, are referred to as **\"attention weights\"**"
]
},
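The bullet edited in the hunk above pins down a naming convention: the unnormalized dot products are "attention scores", while their normalized counterparts, which sum to 1, are "attention weights". A minimal plain-Python sketch of that convention (the score values here are made up for illustration, not taken from the notebook):

```python
import math

def softmax(scores):
    # Subtract the max score for numerical stability before exponentiating.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical unnormalized attention scores for one query token.
attn_scores = [0.95, 1.50, 1.48, 0.84, 0.71, 1.09]

# Softmax turns the scores into attention weights that sum to 1.
attn_weights = softmax(attn_scores)
print(round(sum(attn_weights), 6))  # 1.0
```

Softmax preserves the ordering of the scores, so the token with the largest score also receives the largest weight; only the scale changes so the weights form a valid distribution.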
{
@@ -503,6 +507,7 @@
"id": "5a454262-40eb-430e-9ca4-e43fb8d6cd89",
"metadata": {},
"source": [
" \n",
"### 3.3.2 Computing attention weights for all input tokens"
]
},
@@ -739,6 +744,7 @@
"id": "a303b6fb-9f7e-42bb-9fdb-2adabf0a6525",
"metadata": {},
"source": [
" \n",
"## 3.4 Implementing self-attention with trainable weights"
]
},
@@ -763,6 +769,7 @@
"id": "2b90a77e-d746-4704-9354-1ddad86e6298",
"metadata": {},
"source": [
" \n",
"### 3.4.1 Computing the attention weights step by step"
]
},
@@ -1046,6 +1053,7 @@
"id": "9d7b2907-e448-473e-b46c-77735a7281d8",
"metadata": {},
"source": [
" \n",
"### 3.4.2 Implementing a compact SelfAttention class"
]
},
@@ -1179,6 +1187,7 @@
"id": "c5025b37-0f2c-4a67-a7cb-1286af7026ab",
"metadata": {},
"source": [
" \n",
"## 3.5 Hiding future words with causal attention"
]
},
@@ -1203,6 +1212,7 @@
"id": "82f405de-cd86-4e72-8f3c-9ea0354946ba",
"metadata": {},
"source": [
" \n",
"### 3.5.1 Applying a causal attention mask"
]
},
@@ -1455,6 +1465,7 @@
"id": "7636fc5f-6bc6-461e-ac6a-99ec8e3c0912",
"metadata": {},
"source": [
" \n",
"### 3.5.2 Masking additional attention weights with dropout"
]
},
@@ -1554,6 +1565,7 @@
"id": "cdc14639-5f0f-4840-aa9d-8eb36ea90fb7",
"metadata": {},
"source": [
" \n",
"### 3.5.3 Implementing a compact causal self-attention class"
]
},
@@ -1679,6 +1691,7 @@
"id": "c8bef90f-cfd4-4289-b0e8-6a00dc9be44c",
"metadata": {},
"source": [
" \n",
"## 3.6 Extending single-head attention to multi-head attention"
]
},
@@ -1687,6 +1700,7 @@
"id": "11697757-9198-4a1c-9cee-f450d8bbd3b9",
"metadata": {},
"source": [
" \n",
"### 3.6.1 Stacking multiple single-head attention layers"
]
},
@@ -1776,6 +1790,7 @@
"id": "6836b5da-ef82-4b4c-bda1-72a462e48d4e",
"metadata": {},
"source": [
" \n",
"### 3.6.2 Implementing multi-head attention with weight splits"
]
},
@@ -2032,7 +2047,8 @@
"id": "dec671bf-7938-4304-ad1e-75d9920e7f43",
"metadata": {},
"source": [
"# Summary and takeaways"
" \n",
"## Summary and takeaways"
]
},
{
@@ -2061,7 +2077,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.16"
"version": "3.13.5"
}
},
"nbformat": 4,

View File

@@ -54,7 +54,8 @@
"id": "33dfa199-9aee-41d4-a64b-7e3811b9a616",
"metadata": {},
"source": [
"# Exercise 3.1"
" \n",
"## Exercise 3.1"
]
},
{
@@ -209,7 +210,8 @@
"id": "33543edb-46b5-4b01-8704-f7f101230544",
"metadata": {},
"source": [
"# Exercise 3.2"
" \n",
"## Exercise 3.2"
]
},
{
@@ -266,7 +268,8 @@
"id": "92bdabcb-06cf-4576-b810-d883bbd313ba",
"metadata": {},
"source": [
"# Exercise 3.3"
" \n",
"## Exercise 3.3"
]
},
{
@@ -339,7 +342,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.16"
"version": "3.13.5"
}
},
"nbformat": 4,