add more notes and embed figures externally to save space
@@ -49,7 +49,7 @@
 "id": "7d4f11e0-4434-4979-9dee-e1207df0eb01",
 "metadata": {},
 "source": [
-"<img src=\"figures/mental-model.webp\" width=450px>"
+"<img src=\"https://sebastianraschka.com/images/LLMs-from-scratch-images/ch04_compressed/01.webp\" width=\"400px\">"
 ]
 },
 {
@@ -76,7 +76,7 @@
 "id": "5c5213e9-bd1c-437e-aee8-f5e8fb717251",
 "metadata": {},
 "source": [
-"<img src=\"figures/mental-model-2.webp\" width=350px>"
+"<img src=\"https://sebastianraschka.com/images/LLMs-from-scratch-images/ch04_compressed/02.webp\" width=\"400px\">"
 ]
 },
 {
@@ -136,7 +136,7 @@
 "id": "4adce779-857b-4418-9501-12a7f3818d88",
 "metadata": {},
 "source": [
-"<img src=\"figures/chapter-steps.webp\" width=350px>"
+"<img src=\"https://sebastianraschka.com/images/LLMs-from-scratch-images/ch04_compressed/03.webp\" width=\"400px\">"
 ]
 },
 {
@@ -204,7 +204,7 @@
 "id": "9665e8ab-20ca-4100-b9b9-50d9bdee33be",
 "metadata": {},
 "source": [
-"<img src=\"figures/gpt-in-out.webp\" width=350px>"
+"<img src=\"https://sebastianraschka.com/images/LLMs-from-scratch-images/ch04_compressed/04.webp\" width=\"400px\">"
 ]
 },
 {
@@ -294,7 +294,7 @@
 "id": "314ac47a-69cc-4597-beeb-65bed3b5910f",
 "metadata": {},
 "source": [
-"<img src=\"figures/layernorm.webp\" width=350px>"
+"<img src=\"https://sebastianraschka.com/images/LLMs-from-scratch-images/ch04_compressed/05.webp\" width=\"400px\">"
 ]
 },
 {
@@ -380,7 +380,7 @@
 "id": "570db83a-205c-4f6f-b219-1f6195dde1a7",
 "metadata": {},
 "source": [
-"<img src=\"figures/layernorm2.webp\" width=350px>"
+"<img src=\"https://sebastianraschka.com/images/LLMs-from-scratch-images/ch04_compressed/06.webp\" width=\"400px\">"
 ]
 },
 {
@@ -551,7 +551,7 @@
 "id": "e136cfc4-7c89-492e-b120-758c272bca8c",
 "metadata": {},
 "source": [
-"<img src=\"figures/overview-after-ln.webp\" width=350px>"
+"<img src=\"https://sebastianraschka.com/images/LLMs-from-scratch-images/ch04_compressed/07.webp\" width=\"400px\">"
 ]
 },
 {
@@ -696,7 +696,7 @@
 "id": "fdcaacfa-3cfc-4c9e-b668-b71a2753145a",
 "metadata": {},
 "source": [
-"<img src=\"figures/ffn.webp\" width=350px>"
+"<img src=\"https://sebastianraschka.com/images/LLMs-from-scratch-images/ch04_compressed/09.webp\" width=\"400px\">"
 ]
 },
 {
@@ -727,7 +727,15 @@
 "id": "8f8756c5-6b04-443b-93d0-e555a316c377",
 "metadata": {},
 "source": [
-"<img src=\"figures/mental-model-3.webp\" width=350px>"
+"<img src=\"https://sebastianraschka.com/images/LLMs-from-scratch-images/ch04_compressed/10.webp\" width=\"400px\">"
+]
+},
+{
+"cell_type": "markdown",
+"id": "e5da2a50-04f4-4388-af23-ad32e405a972",
+"metadata": {},
+"source": [
+"<img src=\"https://sebastianraschka.com/images/LLMs-from-scratch-images/ch04_compressed/11.webp\" width=\"400px\">"
 ]
 },
 {
@@ -749,7 +757,7 @@
 "- This is achieved by adding the output of one layer to the output of a later layer, usually skipping one or more layers in between\n",
 "- Let's illustrate this idea with a small example network:\n",
 "\n",
-"<img src=\"figures/shortcut-example.webp\" width=350px>"
+"<img src=\"https://sebastianraschka.com/images/LLMs-from-scratch-images/ch04_compressed/12.webp\" width=\"400px\">"
 ]
 },
 {
@@ -957,7 +965,7 @@
 "id": "36b64d16-94a6-4d13-8c85-9494c50478a9",
 "metadata": {},
 "source": [
-"<img src=\"figures/transformer-block.webp\" width=350px>"
+"<img src=\"https://sebastianraschka.com/images/LLMs-from-scratch-images/ch04_compressed/13.webp\" width=\"400px\">"
 ]
 },
 {
@@ -1000,7 +1008,7 @@
 "id": "91f502e4-f3e4-40cb-8268-179eec002394",
 "metadata": {},
 "source": [
-"<img src=\"figures/mental-model-final.webp\" width=350px>"
+"<img src=\"https://sebastianraschka.com/images/LLMs-from-scratch-images/ch04_compressed/15.webp\" width=\"400px\">"
 ]
 },
 {
@@ -1025,7 +1033,7 @@
 "id": "9b7b362d-f8c5-48d2-8ebd-722480ac5073",
 "metadata": {},
 "source": [
-"<img src=\"figures/gpt.webp\" width=350px>"
+"<img src=\"https://sebastianraschka.com/images/LLMs-from-scratch-images/ch04_compressed/14.webp\" width=\"400px\">"
 ]
 },
 {
@@ -1288,7 +1296,7 @@
 "id": "caade12a-fe97-480f-939c-87d24044edff",
 "metadata": {},
 "source": [
-"<img src=\"figures/iterative-gen.webp\" width=350px>"
+"<img src=\"https://sebastianraschka.com/images/LLMs-from-scratch-images/ch04_compressed/16.webp\" width=\"400px\">"
 ]
 },
 {
@@ -1307,7 +1315,7 @@
 "id": "7ee0f32c-c18c-445e-b294-a879de2aa187",
 "metadata": {},
 "source": [
-"<img src=\"figures/generate-text.webp\" width=350px>"
+"<img src=\"https://sebastianraschka.com/images/LLMs-from-scratch-images/ch04_compressed/17.webp\" width=\"600px\">"
 ]
 },
 {
@@ -1353,7 +1361,7 @@
 "source": [
 "- The `generate_text_simple` above implements an iterative process, where it creates one token at a time\n",
 "\n",
-"<img src=\"figures/iterative-generate.webp\" width=350px>"
+"<img src=\"https://sebastianraschka.com/images/LLMs-from-scratch-images/ch04_compressed/18.webp\" width=\"600px\">"
 ]
 },
 {
@@ -1506,7 +1514,7 @@
 "name": "python",
 "nbconvert_exporter": "python",
 "pygments_lexer": "ipython3",
-"version": "3.11.4"
+"version": "3.10.6"
 }
 },
 "nbformat": 4,
[Image previews omitted: 17 binary files shown in their "before" state only, sizes 14 KiB to 36 KiB]