Use figure numbers in ch05-7 (#881)

Sebastian Raschka
2025-10-13 16:26:35 -05:00
committed by GitHub
parent bf039ff3dc
commit b969b3ef7a
3 changed files with 54 additions and 55 deletions


@@ -75,7 +75,7 @@
"id": "efd27fcc-2886-47cb-b544-046c2c31f02a", "id": "efd27fcc-2886-47cb-b544-046c2c31f02a",
"metadata": {}, "metadata": {},
"source": [ "source": [
"<img src=\"https://sebastianraschka.com/images/LLMs-from-scratch-images/ch05_compressed/chapter-overview.webp\" width=500px>" "<img src=\"https://sebastianraschka.com/images/LLMs-from-scratch-images/ch05_compressed/01.webp\" width=500px>"
] ]
}, },
{ {
@@ -91,7 +91,7 @@
"id": "f67711d4-8391-4fee-aeef-07ea53dd5841", "id": "f67711d4-8391-4fee-aeef-07ea53dd5841",
"metadata": {}, "metadata": {},
"source": [ "source": [
"<img src=\"https://sebastianraschka.com/images/LLMs-from-scratch-images/ch05_compressed/mental-model--0.webp\" width=400px>" "<img src=\"https://sebastianraschka.com/images/LLMs-from-scratch-images/ch05_compressed/02.webp\" width=400px>"
] ]
}, },
{ {
@@ -195,7 +195,7 @@
"id": "741881f3-cee0-49ad-b11d-b9df3b3ac234", "id": "741881f3-cee0-49ad-b11d-b9df3b3ac234",
"metadata": {}, "metadata": {},
"source": [ "source": [
"<img src=\"https://sebastianraschka.com/images/LLMs-from-scratch-images/ch05_compressed/gpt-process.webp\" width=500px>" "<img src=\"https://sebastianraschka.com/images/LLMs-from-scratch-images/ch05_compressed/03.webp\" width=500px>"
] ]
}, },
{ {
@@ -346,7 +346,7 @@
"id": "384d86a9-0013-476c-bb6b-274fd5f20b29", "id": "384d86a9-0013-476c-bb6b-274fd5f20b29",
"metadata": {}, "metadata": {},
"source": [ "source": [
"<img src=\"https://sebastianraschka.com/images/LLMs-from-scratch-images/ch05_compressed/proba-to-text.webp\" width=500px>" "<img src=\"https://sebastianraschka.com/images/LLMs-from-scratch-images/ch05_compressed/04.webp\" width=500px>"
] ]
}, },
{ {
@@ -440,7 +440,7 @@
"id": "ad90592f-0d5d-4ec8-9ff5-e7675beab10e", "id": "ad90592f-0d5d-4ec8-9ff5-e7675beab10e",
"metadata": {}, "metadata": {},
"source": [ "source": [
"<img src=\"https://sebastianraschka.com/images/LLMs-from-scratch-images/ch05_compressed/proba-index.webp\" width=500px>" "<img src=\"https://sebastianraschka.com/images/LLMs-from-scratch-images/ch05_compressed/06.webp\" width=500px>"
] ]
}, },
{ {
@@ -601,7 +601,7 @@
"id": "5bd24b7f-b760-47ad-bc84-86d13794aa54", "id": "5bd24b7f-b760-47ad-bc84-86d13794aa54",
"metadata": {}, "metadata": {},
"source": [ "source": [
"<img src=\"https://sebastianraschka.com/images/LLMs-from-scratch-images/ch05_compressed/cross-entropy.webp?123\" width=400px>" "<img src=\"https://sebastianraschka.com/images/LLMs-from-scratch-images/ch05_compressed/07.webp\" width=400px>"
] ]
}, },
{ {
@@ -945,7 +945,7 @@
"id": "46bdaa07-ba96-4ac1-9d71-b3cc153910d9", "id": "46bdaa07-ba96-4ac1-9d71-b3cc153910d9",
"metadata": {}, "metadata": {},
"source": [ "source": [
"<img src=\"https://sebastianraschka.com/images/LLMs-from-scratch-images/ch05_compressed/batching.webp\" width=500px>" "<img src=\"https://sebastianraschka.com/images/LLMs-from-scratch-images/ch05_compressed/09.webp\" width=500px>"
] ]
}, },
{ {
@@ -1210,7 +1210,7 @@
"id": "43875e95-190f-4b17-8f9a-35034ba649ec", "id": "43875e95-190f-4b17-8f9a-35034ba649ec",
"metadata": {}, "metadata": {},
"source": [ "source": [
"<img src=\"https://sebastianraschka.com/images/LLMs-from-scratch-images/ch05_compressed/mental-model-1.webp\" width=400px>" "<img src=\"https://sebastianraschka.com/images/LLMs-from-scratch-images/ch05_compressed/10.webp\" width=400px>"
] ]
}, },
{ {
@@ -1231,7 +1231,7 @@
"- In this section, we finally implement the code for training the LLM\n", "- In this section, we finally implement the code for training the LLM\n",
"- We focus on a simple training function (if you are interested in augmenting this training function with more advanced techniques, such as learning rate warmup, cosine annealing, and gradient clipping, please refer to [Appendix D](../../appendix-D/01_main-chapter-code))\n", "- We focus on a simple training function (if you are interested in augmenting this training function with more advanced techniques, such as learning rate warmup, cosine annealing, and gradient clipping, please refer to [Appendix D](../../appendix-D/01_main-chapter-code))\n",
"\n", "\n",
"<img src=\"https://sebastianraschka.com/images/LLMs-from-scratch-images/ch05_compressed/train-steps.webp\" width=300px>" "<img src=\"https://sebastianraschka.com/images/LLMs-from-scratch-images/ch05_compressed/11.webp\" width=300px>"
] ]
}, },
{ {
@@ -1464,7 +1464,7 @@
"id": "eb380c42-b31c-4ee1-b8b9-244094537272", "id": "eb380c42-b31c-4ee1-b8b9-244094537272",
"metadata": {}, "metadata": {},
"source": [ "source": [
"<img src=\"https://sebastianraschka.com/images/LLMs-from-scratch-images/ch05_compressed/mental-model-2.webp\" width=350px>" "<img src=\"https://sebastianraschka.com/images/LLMs-from-scratch-images/ch05_compressed/13.webp\" width=350px>"
] ]
}, },
{ {
@@ -1849,7 +1849,7 @@
"id": "7ae6fffd-2730-4abe-a2d3-781fc4836f17", "id": "7ae6fffd-2730-4abe-a2d3-781fc4836f17",
"metadata": {}, "metadata": {},
"source": [ "source": [
"<img src=\"https://sebastianraschka.com/images/LLMs-from-scratch-images/ch05_compressed/topk.webp\" width=500px>\n", "<img src=\"https://sebastianraschka.com/images/LLMs-from-scratch-images/ch05_compressed/15.webp\" width=500px>\n",
"\n", "\n",
"- (Please note that the numbers in this figure are truncated to two\n", "- (Please note that the numbers in this figure are truncated to two\n",
"digits after the decimal point to reduce visual clutter. The values in the Softmax row should add up to 1.0.)" "digits after the decimal point to reduce visual clutter. The values in the Softmax row should add up to 1.0.)"
@@ -2060,7 +2060,7 @@
"source": [ "source": [
"- Training LLMs is computationally expensive, so it's crucial to be able to save and load LLM weights\n", "- Training LLMs is computationally expensive, so it's crucial to be able to save and load LLM weights\n",
"\n", "\n",
"<img src=\"https://sebastianraschka.com/images/LLMs-from-scratch-images/ch05_compressed/mental-model-3.webp\" width=400px>" "<img src=\"https://sebastianraschka.com/images/LLMs-from-scratch-images/ch05_compressed/16.webp\" width=400px>"
] ]
}, },
{ {
@@ -2393,7 +2393,7 @@
"id": "20f19d32-5aae-4176-9f86-f391672c8f0d", "id": "20f19d32-5aae-4176-9f86-f391672c8f0d",
"metadata": {}, "metadata": {},
"source": [ "source": [
"<img src=\"https://sebastianraschka.com/images/LLMs-from-scratch-images/ch05_compressed/gpt-sizes.webp?timestamp=123\" width=500px>" "<img src=\"https://sebastianraschka.com/images/LLMs-from-scratch-images/ch05_compressed/17.webp\" width=500px>"
] ]
}, },
{ {
@@ -2627,7 +2627,7 @@
"name": "python", "name": "python",
"nbconvert_exporter": "python", "nbconvert_exporter": "python",
"pygments_lexer": "ipython3", "pygments_lexer": "ipython3",
"version": "3.10.16" "version": "3.13.5"
} }
}, },
"nbformat": 4, "nbformat": 4,


@@ -76,7 +76,7 @@
"id": "a445828a-ff10-4efa-9f60-a2e2aed4c87d", "id": "a445828a-ff10-4efa-9f60-a2e2aed4c87d",
"metadata": {}, "metadata": {},
"source": [ "source": [
"<img src=\"https://sebastianraschka.com/images/LLMs-from-scratch-images/ch06_compressed/chapter-overview.webp\" width=500px>" "<img src=\"https://sebastianraschka.com/images/LLMs-from-scratch-images/ch06_compressed/01.webp\" width=500px>"
] ]
}, },
{ {
@@ -113,7 +113,7 @@
"id": "6c29ef42-46d9-43d4-8bb4-94974e1665e4", "id": "6c29ef42-46d9-43d4-8bb4-94974e1665e4",
"metadata": {}, "metadata": {},
"source": [ "source": [
"<img src=\"https://sebastianraschka.com/images/LLMs-from-scratch-images/ch06_compressed/instructions.webp\" width=500px>" "<img src=\"https://sebastianraschka.com/images/LLMs-from-scratch-images/ch06_compressed/02.webp\" width=500px>"
] ]
}, },
{ {
@@ -132,7 +132,7 @@
"id": "0b37a0c4-0bb1-4061-b1fe-eaa4416d52c3", "id": "0b37a0c4-0bb1-4061-b1fe-eaa4416d52c3",
"metadata": {}, "metadata": {},
"source": [ "source": [
"<img src=\"https://sebastianraschka.com/images/LLMs-from-scratch-images/ch06_compressed/spam-non-spam.webp\" width=500px>" "<img src=\"https://sebastianraschka.com/images/LLMs-from-scratch-images/ch06_compressed/03.webp\" width=400px>"
] ]
}, },
{ {
@@ -150,7 +150,7 @@
"id": "5f628975-d2e8-4f7f-ab38-92bb868b7067", "id": "5f628975-d2e8-4f7f-ab38-92bb868b7067",
"metadata": {}, "metadata": {},
"source": [ "source": [
"<img src=\"https://sebastianraschka.com/images/LLMs-from-scratch-images/ch06_compressed/overview-1.webp\" width=500px>" "<img src=\"https://sebastianraschka.com/images/LLMs-from-scratch-images/ch06_compressed/04.webp\" width=500px>"
] ]
}, },
{ {
@@ -712,7 +712,7 @@
"id": "0829f33f-1428-4f22-9886-7fee633b3666", "id": "0829f33f-1428-4f22-9886-7fee633b3666",
"metadata": {}, "metadata": {},
"source": [ "source": [
"<img src=\"https://sebastianraschka.com/images/LLMs-from-scratch-images/ch06_compressed/pad-input-sequences.webp?123\" width=500px>" "<img src=\"https://sebastianraschka.com/images/LLMs-from-scratch-images/ch06_compressed/06.webp\" width=500px>"
] ]
}, },
{ {
@@ -887,7 +887,7 @@
"id": "64bcc349-205f-48f8-9655-95ff21f5e72f", "id": "64bcc349-205f-48f8-9655-95ff21f5e72f",
"metadata": {}, "metadata": {},
"source": [ "source": [
"<img src=\"https://sebastianraschka.com/images/LLMs-from-scratch-images/ch06_compressed/batch.webp\" width=500px>" "<img src=\"https://sebastianraschka.com/images/LLMs-from-scratch-images/ch06_compressed/07.webp\" width=500px>"
] ]
}, },
{ {
@@ -1019,7 +1019,7 @@
"source": [ "source": [
"- In this section, we initialize the pretrained model we worked with in the previous chapter\n", "- In this section, we initialize the pretrained model we worked with in the previous chapter\n",
"\n", "\n",
"<img src=\"https://sebastianraschka.com/images/LLMs-from-scratch-images/ch06_compressed/overview-2.webp\" width=500px>" "<img src=\"https://sebastianraschka.com/images/LLMs-from-scratch-images/ch06_compressed/08.webp\" width=500px>"
] ]
}, },
{ {
@@ -1217,7 +1217,7 @@
"id": "d6e9d66f-76b2-40fc-9ec5-3f972a8db9c0", "id": "d6e9d66f-76b2-40fc-9ec5-3f972a8db9c0",
"metadata": {}, "metadata": {},
"source": [ "source": [
"<img src=\"https://sebastianraschka.com/images/LLMs-from-scratch-images/ch06_compressed/lm-head.webp\" width=500px>" "<img src=\"https://sebastianraschka.com/images/LLMs-from-scratch-images/ch06_compressed/09.webp\" width=500px>"
] ]
}, },
{ {
@@ -1550,7 +1550,7 @@
"id": "0be7c1eb-c46c-4065-8525-eea1b8c66d10", "id": "0be7c1eb-c46c-4065-8525-eea1b8c66d10",
"metadata": {}, "metadata": {},
"source": [ "source": [
"<img src=\"https://sebastianraschka.com/images/LLMs-from-scratch-images/ch06_compressed/trainable.webp\" width=500px>" "<img src=\"https://sebastianraschka.com/images/LLMs-from-scratch-images/ch06_compressed/10.webp\" width=500px>"
] ]
}, },
{ {
@@ -1661,7 +1661,7 @@
"id": "7df9144f-6817-4be4-8d4b-5d4dadfe4a9b", "id": "7df9144f-6817-4be4-8d4b-5d4dadfe4a9b",
"metadata": {}, "metadata": {},
"source": [ "source": [
"<img src=\"https://sebastianraschka.com/images/LLMs-from-scratch-images/ch06_compressed/input-and-output.webp\" width=500px>" "<img src=\"https://sebastianraschka.com/images/LLMs-from-scratch-images/ch06_compressed/11.webp\" width=500px>"
] ]
}, },
{ {
@@ -1704,7 +1704,7 @@
"id": "8df08ae0-e664-4670-b7c5-8a2280d9b41b", "id": "8df08ae0-e664-4670-b7c5-8a2280d9b41b",
"metadata": {}, "metadata": {},
"source": [ "source": [
"<img src=\"https://sebastianraschka.com/images/LLMs-from-scratch-images/ch06_compressed/attention-mask.webp\" width=200px>" "<img src=\"https://sebastianraschka.com/images/LLMs-from-scratch-images/ch06_compressed/12.webp\" width=200px>"
] ]
}, },
{ {
@@ -1720,7 +1720,7 @@
"id": "669e1fd1-ace8-44b4-b438-185ed0ba8b33", "id": "669e1fd1-ace8-44b4-b438-185ed0ba8b33",
"metadata": {}, "metadata": {},
"source": [ "source": [
"<img src=\"https://sebastianraschka.com/images/LLMs-from-scratch-images/ch06_compressed/overview-3.webp?1\" width=500px>" "<img src=\"https://sebastianraschka.com/images/LLMs-from-scratch-images/ch06_compressed/13.webp\" width=300px>"
] ]
}, },
{ {
@@ -1736,7 +1736,7 @@
"id": "557996dd-4c6b-49c4-ab83-f60ef7e1d69e", "id": "557996dd-4c6b-49c4-ab83-f60ef7e1d69e",
"metadata": {}, "metadata": {},
"source": [ "source": [
"<img src=\"https://sebastianraschka.com/images/LLMs-from-scratch-images/ch06_compressed/class-argmax.webp\" width=600px>" "<img src=\"https://sebastianraschka.com/images/LLMs-from-scratch-images/ch06_compressed/14.webp\" width=600px>"
] ]
}, },
{ {
@@ -2053,7 +2053,7 @@
"id": "979b6222-1dc2-4530-9d01-b6b04fe3de12", "id": "979b6222-1dc2-4530-9d01-b6b04fe3de12",
"metadata": {}, "metadata": {},
"source": [ "source": [
"<img src=\"https://sebastianraschka.com/images/LLMs-from-scratch-images/ch06_compressed/training-loop.webp?1\" width=500px>" "<img src=\"https://sebastianraschka.com/images/LLMs-from-scratch-images/ch06_compressed/15.webp\" width=500px>"
] ]
}, },
{ {
@@ -2371,7 +2371,7 @@
"id": "72ebcfa2-479e-408b-9cf0-7421f6144855", "id": "72ebcfa2-479e-408b-9cf0-7421f6144855",
"metadata": {}, "metadata": {},
"source": [ "source": [
"<img src=\"https://sebastianraschka.com/images/LLMs-from-scratch-images/ch06_compressed/overview-4.webp\" width=500px>" "<img src=\"https://sebastianraschka.com/images/LLMs-from-scratch-images/ch06_compressed/18.webp\" width=500px>"
] ]
}, },
{ {
@@ -2590,7 +2590,7 @@
"name": "python", "name": "python",
"nbconvert_exporter": "python", "nbconvert_exporter": "python",
"pygments_lexer": "ipython3", "pygments_lexer": "ipython3",
"version": "3.10.16" "version": "3.13.5"
} }
}, },
"nbformat": 4, "nbformat": 4,


@@ -79,7 +79,7 @@
"id": "264fca98-2f9a-4193-b435-2abfa3b4142f" "id": "264fca98-2f9a-4193-b435-2abfa3b4142f"
}, },
"source": [ "source": [
"<img src=\"https://sebastianraschka.com/images/LLMs-from-scratch-images/ch07_compressed/overview.webp?1\" width=500px>" "<img src=\"https://sebastianraschka.com/images/LLMs-from-scratch-images/ch07_compressed/01.webp\" width=500px>"
] ]
}, },
{ {
@@ -111,7 +111,7 @@
"id": "18dc0535-0904-44ed-beaf-9b678292ef35" "id": "18dc0535-0904-44ed-beaf-9b678292ef35"
}, },
"source": [ "source": [
"<img src=\"https://sebastianraschka.com/images/LLMs-from-scratch-images/ch07_compressed/instruction-following.webp\" width=500px>" "<img src=\"https://sebastianraschka.com/images/LLMs-from-scratch-images/ch07_compressed/02.webp\" width=500px>"
] ]
}, },
{ {
@@ -123,7 +123,7 @@
"source": [ "source": [
"- The topics covered in this chapter are summarized in the figure below\n", "- The topics covered in this chapter are summarized in the figure below\n",
"\n", "\n",
"<img src=\"https://sebastianraschka.com/images/LLMs-from-scratch-images/ch07_compressed/chapter-overview-1.webp?1\" width=500px>" "<img src=\"https://sebastianraschka.com/images/LLMs-from-scratch-images/ch07_compressed/03.webp\" width=500px>"
] ]
}, },
{ {
@@ -312,7 +312,7 @@
"id": "dffa4f70-44d4-4be4-89a9-2159f4885b10" "id": "dffa4f70-44d4-4be4-89a9-2159f4885b10"
}, },
"source": [ "source": [
"<img src=\"https://sebastianraschka.com/images/LLMs-from-scratch-images/ch07_compressed/prompt-style.webp?1\" width=500px>" "<img src=\"https://sebastianraschka.com/images/LLMs-from-scratch-images/ch07_compressed/04.webp?2\" width=500px>"
] ]
}, },
{ {
@@ -509,7 +509,7 @@
"id": "233f63bd-9755-4d07-8884-5e2e5345cf27" "id": "233f63bd-9755-4d07-8884-5e2e5345cf27"
}, },
"source": [ "source": [
"<img src=\"https://sebastianraschka.com/images/LLMs-from-scratch-images/ch07_compressed/chapter-overview-2.webp?1\" width=500px>" "<img src=\"https://sebastianraschka.com/images/LLMs-from-scratch-images/ch07_compressed/05.webp?1\" width=500px>"
] ]
}, },
{ {
@@ -521,7 +521,7 @@
"source": [ "source": [
"- We tackle this dataset batching in several steps, as summarized in the figure below\n", "- We tackle this dataset batching in several steps, as summarized in the figure below\n",
"\n", "\n",
"<img src=\"https://sebastianraschka.com/images/LLMs-from-scratch-images/ch07_compressed/detailed-batching.webp?1\" width=500px>" "<img src=\"https://sebastianraschka.com/images/LLMs-from-scratch-images/ch07_compressed/06.webp?1\" width=500px>"
] ]
}, },
{ {
@@ -533,7 +533,7 @@
"source": [ "source": [
"- First, we implement an `InstructionDataset` class that pre-tokenizes all inputs in the dataset, similar to the `SpamDataset` in chapter 6\n", "- First, we implement an `InstructionDataset` class that pre-tokenizes all inputs in the dataset, similar to the `SpamDataset` in chapter 6\n",
"\n", "\n",
"<img src=\"https://sebastianraschka.com/images/LLMs-from-scratch-images/ch07_compressed/pretokenizing.webp\" width=500px>" "<img src=\"https://sebastianraschka.com/images/LLMs-from-scratch-images/ch07_compressed/07.webp?1\" width=500px>"
] ]
}, },
{ {
@@ -627,7 +627,7 @@
"id": "65c4d943-4aa8-4a44-874e-05bc6831fbd3" "id": "65c4d943-4aa8-4a44-874e-05bc6831fbd3"
}, },
"source": [ "source": [
"<img src=\"https://sebastianraschka.com/images/LLMs-from-scratch-images/ch07_compressed/padding.webp\" width=500px>" "<img src=\"https://sebastianraschka.com/images/LLMs-from-scratch-images/ch07_compressed/08.webp?1\" width=500px>"
] ]
}, },
{ {
@@ -710,12 +710,10 @@
}, },
{ {
"cell_type": "markdown", "cell_type": "markdown",
"id": "c46832ab-39b7-45f8-b330-ac9adfa10d1b", "id": "5673ade5-be4c-4a2c-9a9a-d5c63fb1c424",
"metadata": { "metadata": {},
"id": "c46832ab-39b7-45f8-b330-ac9adfa10d1b"
},
"source": [ "source": [
"<img src=\"https://sebastianraschka.com/images/LLMs-from-scratch-images/ch07_compressed/batching-step-4.webp?1\" width=500px>" "<img src=\"https://sebastianraschka.com/images/LLMs-from-scratch-images/ch07_compressed/09.webp?1\" width=400px>"
] ]
}, },
{ {
@@ -736,7 +734,7 @@
"id": "0386b6fe-3455-4e70-becd-a5a4681ba2ef" "id": "0386b6fe-3455-4e70-becd-a5a4681ba2ef"
}, },
"source": [ "source": [
"<img src=\"https://sebastianraschka.com/images/LLMs-from-scratch-images/ch07_compressed/inputs-targets.webp?1\" width=400px>" "<img src=\"https://sebastianraschka.com/images/LLMs-from-scratch-images/ch07_compressed/10.webp?1\" width=400px>"
] ]
}, },
{ {
@@ -819,7 +817,7 @@
"source": [ "source": [
"- Next, we introduce an `ignore_index` value to replace all padding token IDs with a new value; the purpose of this `ignore_index` is that we can ignore padding values in the loss function (more on that later)\n", "- Next, we introduce an `ignore_index` value to replace all padding token IDs with a new value; the purpose of this `ignore_index` is that we can ignore padding values in the loss function (more on that later)\n",
"\n", "\n",
"<img src=\"https://sebastianraschka.com/images/LLMs-from-scratch-images/ch07_compressed/batching-step-5.webp?1\" width=500px>\n", "<img src=\"https://sebastianraschka.com/images/LLMs-from-scratch-images/ch07_compressed/11.webp?1\" width=400px>\n",
"\n", "\n",
"- Concretely, this means that we replace the token IDs corresponding to `50256` with `-100` as illustrated below" "- Concretely, this means that we replace the token IDs corresponding to `50256` with `-100` as illustrated below"
] ]
@@ -831,7 +829,7 @@
"id": "bd4bed33-956e-4b3f-a09c-586d8203109a" "id": "bd4bed33-956e-4b3f-a09c-586d8203109a"
}, },
"source": [ "source": [
"<img src=\"https://sebastianraschka.com/images/LLMs-from-scratch-images/ch07_compressed/ignore-index.webp?1\" width=500px>" "<img src=\"https://sebastianraschka.com/images/LLMs-from-scratch-images/ch07_compressed/12.webp?2\" width=500px>"
] ]
}, },
{ {
@@ -1085,7 +1083,7 @@
"id": "fab8f0ed-80e8-4fd9-bf84-e5d0e0bc0a39" "id": "fab8f0ed-80e8-4fd9-bf84-e5d0e0bc0a39"
}, },
"source": [ "source": [
"<img src=\"https://sebastianraschka.com/images/LLMs-from-scratch-images/ch07_compressed/mask-instructions.webp?1\" width=600px>" "<img src=\"https://sebastianraschka.com/images/LLMs-from-scratch-images/ch07_compressed/13.webp\" width=600px>"
] ]
}, },
{ {
@@ -1095,6 +1093,7 @@
"id": "bccaf048-ec95-498c-9155-d5b3ccba6c96" "id": "bccaf048-ec95-498c-9155-d5b3ccba6c96"
}, },
"source": [ "source": [
"&nbsp;\n",
"## 7.4 Creating data loaders for an instruction dataset" "## 7.4 Creating data loaders for an instruction dataset"
] ]
}, },
@@ -1115,7 +1114,7 @@
"id": "9fffe390-b226-4d5c-983f-9f4da773cb82" "id": "9fffe390-b226-4d5c-983f-9f4da773cb82"
}, },
"source": [ "source": [
"<img src=\"https://sebastianraschka.com/images/LLMs-from-scratch-images/ch07_compressed/chapter-overview-3.webp?1\" width=500px>" "<img src=\"https://sebastianraschka.com/images/LLMs-from-scratch-images/ch07_compressed/14.webp\" width=500px>"
] ]
}, },
{ {
@@ -1515,7 +1514,7 @@
"id": "8d1b438f-88af-413f-96a9-f059c6c55fc4" "id": "8d1b438f-88af-413f-96a9-f059c6c55fc4"
}, },
"source": [ "source": [
"<img src=\"https://sebastianraschka.com/images/LLMs-from-scratch-images/ch07_compressed/chapter-overview-4.webp?1\" width=500px>" "<img src=\"https://sebastianraschka.com/images/LLMs-from-scratch-images/ch07_compressed/15.webp?1\" width=500px>"
] ]
}, },
{ {
@@ -1746,7 +1745,7 @@
"source": [ "source": [
"- In this section, we finetune the model\n", "- In this section, we finetune the model\n",
"\n", "\n",
"<img src=\"https://sebastianraschka.com/images/LLMs-from-scratch-images/ch07_compressed/chapter-overview-5.webp?1\" width=500px>\n", "<img src=\"https://sebastianraschka.com/images/LLMs-from-scratch-images/ch07_compressed/16.webp\" width=500px>\n",
"\n", "\n",
"- Note that we can reuse all the loss calculation and training functions that we used in previous chapters" "- Note that we can reuse all the loss calculation and training functions that we used in previous chapters"
] ]
@@ -2015,7 +2014,7 @@
"id": "5a25cc88-1758-4dd0-b8bf-c044cbf2dd49" "id": "5a25cc88-1758-4dd0-b8bf-c044cbf2dd49"
}, },
"source": [ "source": [
"<img src=\"https://sebastianraschka.com/images/LLMs-from-scratch-images/ch07_compressed/chapter-overview-6.webp?1\" width=500px>" "<img src=\"https://sebastianraschka.com/images/LLMs-from-scratch-images/ch07_compressed/18.webp?1\" width=500px>"
] ]
}, },
{ {
@@ -2271,7 +2270,7 @@
"id": "805b9d30-7336-499f-abb5-4a21be3129f5" "id": "805b9d30-7336-499f-abb5-4a21be3129f5"
}, },
"source": [ "source": [
"<img src=\"https://sebastianraschka.com/images/LLMs-from-scratch-images/ch07_compressed/chapter-overview-7.webp?1\" width=500px>" "<img src=\"https://sebastianraschka.com/images/LLMs-from-scratch-images/ch07_compressed/19.webp?1\" width=500px>"
] ]
}, },
{ {
@@ -2309,7 +2308,7 @@
"\n", "\n",
"- In general, before we can use ollama from the command line, we have to either start the ollama application or run `ollama serve` in a separate terminal\n", "- In general, before we can use ollama from the command line, we have to either start the ollama application or run `ollama serve` in a separate terminal\n",
"\n", "\n",
"<img src=\"https://sebastianraschka.com/images/LLMs-from-scratch-images/ch07_compressed/ollama-run.webp?1\" width=700px>" "<img src=\"https://sebastianraschka.com/images/LLMs-from-scratch-images/ch07_compressed/20.webp?1\" width=700px>"
] ]
}, },
{ {
@@ -2854,7 +2853,7 @@
"- This marks the final chapter of this book\n", "- This marks the final chapter of this book\n",
"- We covered the major steps of the LLM development cycle: implementing an LLM architecture, pretraining an LLM, and finetuning it\n", "- We covered the major steps of the LLM development cycle: implementing an LLM architecture, pretraining an LLM, and finetuning it\n",
"\n", "\n",
"<img src=\"https://sebastianraschka.com/images/LLMs-from-scratch-images/ch07_compressed/final-overview.webp?1\" width=500px>\n", "<img src=\"https://sebastianraschka.com/images/LLMs-from-scratch-images/ch07_compressed/21.webp?1\" width=500px>\n",
"\n", "\n",
"- An optional step that is sometimes followed after instruction finetuning, as described in this chapter, is preference finetuning\n", "- An optional step that is sometimes followed after instruction finetuning, as described in this chapter, is preference finetuning\n",
"- Preference finetuning process can be particularly useful for customizing a model to better align with specific user preferences; see the [../04_preference-tuning-with-dpo](../04_preference-tuning-with-dpo) folder if you are interested in this\n", "- Preference finetuning process can be particularly useful for customizing a model to better align with specific user preferences; see the [../04_preference-tuning-with-dpo](../04_preference-tuning-with-dpo) folder if you are interested in this\n",
@@ -2929,7 +2928,7 @@
"name": "python", "name": "python",
"nbconvert_exporter": "python", "nbconvert_exporter": "python",
"pygments_lexer": "ipython3", "pygments_lexer": "ipython3",
"version": "3.10.16" "version": "3.13.5"
} }
}, },
"nbformat": 4, "nbformat": 4,