Mirror of https://github.com/rasbt/LLMs-from-scratch.git
Synced 2026-04-10 12:33:42 +00:00
Use figure numbers in ch05-7 (#881)
commit b969b3ef7a
parent bf039ff3dc
committed by GitHub
@@ -79,7 +79,7 @@
    "id": "264fca98-2f9a-4193-b435-2abfa3b4142f"
   },
   "source": [
-   "<img src=\"https://sebastianraschka.com/images/LLMs-from-scratch-images/ch07_compressed/overview.webp?1\" width=500px>"
+   "<img src=\"https://sebastianraschka.com/images/LLMs-from-scratch-images/ch07_compressed/01.webp\" width=500px>"
   ]
  },
  {
@@ -111,7 +111,7 @@
    "id": "18dc0535-0904-44ed-beaf-9b678292ef35"
   },
   "source": [
-   "<img src=\"https://sebastianraschka.com/images/LLMs-from-scratch-images/ch07_compressed/instruction-following.webp\" width=500px>"
+   "<img src=\"https://sebastianraschka.com/images/LLMs-from-scratch-images/ch07_compressed/02.webp\" width=500px>"
   ]
  },
  {
@@ -123,7 +123,7 @@
   "source": [
    "- The topics covered in this chapter are summarized in the figure below\n",
    "\n",
-   "<img src=\"https://sebastianraschka.com/images/LLMs-from-scratch-images/ch07_compressed/chapter-overview-1.webp?1\" width=500px>"
+   "<img src=\"https://sebastianraschka.com/images/LLMs-from-scratch-images/ch07_compressed/03.webp\" width=500px>"
   ]
  },
  {
@@ -312,7 +312,7 @@
    "id": "dffa4f70-44d4-4be4-89a9-2159f4885b10"
   },
   "source": [
-   "<img src=\"https://sebastianraschka.com/images/LLMs-from-scratch-images/ch07_compressed/prompt-style.webp?1\" width=500px>"
+   "<img src=\"https://sebastianraschka.com/images/LLMs-from-scratch-images/ch07_compressed/04.webp?2\" width=500px>"
   ]
  },
  {
@@ -509,7 +509,7 @@
    "id": "233f63bd-9755-4d07-8884-5e2e5345cf27"
   },
   "source": [
-   "<img src=\"https://sebastianraschka.com/images/LLMs-from-scratch-images/ch07_compressed/chapter-overview-2.webp?1\" width=500px>"
+   "<img src=\"https://sebastianraschka.com/images/LLMs-from-scratch-images/ch07_compressed/05.webp?1\" width=500px>"
   ]
  },
  {
@@ -521,7 +521,7 @@
   "source": [
    "- We tackle this dataset batching in several steps, as summarized in the figure below\n",
    "\n",
-   "<img src=\"https://sebastianraschka.com/images/LLMs-from-scratch-images/ch07_compressed/detailed-batching.webp?1\" width=500px>"
+   "<img src=\"https://sebastianraschka.com/images/LLMs-from-scratch-images/ch07_compressed/06.webp?1\" width=500px>"
   ]
  },
  {
@@ -533,7 +533,7 @@
   "source": [
    "- First, we implement an `InstructionDataset` class that pre-tokenizes all inputs in the dataset, similar to the `SpamDataset` in chapter 6\n",
    "\n",
-   "<img src=\"https://sebastianraschka.com/images/LLMs-from-scratch-images/ch07_compressed/pretokenizing.webp\" width=500px>"
+   "<img src=\"https://sebastianraschka.com/images/LLMs-from-scratch-images/ch07_compressed/07.webp?1\" width=500px>"
   ]
  },
  {
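The context line in the hunk above describes an `InstructionDataset` that pre-tokenizes all inputs up front. A minimal sketch of that idea follows; the character-level `toy_encode` tokenizer is a stand-in assumption (the book formats a prompt for each entry and encodes it with tiktoken's GPT-2 BPE tokenizer), and this simplified class omits the torch `Dataset` base class.

```python
def toy_encode(text):
    # stand-in "tokenizer": one token ID per character (assumption,
    # replaces tiktoken's GPT-2 BPE encoder used in the book)
    return [ord(ch) for ch in text]

class InstructionDataset:
    """Simplified sketch: tokenize every example once at construction
    time so that __getitem__ is a cheap list lookup, not a re-encode."""

    def __init__(self, data, encode):
        # pre-tokenize all inputs in the dataset
        self.encoded_texts = [encode(entry) for entry in data]

    def __getitem__(self, index):
        return self.encoded_texts[index]

    def __len__(self):
        return len(self.encoded_texts)

ds = InstructionDataset(["ab", "c"], toy_encode)
print(len(ds), ds[0])  # -> 2 [97, 98]
```

Pre-tokenizing trades a little memory for avoiding repeated tokenizer calls in every training epoch.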
@@ -627,7 +627,7 @@
    "id": "65c4d943-4aa8-4a44-874e-05bc6831fbd3"
   },
   "source": [
-   "<img src=\"https://sebastianraschka.com/images/LLMs-from-scratch-images/ch07_compressed/padding.webp\" width=500px>"
+   "<img src=\"https://sebastianraschka.com/images/LLMs-from-scratch-images/ch07_compressed/08.webp?1\" width=500px>"
   ]
  },
  {
@@ -710,12 +710,10 @@
  },
  {
   "cell_type": "markdown",
-  "id": "c46832ab-39b7-45f8-b330-ac9adfa10d1b",
-  "metadata": {
-   "id": "c46832ab-39b7-45f8-b330-ac9adfa10d1b"
-  },
+  "id": "5673ade5-be4c-4a2c-9a9a-d5c63fb1c424",
+  "metadata": {},
   "source": [
-   "<img src=\"https://sebastianraschka.com/images/LLMs-from-scratch-images/ch07_compressed/batching-step-4.webp?1\" width=500px>"
+   "<img src=\"https://sebastianraschka.com/images/LLMs-from-scratch-images/ch07_compressed/09.webp?1\" width=400px>"
   ]
  },
  {
@@ -736,7 +734,7 @@
    "id": "0386b6fe-3455-4e70-becd-a5a4681ba2ef"
   },
   "source": [
-   "<img src=\"https://sebastianraschka.com/images/LLMs-from-scratch-images/ch07_compressed/inputs-targets.webp?1\" width=400px>"
+   "<img src=\"https://sebastianraschka.com/images/LLMs-from-scratch-images/ch07_compressed/10.webp?1\" width=400px>"
   ]
  },
  {
@@ -819,7 +817,7 @@
   "source": [
    "- Next, we introduce an `ignore_index` value to replace all padding token IDs with a new value; the purpose of this `ignore_index` is that we can ignore padding values in the loss function (more on that later)\n",
    "\n",
-   "<img src=\"https://sebastianraschka.com/images/LLMs-from-scratch-images/ch07_compressed/batching-step-5.webp?1\" width=500px>\n",
+   "<img src=\"https://sebastianraschka.com/images/LLMs-from-scratch-images/ch07_compressed/11.webp?1\" width=400px>\n",
    "\n",
    "- Concretely, this means that we replace the token IDs corresponding to `50256` with `-100` as illustrated below"
   ]
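The hunk above is about the `ignore_index` step: padding token IDs (GPT-2's end-of-text ID `50256`) in the targets are replaced with `-100`, which is the default `ignore_index` of PyTorch's cross-entropy loss, so padded positions do not contribute to the training loss. A pure-Python sketch of that replacement follows; `mask_padding_in_targets` is a hypothetical helper name (the book's collate function works on PyTorch tensors), and keeping the first end-of-text token so the model still learns when to stop generating follows the book's convention.

```python
PAD_TOKEN_ID = 50256   # GPT-2 <|endoftext|>, reused as the padding token
IGNORE_INDEX = -100    # default ignore_index of torch.nn.functional.cross_entropy

def mask_padding_in_targets(targets, pad_id=PAD_TOKEN_ID,
                            ignore_index=IGNORE_INDEX):
    """Replace all but the first padding token in a target sequence.

    Hypothetical helper illustrating the masking idea: the first
    end-of-text token is kept, and the remaining padding positions are
    set to ignore_index so the loss function skips them.
    """
    masked = list(targets)
    seen_pad = False
    for i, tok in enumerate(masked):
        if tok == pad_id:
            if seen_pad:
                masked[i] = ignore_index
            seen_pad = True
    return masked

print(mask_padding_in_targets([318, 617, 1212, 50256, 50256, 50256]))
# -> [318, 617, 1212, 50256, -100, -100]
```

With the targets masked this way, the loss over padded positions is simply dropped rather than averaged in, so long padded batches do not dilute the gradient signal.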
@@ -831,7 +829,7 @@
    "id": "bd4bed33-956e-4b3f-a09c-586d8203109a"
   },
   "source": [
-   "<img src=\"https://sebastianraschka.com/images/LLMs-from-scratch-images/ch07_compressed/ignore-index.webp?1\" width=500px>"
+   "<img src=\"https://sebastianraschka.com/images/LLMs-from-scratch-images/ch07_compressed/12.webp?2\" width=500px>"
   ]
  },
  {
@@ -1085,7 +1083,7 @@
|
||||
"id": "fab8f0ed-80e8-4fd9-bf84-e5d0e0bc0a39"
|
||||
},
|
||||
"source": [
|
||||
"<img src=\"https://sebastianraschka.com/images/LLMs-from-scratch-images/ch07_compressed/mask-instructions.webp?1\" width=600px>"
|
||||
"<img src=\"https://sebastianraschka.com/images/LLMs-from-scratch-images/ch07_compressed/13.webp\" width=600px>"
|
||||
]
|
||||
},
|
||||
{
|
||||
@@ -1095,6 +1093,7 @@
    "id": "bccaf048-ec95-498c-9155-d5b3ccba6c96"
   },
   "source": [
+   " \n",
    "## 7.4 Creating data loaders for an instruction dataset"
   ]
  },
@@ -1115,7 +1114,7 @@
    "id": "9fffe390-b226-4d5c-983f-9f4da773cb82"
   },
   "source": [
-   "<img src=\"https://sebastianraschka.com/images/LLMs-from-scratch-images/ch07_compressed/chapter-overview-3.webp?1\" width=500px>"
+   "<img src=\"https://sebastianraschka.com/images/LLMs-from-scratch-images/ch07_compressed/14.webp\" width=500px>"
   ]
  },
  {
@@ -1515,7 +1514,7 @@
    "id": "8d1b438f-88af-413f-96a9-f059c6c55fc4"
   },
   "source": [
-   "<img src=\"https://sebastianraschka.com/images/LLMs-from-scratch-images/ch07_compressed/chapter-overview-4.webp?1\" width=500px>"
+   "<img src=\"https://sebastianraschka.com/images/LLMs-from-scratch-images/ch07_compressed/15.webp?1\" width=500px>"
   ]
  },
  {
@@ -1746,7 +1745,7 @@
   "source": [
    "- In this section, we finetune the model\n",
    "\n",
-   "<img src=\"https://sebastianraschka.com/images/LLMs-from-scratch-images/ch07_compressed/chapter-overview-5.webp?1\" width=500px>\n",
+   "<img src=\"https://sebastianraschka.com/images/LLMs-from-scratch-images/ch07_compressed/16.webp\" width=500px>\n",
    "\n",
    "- Note that we can reuse all the loss calculation and training functions that we used in previous chapters"
   ]
@@ -2015,7 +2014,7 @@
    "id": "5a25cc88-1758-4dd0-b8bf-c044cbf2dd49"
   },
   "source": [
-   "<img src=\"https://sebastianraschka.com/images/LLMs-from-scratch-images/ch07_compressed/chapter-overview-6.webp?1\" width=500px>"
+   "<img src=\"https://sebastianraschka.com/images/LLMs-from-scratch-images/ch07_compressed/18.webp?1\" width=500px>"
   ]
  },
  {
@@ -2271,7 +2270,7 @@
    "id": "805b9d30-7336-499f-abb5-4a21be3129f5"
   },
   "source": [
-   "<img src=\"https://sebastianraschka.com/images/LLMs-from-scratch-images/ch07_compressed/chapter-overview-7.webp?1\" width=500px>"
+   "<img src=\"https://sebastianraschka.com/images/LLMs-from-scratch-images/ch07_compressed/19.webp?1\" width=500px>"
   ]
  },
  {
@@ -2309,7 +2308,7 @@
    "\n",
    "- In general, before we can use ollama from the command line, we have to either start the ollama application or run `ollama serve` in a separate terminal\n",
    "\n",
-   "<img src=\"https://sebastianraschka.com/images/LLMs-from-scratch-images/ch07_compressed/ollama-run.webp?1\" width=700px>"
+   "<img src=\"https://sebastianraschka.com/images/LLMs-from-scratch-images/ch07_compressed/20.webp?1\" width=700px>"
   ]
  },
  {
@@ -2854,7 +2853,7 @@
    "- This marks the final chapter of this book\n",
    "- We covered the major steps of the LLM development cycle: implementing an LLM architecture, pretraining an LLM, and finetuning it\n",
    "\n",
-   "<img src=\"https://sebastianraschka.com/images/LLMs-from-scratch-images/ch07_compressed/final-overview.webp?1\" width=500px>\n",
+   "<img src=\"https://sebastianraschka.com/images/LLMs-from-scratch-images/ch07_compressed/21.webp?1\" width=500px>\n",
    "\n",
    "- An optional step that is sometimes followed after instruction finetuning, as described in this chapter, is preference finetuning\n",
    "- Preference finetuning process can be particularly useful for customizing a model to better align with specific user preferences; see the [../04_preference-tuning-with-dpo](../04_preference-tuning-with-dpo) folder if you are interested in this\n",
@@ -2929,7 +2928,7 @@
    "name": "python",
    "nbconvert_exporter": "python",
    "pygments_lexer": "ipython3",
-   "version": "3.10.16"
+   "version": "3.13.5"
   }
  },
  "nbformat": 4,