mirror of
https://github.com/rasbt/LLMs-from-scratch.git
synced 2026-04-10 12:33:42 +00:00
@@ -1020,7 +1020,7 @@
 "id": "cef09d21-b652-4760-abea-4f76920e6a25"
 },
 "source": [
-"- As we can see, the resulting loss on these 3 training examples is the same as the loss we calculated from the 2 training examples, which means that the cross entropy loss function ignored the training example with the -100 label\n",
+"- As we can see, the resulting loss on these 3 training examples is the same as the loss we calculated from the 2 training examples, which means that the cross-entropy loss function ignored the training example with the -100 label\n",
 "- By default, PyTorch has the `cross_entropy(..., ignore_index=-100)` setting to ignore examples corresponding to the label -100\n",
 "- Using this -100 `ignore_index`, we can ignore the additional end-of-text (padding) tokens in the batches that we used to pad the training examples to equal length\n",
 "- However, we don't want to ignore the first instance of the end-of-text (padding) token (50256) because it can help signal to the LLM when the response is complete"
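The `ignore_index` behavior described in this hunk can be verified with a minimal sketch; the logits and labels below are made-up values for illustration, not data from the notebook:

```python
import torch
import torch.nn.functional as F

# Hypothetical logits for 3 token positions over a 5-token vocabulary
logits = torch.tensor([
    [2.0, 1.0, 0.1, 0.5, -1.0],
    [0.3, 2.5, 0.2, 1.1,  0.0],
    [1.2, 0.4, 2.2, 0.8,  0.6],
])

targets_2 = torch.tensor([0, 1])        # two "real" training labels
targets_3 = torch.tensor([0, 1, -100])  # third label masked with -100

loss_2 = F.cross_entropy(logits[:2], targets_2)
# cross_entropy uses ignore_index=-100 by default, so the third
# position contributes nothing to the mean loss
loss_3 = F.cross_entropy(logits, targets_3)

print(torch.allclose(loss_2, loss_3))  # True: the -100 entry is ignored
```

Because the default reduction is `"mean"` over the non-ignored positions, masking a label with -100 leaves the loss unchanged rather than merely zeroing one term.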
@@ -2051,7 +2051,7 @@
 " - automated conversational benchmarks, where another LLM like GPT-4 is used to evaluate the responses, such as AlpacaEval ([https://tatsu-lab.github.io/alpaca_eval/](https://tatsu-lab.github.io/alpaca_eval/))\n",
 "\n",
 "- In the next section, we will use an approach similar to AlpacaEval and use another LLM to evaluate the responses of our model; however, we will use our own test set instead of using a publicly available benchmark dataset\n",
-"- For this, we add the model response to the `test_set` dictionary and save it as a `\"instruction-data-with-response.json\"` file for record-keeping so that we can load and analyze it in separate Python sessions if needed"
+"- For this, we add the model response to the `test_data` dictionary and save it as an `\"instruction-data-with-response.json\"` file for record-keeping so that we can load and analyze it in separate Python sessions if needed"
 ]
 },
 {
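The save-for-record-keeping step mentioned here amounts to serializing the entries with the standard `json` module; the entry below is a hypothetical placeholder, and the real `test_data` entries in the notebook carry more keys:

```python
import json

# Hypothetical test entry; the notebook's `test_data` entries also hold
# the original "instruction", "input", and "output" fields
test_data = [
    {"instruction": "State the capital of France.", "model_response": "Paris"},
]

# Save the entries so they can be analyzed in a separate Python session
with open("instruction-data-with-response.json", "w") as file:
    json.dump(test_data, file, indent=4)  # indent for human-readable output

# Reload later and confirm the round trip preserved the entries
with open("instruction-data-with-response.json") as file:
    reloaded = json.load(file)

print(reloaded == test_data)  # True
```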
@@ -2105,7 +2105,7 @@
 "id": "228d6fa7-d162-44c3-bef1-4013c027b155"
 },
 "source": [
-"- Let's double-check one of the entries to see whether the responses have been added to the `test_set` dictionary correctly"
+"- Let's double-check one of the entries to see whether the responses have been added to the `test_data` dictionary correctly"
 ]
 },
 {
@@ -2727,7 +2727,7 @@
 "name": "python",
 "nbconvert_exporter": "python",
 "pygments_lexer": "ipython3",
-"version": "3.11.4"
+"version": "3.10.11"
 }
 },
 "nbformat": 4,