mirror of
https://github.com/rasbt/LLMs-from-scratch.git
synced 2026-04-10 12:33:42 +00:00
minor fixes (#246)
* removed duplicated white spaces
* Update ch07/01_main-chapter-code/ch07.ipynb
* Update ch07/05_dataset-generation/llama3-ollama.ipynb
* removed duplicated white spaces
* fixed title again

---------

Co-authored-by: Sebastian Raschka <mail@sebastianraschka.com>
@@ -520,7 +520,7 @@
"- Note that we also add a smaller value (`eps`) before computing the square root of the variance; this is to avoid division-by-zero errors if the variance is 0\n",
"\n",
"**Biased variance**\n",
"- In the variance calculation above, setting `unbiased=False` means using the formula $\\frac{\\sum_i (x_i - \\bar{x})^2}{n}$ to compute the variance where n is the sample size (here, the number of features or columns); this formula does not include Bessel's correction (which uses `n-1` in the denominator), thus providing a biased estimate of the variance \n",
"- In the variance calculation above, setting `unbiased=False` means using the formula $\\frac{\\sum_i (x_i - \\bar{x})^2}{n}$ to compute the variance where n is the sample size (here, the number of features or columns); this formula does not include Bessel's correction (which uses `n-1` in the denominator), thus providing a biased estimate of the variance \n",
"- For LLMs, where the embedding dimension `n` is very large, the difference between using n and `n-1`\n",
" is negligible\n",
"- However, GPT-2 was trained with a biased variance in the normalization layers, which is why we also adopted this setting for compatibility reasons with the pretrained weights that we will load in later chapters\n",
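Note on the hunk above: the notebook text contrasts the biased variance (divide by `n`) with Bessel's correction (divide by `n-1`) and mentions `eps` as a guard against division by zero. The following is a minimal pure-Python sketch of those three points, not the notebook's PyTorch code. The helper names `variance` and `layer_norm`, the value `eps=1e-5`, and the width of 768 are assumptions chosen for illustration.

```python
import math

def variance(xs, unbiased):
    """Variance of xs; divides by n-1 if unbiased (Bessel's correction), else by n."""
    n = len(xs)
    mean = sum(xs) / n
    squared_dev = sum((x - mean) ** 2 for x in xs)
    return squared_dev / (n - 1 if unbiased else n)

def layer_norm(xs, eps=1e-5):
    """Normalize xs with the biased variance; eps (assumed value) keeps the
    denominator nonzero when the variance is 0."""
    mean = sum(xs) / len(xs)
    var = variance(xs, unbiased=False)
    return [(x - mean) / math.sqrt(var + eps) for x in xs]

# Small n: the two estimates differ noticeably.
small = [1.0, 2.0, 3.0, 4.0]
biased_small = variance(small, unbiased=False)
unbiased_small = variance(small, unbiased=True)

# Large n (768 is an illustrative embedding width): the ratio is (n-1)/n, close to 1.
wide = [float(i % 7) for i in range(768)]
ratio_wide = variance(wide, unbiased=False) / variance(wide, unbiased=True)

# Zero-variance input: eps prevents a division by zero.
constant_out = layer_norm([3.0, 3.0, 3.0])

print(biased_small, unbiased_small, ratio_wide, constant_out)
```

For any input with nonzero variance, the biased and unbiased estimates differ by the factor `(n-1)/n`, which is why the choice matters little at large embedding widths and is made for weight compatibility instead.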
@@ -1498,7 +1498,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.4"
"version": "3.10.11"
}
},
"nbformat": 4,