diff --git a/ch05/01_main-chapter-code/exercise-solutions.ipynb b/ch05/01_main-chapter-code/exercise-solutions.ipynb
index 99a8626..1b03893 100644
--- a/ch05/01_main-chapter-code/exercise-solutions.ipynb
+++ b/ch05/01_main-chapter-code/exercise-solutions.ipynb
@@ -193,12 +193,31 @@
     "There is a 4.3% probability that the word \"pizza\" is sampled if the temperature is set to 5."
    ]
   },
+  {
+   "cell_type": "markdown",
+   "id": "b510ffb0-adca-4d64-8a12-38c4646fd736",
+   "metadata": {},
+   "source": [
+    "# Exercise 5.2: Different temperature and top-k settings"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "884990db-d1a6-4c4e-8e36-2c1e4c1e67c7",
+   "metadata": {},
+   "source": [
+    "- Both temperature and top-k settings have to be adjusted based on the individual LLM (a kind of trial and error process until it generates desirable outputs)\n",
+    "- The desirable outcomes are also application-specific, though\n",
+    "  - Lower top-k and temperatures result in less random outcomes, which is desired when creating educational content, technical writing or question answering, data analyses, code generation, and so forth\n",
+    "  - Higher top-k and temperatures result in more diverse and random outputs, which is more desirable for brainstorming tasks, creative writing, and so forth"
+   ]
+  },
   {
    "cell_type": "markdown",
    "id": "3f35425d-529d-4179-a1c4-63cb8b25b156",
    "metadata": {},
    "source": [
-    "# Exercise 5.2: Deterministic behavior in the decoding functions"
+    "# Exercise 5.3: Deterministic behavior in the decoding functions"
    ]
   },
   {
@@ -357,7 +376,7 @@
    "id": "6d0480e5-fb4e-41f8-a161-7ac980d71d47",
    "metadata": {},
    "source": [
-    "# Exercise 5.3: Continued pretraining"
+    "# Exercise 5.4: Continued pretraining"
    ]
   },
   {
@@ -507,7 +526,7 @@
    "id": "3384e788-f5a1-407c-8dd1-87959b75026d",
    "metadata": {},
    "source": [
-    "# Exercise 5.4: Training and validation set losses of the pretrained model"
+    "# Exercise 5.5: Training and validation set losses of the pretrained model"
    ]
   },
   {
@@ -774,7 +793,7 @@
    "id": "3a76a1e0-9635-480a-9391-3bda7aea402d",
    "metadata": {},
    "source": [
-    "# Exercise 5.5: Trying larger models"
+    "# Exercise 5.6: Trying larger models"
    ]
   },
   {
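Review note on the new Exercise 5.2 cell: its claim that lower top-k and temperature values give less random outputs can be checked numerically. The following is a minimal sketch in plain Python, not the notebook's own decoding code. The function names `top_k_temperature_probs` and `entropy` and the logit values are hypothetical and chosen only for illustration.

```python
import math


def top_k_temperature_probs(logits, top_k=None, temperature=1.0):
    """Return sampling probabilities after top-k filtering and temperature scaling.

    Hypothetical helper for illustration; not taken from the notebook.
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    if top_k is not None:
        if top_k < 1:
            raise ValueError("top_k must be at least 1")
        # Find the k-th largest logit and mask everything below it with -inf.
        # Ties at the threshold are kept, so slightly more than k may survive.
        threshold = sorted(logits, reverse=True)[min(top_k, len(logits)) - 1]
        logits = [x if x >= threshold else float("-inf") for x in logits]
    # Dividing by the temperature sharpens (<1) or flattens (>1) the distribution.
    scaled = [x / temperature for x in logits]
    # Numerically stable softmax: subtract the maximum before exponentiating.
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def entropy(probs):
    """Shannon entropy in nats; higher means more random sampling."""
    return -sum(p * math.log(p) for p in probs if p > 0)


# Made-up next-token logits for a five-token vocabulary (assumption).
logits = [4.5, 2.0, 1.2, 0.3, -1.0]

for top_k, temperature in [(2, 0.5), (None, 1.0), (None, 5.0)]:
    probs = top_k_temperature_probs(logits, top_k, temperature)
    print(f"top_k={top_k}, temperature={temperature}: "
          f"probs={[round(p, 3) for p in probs]}, entropy={entropy(probs):.3f}")
```

The entropy rises from the first setting to the last, which matches the cell's description: restrictive settings concentrate probability on the top tokens, while permissive settings spread it across the vocabulary.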