Direct Preference Optimization from scratch (#294)

This commit is contained in:
Sebastian Raschka
2024-08-04 08:57:36 -05:00
committed by GitHub
parent 3ea0798d44
commit 09dc080cf3
5 changed files with 3570 additions and 7 deletions

@@ -118,6 +118,7 @@ Several folders contain optional materials as a bonus for interested readers:
- [Evaluating Instruction Responses Using the OpenAI API and Ollama](ch07/03_model-evaluation)
- [Generating a Dataset for Instruction Finetuning](ch07/05_dataset-generation)
- [Generating a Preference Dataset with Llama 3.1 70B and Ollama](ch07/04_preference-tuning-with-dpo/create-preference-data-ollama.ipynb)
+- [Direct Preference Optimization (DPO) for LLM Alignment](ch07/04_preference-tuning-with-dpo/dpo-from-scratch.ipynb)
<br>
&nbsp;
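The linked notebook implements DPO from scratch. For context, the core of DPO is a loss over preference pairs that rewards the policy for ranking the chosen response above the rejected one relative to a frozen reference model. A minimal stdlib-only sketch of that loss for a single pair follows; the function name, `beta` value, and the example log-probabilities are illustrative, not taken from the notebook:

```python
import math

def dpo_loss(policy_chosen_logprob, policy_rejected_logprob,
             ref_chosen_logprob, ref_rejected_logprob, beta=0.1):
    """DPO loss for one preference pair (illustrative sketch).

    Inputs are summed log-probabilities of the chosen and rejected
    responses under the trainable policy and the frozen reference model.
    """
    chosen_logratio = policy_chosen_logprob - ref_chosen_logprob
    rejected_logratio = policy_rejected_logprob - ref_rejected_logprob
    logits = beta * (chosen_logratio - rejected_logratio)
    # -log sigmoid(logits): the loss shrinks as the policy favors the
    # chosen response more strongly than the reference model does
    return -math.log(1.0 / (1.0 + math.exp(-logits)))

# Hypothetical log-probabilities: when the policy matches the reference
# exactly, the loss is -log(0.5); a positive margin lowers it.
print(dpo_loss(-12.0, -30.0, -12.0, -30.0))  # no margin: -log(0.5) ≈ 0.693
print(dpo_loss(-10.0, -40.0, -12.0, -30.0))  # positive margin: smaller loss
```

In practice the log-probabilities come from batched model forward passes and the loss is averaged over the batch, but the per-pair computation is exactly this expression.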