LLMs-from-scratch/ch07/04_preference-tuning-with-dpo/README.md
Sebastian Raschka f78ad1f95b Update README.md
2024-06-23 08:25:01 -05:00


# Chapter 7: Finetuning to Follow Instructions
In progress ...
In the meantime, see:
- LLM Training: RLHF and Its Alternatives, [https://magazine.sebastianraschka.com/p/llm-training-rlhf-and-its-alternatives](https://magazine.sebastianraschka.com/p/llm-training-rlhf-and-its-alternatives)
- Tips for LLM Pretraining and Evaluating Reward Models, [https://sebastianraschka.com/blog/2024/research-papers-in-march-2024.html](https://sebastianraschka.com/blog/2024/research-papers-in-march-2024.html)