link GRPO notebook (#950)

Sebastian Raschka
2026-01-18 11:42:03 -06:00
committed by GitHub
parent 9c4be478f8
commit 47cfc61800
2 changed files with 8 additions and 1 deletion

@@ -212,6 +212,13 @@ More bonus material from the [Reasoning From Scratch](https://github.com/rasbt/r
- [Multiple-Choice Evaluation (MMLU)](https://github.com/rasbt/reasoning-from-scratch/blob/main/chF/02_mmlu)
- [LLM Leaderboard Evaluation](https://github.com/rasbt/reasoning-from-scratch/blob/main/chF/03_leaderboards)
- [LLM-as-a-Judge Evaluation](https://github.com/rasbt/reasoning-from-scratch/blob/main/chF/04_llm-judge)
- **Inference Scaling**
- [Self-Consistency](https://github.com/rasbt/reasoning-from-scratch/blob/main/ch04/01_main-chapter-code/ch04_main.ipynb)
- [Self-Refinement](https://github.com/rasbt/reasoning-from-scratch/blob/main/ch05/01_main-chapter-code/ch05_main.ipynb)
- **Reinforcement Learning** (RL)
- [RLVR with GRPO From Scratch](https://github.com/rasbt/reasoning-from-scratch/blob/main/ch06/01_main-chapter-code/ch06_main.ipynb)
<br>
&nbsp;