link GRPO notebook (#950)

2026-04-10 12:33:42 +00:00 · 2026-01-18 11:42:03 -06:00
parent 9c4be478f8
commit 47cfc61800
2 changed files with 8 additions and 1 deletions
--- a/README.md
+++ b/README.md
@@ -212,6 +212,13 @@ More bonus material from the [Reasoning From Scratch](https://github.com/rasbt/r
  - [Multiple-Choice Evaluation (MMLU)](https://github.com/rasbt/reasoning-from-scratch/blob/main/chF/02_mmlu)
  - [LLM Leaderboard Evaluation](https://github.com/rasbt/reasoning-from-scratch/blob/main/chF/03_leaderboards)
  - [LLM-as-a-Judge Evaluation](https://github.com/rasbt/reasoning-from-scratch/blob/main/chF/04_llm-judge)
+- **Inference Scaling**
+  - [Self-Consistency](https://github.com/rasbt/reasoning-from-scratch/blob/main/ch04/01_main-chapter-code/ch04_main.ipynb)
+  - [Self-Refinement](https://github.com/rasbt/reasoning-from-scratch/blob/main/ch05/01_main-chapter-code/ch05_main.ipynb)
+
+- **Reinforcement Learning** (RL)
+  - [RLVR with GRPO From Scratch](https://github.com/rasbt/reasoning-from-scratch/blob/main/ch06/01_main-chapter-code/ch06_main.ipynb)
+

 <br>
 &nbsp;
--- a/2
+++ b/2