diff --git a/README.md b/README.md
index e6b1a26..2d7bae6 100644
--- a/README.md
+++ b/README.md
@@ -212,6 +212,13 @@ More bonus material from the [Reasoning From Scratch](https://github.com/rasbt/r
   - [Multiple-Choice Evaluation (MMLU)](https://github.com/rasbt/reasoning-from-scratch/blob/main/chF/02_mmlu)
   - [LLM Leaderboard Evaluation](https://github.com/rasbt/reasoning-from-scratch/blob/main/chF/03_leaderboards)
   - [LLM-as-a-Judge Evaluation](https://github.com/rasbt/reasoning-from-scratch/blob/main/chF/04_llm-judge)
+- **Inference Scaling**
+  - [Self-Consistency](https://github.com/rasbt/reasoning-from-scratch/blob/main/ch04/01_main-chapter-code/ch04_main.ipynb)
+  - [Self-Refinement](https://github.com/rasbt/reasoning-from-scratch/blob/main/ch05/01_main-chapter-code/ch05_main.ipynb)
+
+- **Reinforcement Learning** (RL)
+  - [RLVR with GRPO From Scratch](https://github.com/rasbt/reasoning-from-scratch/blob/main/ch06/01_main-chapter-code/ch06_main.ipynb)
+
diff --git a/reasoning-from-scratch b/reasoning-from-scratch
index a8cfd55..edcae1d 160000
--- a/reasoning-from-scratch
+++ b/reasoning-from-scratch
@@ -1 +1 @@
-Subproject commit a8cfd55fca9ff37177d675442de77601c3281728
+Subproject commit edcae1d894192a2d7c036bfea43922cb140dea10