mirror of https://github.com/rasbt/LLMs-from-scratch.git
synced 2026-04-10 12:33:42 +00:00
link GRPO notebook (#950)
committed by GitHub
parent 9c4be478f8
commit 47cfc61800
@@ -212,6 +212,13 @@ More bonus material from the [Reasoning From Scratch](https://github.com/rasbt/r
- [Multiple-Choice Evaluation (MMLU)](https://github.com/rasbt/reasoning-from-scratch/blob/main/chF/02_mmlu)
- [LLM Leaderboard Evaluation](https://github.com/rasbt/reasoning-from-scratch/blob/main/chF/03_leaderboards)
- [LLM-as-a-Judge Evaluation](https://github.com/rasbt/reasoning-from-scratch/blob/main/chF/04_llm-judge)
- **Inference Scaling**
- [Self-Consistency](https://github.com/rasbt/reasoning-from-scratch/blob/main/ch04/01_main-chapter-code/ch04_main.ipynb)
- [Self-Refinement](https://github.com/rasbt/reasoning-from-scratch/blob/main/ch05/01_main-chapter-code/ch05_main.ipynb)

- **Reinforcement Learning** (RL)
- [RLVR with GRPO From Scratch](https://github.com/rasbt/reasoning-from-scratch/blob/main/ch06/01_main-chapter-code/ch06_main.ipynb)

<br>
Submodule reasoning-from-scratch updated: a8cfd55fca...edcae1d894