Add simpler BPE, and make previous BPE better (#870)

* Add simpler BPE, and make previous BPE better

* update

* Update README.md
Sebastian Raschka
2025-10-08 22:22:34 -05:00
committed by GitHub
parent 1164cb3e8f
commit fecfdd16ff
6 changed files with 1223 additions and 122 deletions

@@ -158,7 +158,7 @@ Several folders contain optional materials as a bonus for interested readers:
- [Installing Python Packages and Libraries Used In This Book](setup/02_installing-python-libraries)
- [Docker Environment Setup Guide](setup/03_optional-docker-environment)
- **Chapter 2: Working with text data**
-- [Byte Pair Encoding (BPE) Tokenizer From Scratch](ch02/05_bpe-from-scratch/bpe-from-scratch.ipynb)
+- [Byte Pair Encoding (BPE) Tokenizer From Scratch](ch02/05_bpe-from-scratch/bpe-from-scratch-simple.ipynb)
- [Comparing Various Byte Pair Encoding (BPE) Implementations](ch02/02_bonus_bytepair-encoder)
- [Understanding the Difference Between Embedding Layers and Linear Layers](ch02/03_bonus_embedding-vs-matmul)
- [Dataloader Intuition with Simple Numbers](ch02/04_bonus_dataloader-intuition)