54 Commits

Author SHA1 Message Date
Sebastian Raschka
c814814d72 Grouped-Query Attention memory (#874)
* GQA memory

* remove redundant code

* update links

* update
2025-10-11 08:44:19 -05:00
Sebastian Raschka
fecfdd16ff Add simpler BPE, and make previous BPE better (#870)
* Add simpler BPE, and make previous BPE better

* update

* Update README.md
2025-10-08 22:22:34 -05:00
Sebastian Raschka
e742d8af2c Improve MoE implementation (#841) 2025-09-22 15:21:06 -05:00
rasbt
9ea2c57c5f simplify 2025-09-01 22:15:47 -05:00
rasbt
643f800a94 remove local config files 2025-09-01 20:52:40 -05:00
Sebastian Raschka
9eee9296d9 Interactive qwen3 chat interface (#801)
* Interactive qwen3 chat interface

* update

* update

* update url
2025-09-01 20:50:25 -05:00
Sebastian Raschka
a6b883c9f9 Gemma 3 270M From Scratch (#771)
* Gemma 3 270M From Scratch

* fix path

* update readme
2025-08-17 08:23:05 -05:00
Sebastian Raschka
b14325e56d Qwen3 and Llama3 equivalency tests with HF transformers (#768)
* Qwen3 and Llama3 equivalency tests with HF transformers

* update
2025-08-14 18:36:07 -05:00
Sebastian Raschka
190c66b3b0 Add Qwen3 1.7, 4B, 8B, and 32B support to from-scratch nb (#709) 2025-06-25 08:53:09 -05:00
Sebastian Raschka
e719bd86ad Qwen3 From Scratch (#678)
* Qwen3 From Scratch

* rev other file

* upd

* upd

* upd

* url fixes
2025-06-19 18:44:38 -05:00
Sebastian Raschka
c4cde1c21b Reduce Llama 3 RoPE memory requirements (#658)
* Llama3 from scratch improvements

* Fix Llama 3 expensive RoPE memory issue

* updates

* update package

* benchmark

* remove unused rescale_theta
2025-06-12 11:08:02 -05:00
Daniel Kleine
f01e163aad updated .gitignore (#581) 2025-03-26 13:21:14 -05:00
Sebastian Raschka
f63f04d8d5 Fix BPE bonus materials (#561)
* Fix BPE bonus materials

* fix bpe implementation

* update

* Add 'Hello, world. Is this-- a test?' test case

* update link to test file

* update path handling

* update path handling

* fix pytest paths
2025-03-08 17:21:30 -06:00
rasbt
24f78865df update badges 2025-02-17 12:00:46 -06:00
Matthew Feickert
a8b8eb4731 feat: Add pixi environment (#534)
* feat: Add pixi environment

* Add pixi manifest pixi.toml for Linux x86, macOS arm64, Windows 64.

* ci: Update CI workflow and unify to one

* Enable workflow dispatch.
* Add concurrency limits.
* Use pixi for CI workflow.
* Unify to a single workflow for all OS tested

* feat: Add pixi lock file

* Ensure tensorflow-cpu installed on Windows

* fix package check

* fix package check

* simplification plus uv and pip runners

* some fixes to pixi and pip

* create pixi.lock

* fix pixi.lock issue

* another attempt trying to fix get_packages

* another attempt trying to fix get_packages

* clean up python_environment_check.py

* updated runner and docs

* use bash

* proper env activation

* proper env activation

---------

Co-authored-by: rasbt <mail@sebastianraschka.com>
2025-02-17 11:33:53 -06:00
Sebastian Raschka
3e3dc3c5dc Native uv docs (#530)
* Replace pip by more modern uv

* uv tests

* Native uv docs

* resolve merge conflicts

* resolve merge conflicts
2025-02-15 20:35:23 -06:00
Sebastian Raschka
25ea71e713 Alternative weight loading via .safetensors (#507) 2025-01-29 08:15:29 -06:00
Daniel Kleine
60acb94894 BPE: fixed typo (#492)
* fixed typo

* use rel path if exists

* mod gitignore and use existing vocab files

---------

Co-authored-by: rasbt <mail@sebastianraschka.com>
2025-01-20 20:49:53 -06:00
Daniel Kleine
81eed9afe2 updated RoPE statement (#423)
* updated RoPE statement

* updated .gitignore

* Update ch05/07_gpt_to_llama/converting-gpt-to-llama2.ipynb

---------

Co-authored-by: Sebastian Raschka <mail@sebastianraschka.com>
2024-10-30 08:00:08 -05:00
Daniel Kleine
d38083c401 Updated Llama 2 to 3 paths (#413)
* llama 2 and 3 path fixes

* updated llama 3, 3.1 and 3.2 paths

* updated .gitignore

* Typo fix

---------

Co-authored-by: Sebastian Raschka <mail@sebastianraschka.com>
2024-10-24 07:40:08 -05:00
Sebastian Raschka
8a448a4410 Llama 3 (#384)
* Implement Llama 3.2

* Add Llama 3.2 files

* exclude IMDB link because stanford website seems down
2024-10-05 07:52:15 -05:00
Sebastian Raschka
b993c2b25b Improve rope settings for llama3 (#380) 2024-10-03 08:29:54 -05:00
rasbt
6bc3de165c move access token to config.json 2024-09-23 08:56:16 -05:00
Sebastian Raschka
0467c8289b GPT to Llama (#368)
* GPT to Llama

* fix urls
2024-09-23 07:34:06 -05:00
Sebastian Raschka
76e9a9ec02 Add user interface to ch06 and ch07 (#366)
* Add user interface to ch06 and ch07

* pep8

* fix url
2024-09-21 20:33:00 -05:00
Daniel Kleine
eefe4bf12b Chainlit bonus material fixes (#361)
* fix cmd

* moved idx to device

* improved code with clone().detach()

* fixed path

* fix: added extra line for pep8

* updated .gitignore

* Update ch05/06_user_interface/app_orig.py

* Update ch05/06_user_interface/app_own.py

* Apply suggestions from code review

---------

Co-authored-by: Sebastian Raschka <mail@sebastianraschka.com>
2024-09-18 08:08:50 -07:00
Sebastian Raschka
ea9b4e83a4 Add ChatGPT-like user interface (#360)
* Add ChatGPT-like user interface

* fixes
2024-09-17 08:26:44 -05:00
Eric Thomson
da5236ee72 Adds .vscode folder to .gitignore (#314)
* added .vscode folder to .gitignore

* Update .gitignore

---------

Co-authored-by: Sebastian Raschka <mail@sebastianraschka.com>
2024-08-12 07:49:11 -05:00
Daniel Kleine
8318d1f002 minor DPO fixes (#298)
* fixed issues, updated .gitignore

* added closing paren

* fixed CEL spelling

* fixed more minor issues

* Update ch07/01_main-chapter-code/ch07.ipynb

* Update ch07/04_preference-tuning-with-dpo/dpo-from-scratch.ipynb

* Update ch07/04_preference-tuning-with-dpo/dpo-from-scratch.ipynb

* Update ch07/04_preference-tuning-with-dpo/dpo-from-scratch.ipynb

---------

Co-authored-by: Sebastian Raschka <mail@sebastianraschka.com>
2024-08-05 08:40:46 -05:00
Daniel Kleine
3ac363d005 updated .gitignore for ch07/01 artefacts (#242)
* fixed markdown

* removed redundant imports

* updated .gitignore for ch07/01 artefacts
2024-06-22 18:12:01 -05:00
Sebastian Raschka
ec5baa1f33 Add CI tests for chapter 7 (#239) 2024-06-22 08:57:18 -05:00
Sebastian Raschka
b90c7ad2d6 Exercise solutions (#237) 2024-06-22 08:30:45 -05:00
Sebastian Raschka
6c0dc2362b Add standalone finetuning and evaluation scripts for chapter 7 (#234)
* add finetuning and eval scripts

* update link

* update links

* fix link
2024-06-21 05:23:24 -05:00
Daniel Kleine
dcbdc1d2e5 fixes for code (#206)
* updated .gitignore

* removed unused GELU import

* fixed model_configs, fixed all tensors on same device

* removed unused tiktoken

* update

* update hparam search

* remove redundant tokenizer argument

---------

Co-authored-by: rasbt <mail@sebastianraschka.com>
2024-06-11 20:59:48 -05:00
Daniel Kleine
da9f64215a ch07 fixes (#204)
* updated .gitignore for ch07

* fixed extract_response()
2024-06-10 17:31:13 -05:00
rasbt
42af52fef4 revert unnecessary changes 2024-05-27 07:37:06 -05:00
rasbt
dd7ba32b56 add comment 2024-05-27 07:18:07 -05:00
Daniel Kleine
e7914182c6 updated .gitignore 2024-05-19 16:07:20 +00:00
Daniel Kleine
fabdefe959 updated .gitignore with appendix artifacts 2024-05-15 06:30:24 +00:00
Daniel Kleine
88ee7793d4 updated .gitignore with 06/02 and /03 artifacts 2024-05-14 12:16:24 +00:00
rasbt
21172a6a7e add chapter 6 unit test 2024-05-12 18:51:28 -05:00
rasbt
2e47a6e61c update dataset naming 2024-05-12 09:22:42 -05:00
rasbt
16e276f8df show downloads 2024-05-06 07:40:09 -05:00
rasbt
258dcad5ee ch06 csv 2024-05-06 07:16:30 -05:00
rasbt
83d5cea795 ch06 dataset 2024-05-06 06:55:56 -05:00
Sebastian Raschka
fc3d70f72f Data loader intuition with numbers (#132)
* data loader intuition with numbers

* fix link

* fix tests
2024-04-27 07:56:41 -05:00
Sebastian Raschka
dd51d4ad83 Make datasets and loaders compatible with multiprocessing (#118) 2024-04-13 13:57:56 -05:00
Daniel Kleine
44c0494406 Updated devcontainer, .gitignore and README for gutenberg project (#107)
* added ch05/03_bonus_pretraining_on_gutenberg model checkpoints and preprocessing output folders to .gitignore

* removed prettier extension, added github alerts markdown extension

* specified download instructions and fixed code markdown

* Update ch05/03_bonus_pretraining_on_gutenberg/README.md

* Update ch05/03_bonus_pretraining_on_gutenberg/README.md

---------

Co-authored-by: Sebastian Raschka <mail@sebastianraschka.com>
2024-04-05 06:53:01 -05:00
rasbt
ac2bdb02bd make figures for appendix d 2024-03-31 21:22:49 -05:00
rasbt
35c6e12730 ignore ch05 tmp files 2024-03-23 06:52:08 -05:00