rasbt
8fd29ed079
Gemma 3 270M from scratch
2025-08-16 19:49:38 -05:00
Sebastian Raschka
07c3122b5c
Qwen3 and Llama3 equivalency tests with HF transformers ( #768 )
...
* Qwen3 and Llama3 equivalency tests with HF transformers
* update
2025-08-14 18:36:07 -05:00
Sebastian Raschka
81be5fab0b
Add Qwen3 1.7, 4B, 8B, and 32B support to from-scratch nb ( #709 )
2025-06-25 08:53:09 -05:00
Sebastian Raschka
3d4bce6d57
Qwen3 From Scratch ( #678 )
...
* Qwen3 From Scratch
* rev other file
* upd
* upd
* upd
* url fixes
2025-06-19 18:44:38 -05:00
Sebastian Raschka
a3c4c33347
Reduce Llama 3 RoPE memory requirements ( #658 )
...
* Llama3 from scratch improvements
* Fix Llama 3 expensive RoPE memory issue
* updates
* update package
* benchmark
* remove unused rescale_theta
2025-06-12 11:08:02 -05:00
Daniel Kleine
d4d420361c
updated .gitignore ( #581 )
2025-03-26 13:21:14 -05:00
Sebastian Raschka
6aec412421
Fix BPE bonus materials ( #561 )
...
* Fix BPE bonus materials
* fix bpe implementation
* update
* Add 'Hello, world. Is this-- a test?' test case
* update link to test file
* update path handling
* update path handling
* fix pytest paths
2025-03-08 17:21:30 -06:00
rasbt
47030fd8c1
update badges
2025-02-17 12:00:46 -06:00
Matthew Feickert
bd0484c1be
feat: Add pixi environment ( #534 )
...
* feat: Add pixi environment
* Add pixi manifest pixi.toml for Linux x86, macOS arm64, Windows 64.
* ci: Update CI workflow and unify to one
* Enable workflow dispatch.
* Add concurrency limits.
* Use pixi for CI workflow.
* Unify to a single workflow for all OS tested
* feat: Add pixi lock file
* Ensure tensorflow-cpu installed on Windows
* fix package check
* fix package check
* simplification plus uv and pip runners
* some fixes to pixi and pip
* create pixi.lock
* fix pixi.lock issue
* another attempt trying to fix get_packages
* another attempt trying to fix get_packages
* clean up python_environment_check.py
* updated runner and docs
* use bash
* proper env activation
* proper env activation
---------
Co-authored-by: rasbt <mail@sebastianraschka.com>
2025-02-17 11:33:53 -06:00
Sebastian Raschka
aa60bb3cd5
Native uv docs ( #530 )
...
* Replace pip by more modern uv
* uv tests
* Native uv docs
* resolve merge conflicts
* resolve merge conflicts
2025-02-15 20:35:23 -06:00
Sebastian Raschka
fd24a3679a
Alternative weight loading via .safetensors ( #507 )
2025-01-29 08:15:29 -06:00
Daniel Kleine
3f9facbc55
BPE: fixed typo ( #492 )
...
* fixed typo
* use rel path if exists
* mod gitignore and use existing vocab files
---------
Co-authored-by: rasbt <mail@sebastianraschka.com>
2025-01-20 20:49:53 -06:00
Daniel Kleine
7e6f8ce020
updated RoPE statement ( #423 )
...
* updated RoPE statement
* updated .gitignore
* Update ch05/07_gpt_to_llama/converting-gpt-to-llama2.ipynb
---------
Co-authored-by: Sebastian Raschka <mail@sebastianraschka.com>
2024-10-30 08:00:08 -05:00
Daniel Kleine
8b60460319
Updated Llama 2 to 3 paths ( #413 )
...
* llama 2 and 3 path fixes
* updated llama 3, 3.1 and 3.2 paths
* updated .gitignore
* Typo fix
---------
Co-authored-by: Sebastian Raschka <mail@sebastianraschka.com>
2024-10-24 07:40:08 -05:00
Sebastian Raschka
58c3bb3d9d
Llama 3 ( #384 )
...
* Implement Llama 3.2
* Add Llama 3.2 files
* exclude IMDB link because stanford website seems down
2024-10-05 07:52:15 -05:00
Sebastian Raschka
feb0647c79
Improve rope settings for llama3 ( #380 )
2024-10-03 08:29:54 -05:00
rasbt
835832a0f9
move access token to config.json
2024-09-23 08:56:16 -05:00
Sebastian Raschka
c38b003aa9
GPT to Llama ( #368 )
...
* GPT to Llama
* fix urls
2024-09-23 07:34:06 -05:00
Sebastian Raschka
7a9a17608d
Add user interface to ch06 and ch07 ( #366 )
...
* Add user interface to ch06 and ch07
* pep8
* fix url
2024-09-21 20:33:00 -05:00
Daniel Kleine
92ad9570e4
Chainlit bonus material fixes ( #361 )
...
* fix cmd
* moved idx to device
* improved code with clone().detach()
* fixed path
* fix: added extra line for pep8
* updated .gitignore
* Update ch05/06_user_interface/app_orig.py
* Update ch05/06_user_interface/app_own.py
* Apply suggestions from code review
---------
Co-authored-by: Sebastian Raschka <mail@sebastianraschka.com>
2024-09-18 08:08:50 -07:00
Sebastian Raschka
1bc560fb13
Add ChatGPT-like user interface ( #360 )
...
* Add ChatGPT-like user interface
* fixes
2024-09-17 08:26:44 -05:00
Eric Thomson
b3d550bfd5
Adds .vscode folder to .gitignore ( #314 )
...
* added .vscode folder to .gitignore
* Update .gitignore
---------
Co-authored-by: Sebastian Raschka <mail@sebastianraschka.com>
2024-08-12 07:49:11 -05:00
Daniel Kleine
dcdf04e3bd
minor DPO fixes ( #298 )
...
* fixed issues, updated .gitignore
* added closing paren
* fixed CEL spelling
* fixed more minor issues
* Update ch07/01_main-chapter-code/ch07.ipynb
* Update ch07/04_preference-tuning-with-dpo/dpo-from-scratch.ipynb
* Update ch07/04_preference-tuning-with-dpo/dpo-from-scratch.ipynb
* Update ch07/04_preference-tuning-with-dpo/dpo-from-scratch.ipynb
---------
Co-authored-by: Sebastian Raschka <mail@sebastianraschka.com>
2024-08-05 08:40:46 -05:00
Daniel Kleine
82c06420ac
updated .gitignore for ch07/01 artefacts ( #242 )
...
* fixed markdown
* removed redundant imports
* updated .gitignore for ch07/01 artefacts
2024-06-22 18:12:01 -05:00
Sebastian Raschka
eb85c43bc3
Add CI tests for chapter 7 ( #239 )
2024-06-22 08:57:18 -05:00
Sebastian Raschka
0114dee9f6
Exercise solutions ( #237 )
2024-06-22 08:30:45 -05:00
Sebastian Raschka
87deec0f5f
Add standalone finetuning and evaluation scripts for chapter 7 ( #234 )
...
* add finetuning and eval scripts
* update link
* update links
* fix link
2024-06-21 05:23:24 -05:00
Daniel Kleine
79210eb393
fixes for code ( #206 )
...
* updated .gitignore
* removed unused GELU import
* fixed model_configs, fixed all tensors on same device
* removed unused tiktoken
* update
* update hparam search
* remove redundant tokenizer argument
---------
Co-authored-by: rasbt <mail@sebastianraschka.com>
2024-06-11 20:59:48 -05:00
Daniel Kleine
9a81230968
ch07 fixes ( #204 )
...
* updated .gitignore for ch07
* fixed extract_response()
2024-06-10 17:31:13 -05:00
rasbt
f86a929665
revert unnecessary changes
2024-05-27 07:37:06 -05:00
rasbt
b2ad4fb0d6
add comment
2024-05-27 07:18:07 -05:00
Daniel Kleine
7b397fcd46
updated .gitignore
2024-05-19 16:07:20 +00:00
Daniel Kleine
c78ceafe51
updated .gitignore with appendix artifacts
2024-05-15 06:30:24 +00:00
Daniel Kleine
d2fe7287a2
updated .gitignore with 06/02 and /03 artifacts
2024-05-14 12:16:24 +00:00
rasbt
37c33d6fee
add chapter 6 unit test
2024-05-12 18:51:28 -05:00
rasbt
98c0723b3d
update dataset naming
2024-05-12 09:22:42 -05:00
rasbt
0448162fdc
show downloads
2024-05-06 07:40:09 -05:00
rasbt
15d6f29cf8
ch06 csv
2024-05-06 07:16:30 -05:00
rasbt
c6528ede9e
ch06 dataset
2024-05-06 06:55:56 -05:00
Sebastian Raschka
0f03c20483
Data loader intuition with numbers ( #132 )
...
* data loader intuition with numbers
* fix link
* fix tests
2024-04-27 07:56:41 -05:00
Sebastian Raschka
bae4b0fb08
Make datasets and loaders compatible with multiprocessing ( #118 )
2024-04-13 13:57:56 -05:00
Daniel Kleine
7d0b9b78b0
Updated devcontainer, .gitignore and README for gutenberg project ( #107 )
...
* added ch05/03_bonus_pretraining_on_gutenberg model checkpoints and preprocessing output folders to .gitignore
* removed prettier extension, added github alerts markdown extension
* specified download instructions and fixed code markdown
* Update ch05/03_bonus_pretraining_on_gutenberg/README.md
* Update ch05/03_bonus_pretraining_on_gutenberg/README.md
---------
Co-authored-by: Sebastian Raschka <mail@sebastianraschka.com>
2024-04-05 06:53:01 -05:00
rasbt
ac2bdb02bd
make figures for appendix d
2024-03-31 21:22:49 -05:00
rasbt
35c6e12730
ignore ch05 tmp files
2024-03-23 06:52:08 -05:00
Sebastian Raschka
4582995ced
Add alternative weight loading strategy as backup ( #82 )
2024-03-20 08:43:18 -05:00
Sebastian Raschka
48253c4f88
Ch05 ( #75 )
...
* add chapter 5 main code
2024-03-17 21:07:19 -05:00
rasbt
55aa84ac5c
remove OS temp files
2023-12-09 17:17:47 -06:00
rasbt
d66b23588d
first sync
2023-07-23 13:18:13 -05:00