Sebastian Raschka
a4094470c7
Write-up on how to get the most out of this book ( #909 )
2025-11-12 20:20:48 -06:00
rasbt
35354fac80
Use consistent title case
2025-11-06 15:22:24 -06:00
Sebastian Raschka
c6b8332a59
Gated DeltaNet write-up ( #901 )
* Gated DeltaNet write-up
* Add copyright and source information to script
Added copyright notice and source information.
* Remove unused import of Path in plot_memory_estimates
* Fix url
2025-11-02 21:03:42 -06:00
Sebastian Raschka
218221ab62
Mixture-of-Experts intro ( #888 )
2025-10-19 22:17:59 -05:00
Sebastian Raschka
bf039ff3dc
Add alternative attention structure ( #880 )
2025-10-13 14:31:13 -05:00
Sebastian Raschka
6eb6adfa33
sliding window attention ( #879 )
2025-10-12 22:13:20 -05:00
Sebastian Raschka
21f0617ea3
Add other appendices for completeness ( #878 )
* Add other appendices for completeness
* update
* update
* Update
2025-10-12 19:04:53 -05:00
Sebastian Raschka
9b9586688d
Multi-Head Latent Attention ( #876 )
* Multi-Head Latent Attention
* update
2025-10-11 20:08:30 -05:00
Sebastian Raschka
c814814d72
Grouped-Query Attention memory ( #874 )
* GQA memory
* remove redundant code
* update links
* update
2025-10-11 08:44:19 -05:00
Sebastian Raschka
fecfdd16ff
Add simpler BPE, and make previous BPE better ( #870 )
* Add simpler BPE, and make previous BPE better
* update
* Update README.md
2025-10-08 22:22:34 -05:00
Sebastian Raschka
1164cb3e8f
Qwen3 and evaluation bonus materials ( #869 )
2025-10-08 18:22:19 -05:00
Sebastian Raschka
6d175a22df
Fix IMDb spelling ( #811 )
* Add SSL instructions
* Fix IMDb spelling
2025-09-06 12:04:47 -05:00
Sebastian Raschka
a51ff65488
reasoning-from-scratch ( #793 )
2025-08-28 18:36:41 -05:00
Sebastian Raschka
a6b883c9f9
Gemma 3 270M From Scratch ( #771 )
* Gemma 3 270M From Scratch
* fix path
* update readme
2025-08-17 08:23:05 -05:00
Sebastian Raschka
f92b40e4ab
Qwen3 Coder Flash & MoE from Scratch ( #760 )
* Qwen3 Coder Flash & MoE from Scratch
* update
* refinements
* updates
* update
* update
* update
2025-08-01 19:13:17 -05:00
Sebastian Raschka
7e9ce325de
Add link to official video course ( #741 )
2025-07-13 10:35:12 -05:00
Sebastian Raschka
3c9dc4807b
Simplify KV cache usage ( #728 )
* Simplify KV cache usage
* Swap mark text with ghostwriter
2025-07-08 12:56:55 -05:00
Sebastian Raschka
c8c6e7814a
Update README.md
2025-07-06 17:58:33 -05:00
Sebastian Raschka
6103acbedb
Add prerequisite section ( #723 )
2025-07-06 12:45:42 -05:00
Sebastian Raschka
47a750014d
Add link to free exercise PDF ( #706 )
2025-06-24 08:24:02 -05:00
Sebastian Raschka
e719bd86ad
Qwen3 From Scratch ( #678 )
* Qwen3 From Scratch
* rev other file
* upd
* upd
* upd
* url fixes
2025-06-19 18:44:38 -05:00
Sebastian Raschka
2af686d70b
Add KV cache ( #671 )
2025-06-15 09:58:08 -05:00
Sebastian Raschka
3f93d73d6d
Alt weight loading code via PyTorch ( #585 )
* Alt weight loading code via PyTorch
* commit additional files
2025-03-27 20:10:23 -05:00
Sebastian Raschka
f12b899d96
GitHub markdown updates ( #545 )
* GitHub markdown updates
* Apply suggestions from code review
* Apply suggestions from code review
2025-02-23 12:25:44 -06:00
Sebastian Raschka
67c226bf67
Badge url updates
2025-02-17 12:07:47 -06:00
rasbt
9ccecd13ae
update badges
2025-02-17 12:02:06 -06:00
rasbt
24f78865df
update badges
2025-02-17 12:00:46 -06:00
rasbt
2f67cbca0b
update readme badges
2025-02-17 11:49:41 -06:00
Sebastian Raschka
bacb7aa90c
Update README.md
2025-02-16 13:37:32 -06:00
Sebastian Raschka
908dd2f71e
PyTorch tips for better training performance ( #525 )
* PyTorch tips for better training performance
* formatting
* pep 8
2025-02-12 16:10:34 -06:00
Sebastian Raschka
a22d612be6
Bonus material: extending tokenizers ( #496 )
* Bonus material: extending tokenizers
* small wording update
2025-01-22 09:26:54 -06:00
Sebastian Raschka
0d4967eda6
Implementing the BPE Tokenizer from Scratch ( #487 )
2025-01-17 12:22:00 -06:00
Sebastian Raschka
27a6a7e64a
Add chapter names
2024-11-08 08:39:34 -06:00
Sebastian Raschka
b5f2aa3500
Update README.md
2024-10-29 20:20:48 -05:00
Sebastian Raschka
05b04f2a5a
Memory efficient weight loading ( #401 )
* memory efficient weight loading
* remove unused code
2024-10-14 10:30:25 -05:00
Sebastian Raschka
8a448a4410
Llama 3 ( #384 )
* Implement Llama 3.2
* Add Llama 3.2 files
* exclude IMDb link because the Stanford website seems to be down
2024-10-05 07:52:15 -05:00
Sebastian Raschka
0467c8289b
GPT to Llama ( #368 )
* GPT to Llama
* fix urls
2024-09-23 07:34:06 -05:00
Sebastian Raschka
76e9a9ec02
Add user interface to ch06 and ch07 ( #366 )
* Add user interface to ch06 and ch07
* pep8
* fix url
2024-09-21 20:33:00 -05:00
Sebastian Raschka
ea9b4e83a4
Add ChatGPT-like user interface ( #360 )
* Add ChatGPT-like user interface
* fixes
2024-09-17 08:26:44 -05:00
Sebastian Raschka
835ed29dbf
reflection-tuning dataset generation ( #349 )
2024-09-10 21:42:12 -05:00
Daniel Kleine
2ee3df622e
nbviewer links / typo ( #346 )
* fixed typo
* removed remaining nbviewer links
* Update mha-implementations.ipynb
---------
Co-authored-by: Sebastian Raschka <mail@sebastianraschka.com>
2024-09-07 07:27:28 +02:00
Sebastian Raschka
91db4e3a0f
Revert nbviewer links
2024-09-05 08:09:33 +02:00
Sebastian Raschka
d391796ec2
use nbviewer links ( #339 )
2024-08-29 09:09:10 +02:00
Sebastian Raschka
26f94876f7
Update README.md
2024-08-24 07:22:18 -05:00
Sebastian Raschka
f1c3d451fe
Update README.md
2024-08-08 07:50:45 -05:00
Sebastian Raschka
81e9cea3d3
Update README.md
2024-08-08 07:47:31 -05:00
Sebastian Raschka
98d24a1607
Update README.md
2024-08-06 08:02:01 -05:00
Sebastian Raschka
50332cf75b
Update README.md
2024-08-05 17:47:06 -05:00
Sebastian Raschka
16e83434b5
Update README.md
2024-08-04 16:06:38 -05:00
Sebastian Raschka
52435804eb
Direct Preference Optimization from scratch ( #294 )
2024-08-04 08:57:36 -05:00