Sebastian Raschka
|
6eb6adfa33
|
sliding window attention (#879)
|
2025-10-12 22:13:20 -05:00 |
|
Sebastian Raschka
|
21f0617ea3
|
Add other appendices for completeness (#878)
* Add other appendices for completeness
* update
* update
* Update
|
2025-10-12 19:04:53 -05:00 |
|
Sebastian Raschka
|
9b9586688d
|
Multi-Head Latent Attention (#876)
* Multi-Head Latent Attention
* update
|
2025-10-11 20:08:30 -05:00 |
|
Sebastian Raschka
|
c814814d72
|
Grouped-Query Attention memory (#874)
* GQA memory
* remove redundant code
* update links
* update
|
2025-10-11 08:44:19 -05:00 |
|
Sebastian Raschka
|
fecfdd16ff
|
Add simpler BPE, and make previous BPE better (#870)
* Add simpler BPE, and make previous BPE better
* update
* Update README.md
|
2025-10-08 22:22:34 -05:00 |
|
Sebastian Raschka
|
1164cb3e8f
|
Qwen3 and evaluation bonus materials (#869)
|
2025-10-08 18:22:19 -05:00 |
|
Sebastian Raschka
|
6d175a22df
|
Fix IMDb spelling (#811)
* Add SSL instructions
* Fix IMDb spelling
|
2025-09-06 12:04:47 -05:00 |
|
Sebastian Raschka
|
a51ff65488
|
reasoning-from-scratch (#793)
|
2025-08-28 18:36:41 -05:00 |
|
Sebastian Raschka
|
a6b883c9f9
|
Gemma 3 270M From Scratch (#771)
* Gemma 3 270M From Scratch
* fix path
* update readme
|
2025-08-17 08:23:05 -05:00 |
|
Sebastian Raschka
|
f92b40e4ab
|
Qwen3 Coder Flash & MoE from Scratch (#760)
* Qwen3 Coder Flash & MoE from Scratch
* update
* refinements
* updates
* update
* update
* update
|
2025-08-01 19:13:17 -05:00 |
|
Sebastian Raschka
|
7e9ce325de
|
Add link to official video course (#741)
|
2025-07-13 10:35:12 -05:00 |
|
Sebastian Raschka
|
3c9dc4807b
|
Simplify KV cache usage (#728)
* Simplify KV cache usage
* Swap mark text with ghostwriter
|
2025-07-08 12:56:55 -05:00 |
|
Sebastian Raschka
|
c8c6e7814a
|
Update README.md
|
2025-07-06 17:58:33 -05:00 |
|
Sebastian Raschka
|
6103acbedb
|
Add prerequisite section (#723)
|
2025-07-06 12:45:42 -05:00 |
|
Sebastian Raschka
|
47a750014d
|
Add link to free exercise PDF (#706)
|
2025-06-24 08:24:02 -05:00 |
|
Sebastian Raschka
|
e719bd86ad
|
Qwen3 From Scratch (#678)
* Qwen3 From Scratch
* rev other file
* upd
* upd
* upd
* url fixes
|
2025-06-19 18:44:38 -05:00 |
|
Sebastian Raschka
|
2af686d70b
|
Add KV cache (#671)
|
2025-06-15 09:58:08 -05:00 |
|
Sebastian Raschka
|
3f93d73d6d
|
Alt weight loading code via PyTorch (#585)
* Alt weight loading code via PyTorch
* commit additional files
|
2025-03-27 20:10:23 -05:00 |
|
Sebastian Raschka
|
f12b899d96
|
GitHub markdown updates (#545)
* GitHub markdown updates
* Apply suggestions from code review
* Apply suggestions from code review
|
2025-02-23 12:25:44 -06:00 |
|
Sebastian Raschka
|
67c226bf67
|
Badge url updates
|
2025-02-17 12:07:47 -06:00 |
|
rasbt
|
9ccecd13ae
|
update badges
|
2025-02-17 12:02:06 -06:00 |
|
rasbt
|
24f78865df
|
update badges
|
2025-02-17 12:00:46 -06:00 |
|
rasbt
|
2f67cbca0b
|
update readme badges
|
2025-02-17 11:49:41 -06:00 |
|
Sebastian Raschka
|
bacb7aa90c
|
Update README.md
|
2025-02-16 13:37:32 -06:00 |
|
Sebastian Raschka
|
908dd2f71e
|
PyTorch tips for better training performance (#525)
* PyTorch tips for better training performance
* formatting
* pep 8
|
2025-02-12 16:10:34 -06:00 |
|
Sebastian Raschka
|
a22d612be6
|
Bonus material: extending tokenizers (#496)
* Bonus material: extending tokenizers
* small wording update
|
2025-01-22 09:26:54 -06:00 |
|
Sebastian Raschka
|
0d4967eda6
|
Implementing the BPE Tokenizer from Scratch (#487)
|
2025-01-17 12:22:00 -06:00 |
|
Sebastian Raschka
|
27a6a7e64a
|
Add chapter names
|
2024-11-08 08:39:34 -06:00 |
|
Sebastian Raschka
|
b5f2aa3500
|
Update README.md
|
2024-10-29 20:20:48 -05:00 |
|
Sebastian Raschka
|
05b04f2a5a
|
Memory efficient weight loading (#401)
* memory efficient weight loading
* remove unused code
|
2024-10-14 10:30:25 -05:00 |
|
Sebastian Raschka
|
8a448a4410
|
Llama 3 (#384)
* Implement Llama 3.2
* Add Llama 3.2 files
* exclude IMDB link because stanford website seems down
|
2024-10-05 07:52:15 -05:00 |
|
Sebastian Raschka
|
0467c8289b
|
GPT to Llama (#368)
* GPT to Llama
* fix urls
|
2024-09-23 07:34:06 -05:00 |
|
Sebastian Raschka
|
76e9a9ec02
|
Add user interface to ch06 and ch07 (#366)
* Add user interface to ch06 and ch07
* pep8
* fix url
|
2024-09-21 20:33:00 -05:00 |
|
Sebastian Raschka
|
ea9b4e83a4
|
Add ChatGPT-like user interface (#360)
* Add ChatGPT-like user interface
* fixes
|
2024-09-17 08:26:44 -05:00 |
|
Sebastian Raschka
|
835ed29dbf
|
reflection-tuning dataset generation (#349)
|
2024-09-10 21:42:12 -05:00 |
|
Daniel Kleine
|
2ee3df622e
|
nbviewer links / typo (#346)
* fixed typo
* removed remaining nbviewer links
* Update mha-implementations.ipynb
---------
Co-authored-by: Sebastian Raschka <mail@sebastianraschka.com>
|
2024-09-07 07:27:28 +02:00 |
|
Sebastian Raschka
|
91db4e3a0f
|
Revert nbviewer links
|
2024-09-05 08:09:33 +02:00 |
|
Sebastian Raschka
|
d391796ec2
|
use nbviewer links (#339)
|
2024-08-29 09:09:10 +02:00 |
|
Sebastian Raschka
|
26f94876f7
|
Update README.md
|
2024-08-24 07:22:18 -05:00 |
|
Sebastian Raschka
|
f1c3d451fe
|
Update README.md
|
2024-08-08 07:50:45 -05:00 |
|
Sebastian Raschka
|
81e9cea3d3
|
Update README.md
|
2024-08-08 07:47:31 -05:00 |
|
Sebastian Raschka
|
98d24a1607
|
Update README.md
|
2024-08-06 08:02:01 -05:00 |
|
Sebastian Raschka
|
50332cf75b
|
Update README.md
|
2024-08-05 17:47:06 -05:00 |
|
Sebastian Raschka
|
16e83434b5
|
Update README.md
|
2024-08-04 16:06:38 -05:00 |
|
Sebastian Raschka
|
52435804eb
|
Direct Preference Optimization from scratch (#294)
|
2024-08-04 08:57:36 -05:00 |
|
Sebastian Raschka
|
ff7a6db212
|
Update README.md
|
2024-08-01 18:17:42 -05:00 |
|
Sebastian Raschka
|
9bf5d67d61
|
Update README.md
|
2024-07-28 09:28:11 -05:00 |
|
Sebastian Raschka
|
4f7f5bd443
|
Update README.md
|
2024-07-28 08:21:38 -05:00 |
|
Sebastian Raschka
|
deea13e5c2
|
Understanding PyTorch Buffers (#288)
|
2024-07-26 08:45:36 -05:00 |
|
Sebastian Raschka
|
bbe09e9799
|
Update README.md
|
2024-07-21 10:42:28 -05:00 |
|