Sebastian Raschka | 8447d70b18 | Some gemma 3 improvements (#1000): some gemma 3 improvements; update url | 2026-04-05 21:05:05 -05:00 |
Sebastian Raschka | be5e2a3331 | Readability and code quality improvements (#959): Consistent dataset naming; consistent section headers | 2026-02-17 18:44:56 -06:00 |
Sebastian Raschka | bc6f335526 | Olmo 3 from scratch (#914): Olmo 3 from scratch; update; update; update | 2025-11-22 22:42:18 -06:00 |
Sebastian Raschka | b6cd0a312f | More efficient angles computation in RoPE (#830) | 2025-09-16 03:23:33 +00:00 |
Sebastian Raschka | 8add26cbe9 | Improve weight tying handling (#826): Improve weight tying handling; fix | 2025-09-14 15:46:48 -05:00 |
casinca | 670f7a4dd0 | added (missing) Gemma3 bullet point in parent folder's readme.md (#788); typo in nbs | 2025-08-22 15:03:47 -05:00 |
Sebastian Raschka | 4a84cfccf9 | Minor cosmetic fixes in Gemma 3 nbs (#780) | 2025-08-19 21:08:29 -05:00 |
Sebastian Raschka | f571b5e493 | Add Gemma3 KV cache variant (#776): Add Gemma3 KV cache variant; update | 2025-08-19 12:37:49 -05:00 |