11 Commits

Author SHA1 Message Date
casinca
145322ded8 [Minor] Qwen3 typo & optim (#758)
* typo

* remove weight dict after loading
2025-07-28 17:29:44 -05:00
Sebastian Raschka
13f049f6a4 Minor typo: pply -> Apply (#749) 2025-07-22 08:19:25 -05:00
Sebastian Raschka
a354555049 Batched KV Cache Inference for Qwen3 (#735) 2025-07-10 08:09:35 -05:00
Sebastian Raschka
21c41721cc Add more sophisticated Qwen3 tokenizer (#729) 2025-07-09 13:16:26 -05:00
Sebastian Raschka
c4ec55edac Support different Qwen3 sizes in pkg (#714) 2025-06-28 08:00:23 -05:00
Sebastian Raschka
190c66b3b0 Add Qwen3 1.7, 4B, 8B, and 32B support to from-scratch nb (#709) 2025-06-25 08:53:09 -05:00
Sebastian Raschka
81eda38d3b Improve KV cache code for torch.compile (#705)
* Improve KV cache code for torch.compile

* cleanup

* cleanup
2025-06-23 18:08:49 -05:00
Sebastian Raschka
37b26c2e04 CPU compile performance for Qwen3 models (#704)
* Ch06 classifier function asserts

* Qwen3 cpu compilation perf
2025-06-23 11:06:10 -05:00
Sebastian Raschka
0a2e8c39c4 Qwen3 KV cache (#688) 2025-06-21 17:34:39 -05:00
Sebastian Raschka
c008f95072 Fix formatting in Qwen3 nb (#680)
* Fix formatting in Qwen3 nb

* upd
2025-06-20 07:28:27 -05:00
Sebastian Raschka
e719bd86ad Qwen3 From Scratch (#678)
* Qwen3 From Scratch

* rev other file

* upd

* upd

* upd

* url fixes
2025-06-19 18:44:38 -05:00