LLMs-from-scratch

mirror of https://github.com/rasbt/LLMs-from-scratch.git synced 2026-04-10 12:33:42 +00:00

Author	SHA1	Message	Date
casinca	145322ded8	[Minor] Qwen3 typo & optim (#758) * typo * remove weight dict after loading	2025-07-28 17:29:44 -05:00
Sebastian Raschka	13f049f6a4	Minor typo: pply -> Apply (#749)	2025-07-22 08:19:25 -05:00
Sebastian Raschka	a354555049	Batched KV Cache Inference for Qwen3 (#735)	2025-07-10 08:09:35 -05:00
Sebastian Raschka	21c41721cc	Add more sophisticated Qwen3 tokenizer (#729)	2025-07-09 13:16:26 -05:00
Sebastian Raschka	c4ec55edac	Support different Qwen3 sizes in pkg (#714)	2025-06-28 08:00:23 -05:00
Sebastian Raschka	190c66b3b0	Add Qwen3 1.7, 4B, 8B, and 32B support to from-scratch nb (#709)	2025-06-25 08:53:09 -05:00
Sebastian Raschka	81eda38d3b	Improve KV cache code for torch.compile (#705) * Improve KV cache code for torch.compile * cleanup * cleanup	2025-06-23 18:08:49 -05:00
Sebastian Raschka	37b26c2e04	CPU compile performance for Qwen3 models (#704) * Ch06 classifier function asserts * Qwen3 cpu compilation perf	2025-06-23 11:06:10 -05:00
Sebastian Raschka	0a2e8c39c4	Qwen3 KV cache (#688)	2025-06-21 17:34:39 -05:00
Sebastian Raschka	c008f95072	Fix formatting in Qwen3 nb (#680) * Fix formatting in Qwen3 nb * upd	2025-06-20 07:28:27 -05:00
Sebastian Raschka	e719bd86ad	Qwen3 From Scratch (#678) * Qwen3 From Scratch * rev other file * upd * upd * upd * url fixes	2025-06-19 18:44:38 -05:00