books/LLMs-from-scratch
mirror of https://github.com/rasbt/LLMs-from-scratch.git synced 2026-04-10 12:33:42 +00:00
LLMs-from-scratch/pkg/llms_from_scratch/kv_cache
Latest commit: 9c4be478f8 by casinca — Optional weight tying for Qwen3 and Llama3.2 pretraining (#949), 2026-01-14 09:07:04 -06:00
Commit message:
* optional weight tying for Qwen3 and Llama3.2
* typo
File          Last commit                                                       Date
__init__.py   Llama 3 KV Cache (#685)                                           2025-06-21 10:55:20 -05:00
generate.py   Add defensive context trimming for multiturn (#815)               2025-09-09 20:19:00 -05:00
gpt2.py       remove redundant next_cache (#817)                                2025-09-11 15:16:08 -05:00
llama3.py     Optional weight tying for Qwen3 and Llama3.2 pretraining (#949)   2026-01-14 09:07:04 -06:00
qwen3.py      Improve MoE implementation (#841)                                  2025-09-22 15:21:06 -05:00
utils.py      Improve KV cache code for torch.compile (#705)                     2025-06-23 18:08:49 -05:00