books/LLMs-from-scratch
mirror of https://github.com/rasbt/LLMs-from-scratch.git synced 2026-04-10 12:33:42 +00:00
LLMs-from-scratch/pkg/llms_from_scratch/kv_cache
Latest commit: 9c4be478f8 by casinca — Optional weight tying for Qwen3 and Llama3.2 pretraining (#949), 2026-01-14 09:07:04 -06:00
Commit message:
* optional weight tying for Qwen3 and Llama3.2
* typo
File          Last commit                                                       Date
__init__.py   Llama 3 KV Cache (#685)                                           2025-06-21 10:55:20 -05:00
generate.py   Add defensive context trimming for multiturn (#815)               2025-09-09 20:19:00 -05:00
gpt2.py       remove redundant next_cache (#817)                                2025-09-11 15:16:08 -05:00
llama3.py     Optional weight tying for Qwen3 and Llama3.2 pretraining (#949)   2026-01-14 09:07:04 -06:00
qwen3.py      Improve MoE implementation (#841)                                  2025-09-22 15:21:06 -05:00
utils.py      Improve KV cache code for torch.compile (#705)                     2025-06-23 18:08:49 -05:00