Mirror of https://github.com/rasbt/LLMs-from-scratch.git, synced 2026-04-10 12:33:42 +00:00.
LLMs-from-scratch / pkg / llms_from_scratch / kv_cache
At commit 052c2dea4f964529bb265cec928fe54d86d38912

Latest commit: 9c4be478f8 by casinca, 2026-01-14 09:07:04 -06:00
Optional weight tying for Qwen3 and Llama3.2 pretraining (#949)
* optional weight tying for Qwen3 and Llama3.2
* typo
File        | Last commit                                                     | Date
__init__.py | Llama 3 KV Cache (#685)                                         | 2025-06-21 10:55:20 -05:00
generate.py | Add defensive context trimming for multiturn (#815)             | 2025-09-09 20:19:00 -05:00
gpt2.py     | remove redundant next_cache (#817)                              | 2025-09-11 15:16:08 -05:00
llama3.py   | Optional weight tying for Qwen3 and Llama3.2 pretraining (#949) | 2026-01-14 09:07:04 -06:00
qwen3.py    | Improve MoE implementation (#841)                               | 2025-09-22 15:21:06 -05:00
utils.py    | Improve KV cache code for torch.compile (#705)                  | 2025-06-23 18:08:49 -05:00
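The files above hold the package's KV-cache variants of the book's GPT-2, Llama 3, and Qwen3 models; for the actual implementations, see the source files themselves. As a rough, hypothetical illustration of the general technique (not this repository's code), a KV cache stores the keys and values of already-processed tokens so each decoding step only computes attention for the newest token:

```python
import numpy as np


class KVCache:
    """Minimal single-head key/value cache for autoregressive decoding.

    Illustrative sketch only: real implementations (as in the files above)
    are per-layer, batched, multi-head, and framework-specific.
    """

    def __init__(self):
        self.keys = None    # shape: (num_cached_tokens, head_dim)
        self.values = None  # shape: (num_cached_tokens, head_dim)

    def update(self, new_k, new_v):
        # Append this step's key/value rows to the cached history,
        # then return the full history for attention.
        if self.keys is None:
            self.keys, self.values = new_k, new_v
        else:
            self.keys = np.concatenate([self.keys, new_k], axis=0)
            self.values = np.concatenate([self.values, new_v], axis=0)
        return self.keys, self.values


def attend(query, keys, values):
    # Scaled dot-product attention for one query token over all cached
    # keys/values; softmax is computed in a numerically stable way.
    scores = query @ keys.T / np.sqrt(keys.shape[-1])
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()
    return weights @ values


cache = KVCache()
rng = np.random.default_rng(0)
for step in range(3):  # decode three tokens, one per step
    k = rng.normal(size=(1, 4))  # key for the newest token only
    v = rng.normal(size=(1, 4))  # value for the newest token only
    q = rng.normal(size=(4,))    # query for the newest token
    keys, values = cache.update(k, v)
    out = attend(q, keys, values)
```

The point of the cache is on the `update` line: past keys and values are reused rather than recomputed, so per-step attention cost grows with the cached length instead of recomputing the whole sequence each step.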