Default Branch

8447d70b18 · Some gemma 3 improvements (#1000) · Updated 2026-04-06 02:05:05 +00:00

Branches

fd11713ed9 · exclude torchvission from nightly · Updated 2026-04-04 17:17:50 +00:00

2
2

9f7dbb2493 · Update docker file · Updated 2025-10-06 23:31:59 +00:00    books

76
1

b1f852c1ba · Update requirements.txt · Updated 2025-09-27 02:57:22 +00:00    books

84
2

862df48e38 · use apply_chat_template · Updated 2025-09-16 13:12:01 +00:00    books

91
9

8fd29ed079 · Gemma 3 270M from scratch · Updated 2025-08-17 00:49:38 +00:00    books

804
679

06aa6d470a · Fix eos token usage in Qwen3 tokenizer · Updated 2025-08-05 18:42:18 +00:00    books

128
1

4aa398c79d · Comment typo: head_dim -> head_dim // 2 · Updated 2025-07-23 13:16:30 +00:00    books

131
1

1552023bd4 · Fix issue 724: unused args (#726) · Updated 2025-07-10 17:58:32 +00:00    books

143
9

713a6e24c9 · add tests · Updated 2025-06-22 22:48:23 +00:00    books

162
2

4715dc3be5 · remove redundant context_length in GQA · Updated 2025-03-31 21:49:10 +00:00    books

206
3

ca0eee4cf9 · simplify and use pythorch 3.12 · Updated 2025-02-20 03:01:15 +00:00    books

238
7