rasbt
|
4612d20fa8
|
Use argparse utils to show default args on command line
|
2026-03-01 20:15:21 -06:00 |
|
Sebastian Raschka
|
be5e2a3331
|
Readability and code quality improvements (#959)
* Consistent dataset naming
* Consistent section headers
|
2026-02-17 18:44:56 -06:00 |
|
Sebastian Raschka
|
28a8408d4d
|
Update README wrt multi-query attention
Clarified the implications of using multi-query attention on modeling performance and memory usage.
|
2025-11-17 16:39:32 -06:00 |
|
Sebastian Raschka
|
9b9586688d
|
Multi-Head Latent Attention (#876)
* Multi-Head Latent Attention
* update
|
2025-10-11 20:08:30 -05:00 |
|
Sebastian Raschka
|
bf27ad1485
|
Use GB instead of GiB consistently (#875)
|
2025-10-11 09:11:33 -05:00 |
|
Sebastian Raschka
|
c814814d72
|
Grouped-Query Attention memory (#874)
* GQA memory
* remove redundant code
* update links
* update
|
2025-10-11 08:44:19 -05:00 |
|