diff --git a/ch05/10_llm-training-speed/README.md b/ch05/10_llm-training-speed/README.md
index c76a453..c96aa22 100644
--- a/ch05/10_llm-training-speed/README.md
+++ b/ch05/10_llm-training-speed/README.md
@@ -174,6 +174,24 @@ After:
 - `Step tok/sec: 112046`
 - `Reserved memory: 6.1875 GB`
 
+
+
+---
+
+**Windows note**
+
+- Compilation can be tricky on Windows
+- `torch.compile()` uses Inductor, which JIT-compiles kernels and needs a working C/C++ toolchain
+- For CUDA, Inductor also depends on Triton, available via the community package `triton-windows`
+  - If you see `cl not found`, [install Visual Studio Build Tools with the "C++ workload"](https://learn.microsoft.com/en-us/cpp/build/vscpp-step-0-installation?view=msvc-170) and run Python from the "x64 Native Tools" prompt
+  - If you see `triton not found` with CUDA, install `triton-windows` (for example, `uv pip install "triton-windows<3.4"`).
+- For CPU, a reader further recommended following this [PyTorch Inductor guide for Windows](https://docs.pytorch.org/tutorials/unstable/inductor_windows.html)
+  - Here, it is important to install the English language package when installing Visual Studio 2022 to avoid a UTF-8 error
+  - Also, please note that the code needs to be run via the "Visual Studio 2022 Developer Command Prompt" rather than a notebook
+- If this setup proves tricky, you can skip compilation; **compilation is optional, and all code examples work fine without it**
+
+---
+
 
  
 ### 9. Vocabulary padding
diff --git a/reasoning-from-scratch b/reasoning-from-scratch
index 2fc0f00..3961a71 160000
--- a/reasoning-from-scratch
+++ b/reasoning-from-scratch
@@ -1 +1 @@
-Subproject commit 2fc0f00ccb16ec4f764725ea53f7e32077a6ed05
+Subproject commit 3961a7101465ac12cc476bb24ffcb0c27c073982
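
The two failure modes named in the Windows note (`cl not found` and `triton not found`) could be checked up front before attempting compilation. The following is a minimal sketch and not part of the patch: the function name `missing_inductor_prereqs` and its parameters are hypothetical, and it assumes that `triton-windows` is importable under the module name `triton`. It uses only the standard library, so it runs even where PyTorch is not installed.

```python
import importlib.util
import shutil
import sys


def missing_inductor_prereqs(device="cpu", platform=None,
                             which=shutil.which,
                             find_spec=importlib.util.find_spec):
    """Return a list of likely problems; an empty list means nothing obvious is missing.

    Hypothetical helper: `which` and `find_spec` are injectable so the
    checks can be exercised without a real Windows machine.
    """
    platform = sys.platform if platform is None else platform
    problems = []

    # The note only concerns Windows; other platforms are not checked here.
    if platform != "win32":
        return problems

    # Inductor JIT-compiles kernels and needs the MSVC compiler on PATH.
    if which("cl") is None:
        problems.append(
            "cl not found: install Visual Studio Build Tools with the C++ "
            "workload and run Python from the x64 Native Tools prompt"
        )

    # For CUDA, Inductor also depends on Triton.
    # Assumption: triton-windows exposes the module name `triton`.
    if device == "cuda" and find_spec("triton") is None:
        problems.append("triton not found: install triton-windows")

    return problems


if __name__ == "__main__":
    found = missing_inductor_prereqs(device="cuda")
    if found:
        print("Skipping compilation (it is optional):")
        for problem in found:
            print(" -", problem)
    else:
        print("No obvious blockers found for compilation")
```

If the returned list is non-empty, one option is simply to leave out the `torch.compile()` call, since the note states that compilation is optional. An empty list does not guarantee that compilation will succeed; it only rules out the two errors mentioned above.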