From 28b5d4e8a6f6aacd9166705c6affd67cf2e6927b Mon Sep 17 00:00:00 2001
From: Sebastian Raschka
Date: Mon, 23 Jun 2025 18:33:04 -0500
Subject: [PATCH] Update Llama 3 table for consistency with Qwen3

---
 ch05/07_gpt_to_llama/README.md | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/ch05/07_gpt_to_llama/README.md b/ch05/07_gpt_to_llama/README.md
index 6cda1f4..3fcb120 100644
--- a/ch05/07_gpt_to_llama/README.md
+++ b/ch05/07_gpt_to_llama/README.md
@@ -241,18 +241,18 @@ Note that the peak memory usage is only listed for Nvidia CUDA devices, as it is
 | Model       | Mode              | Hardware        | Tokens/sec | GPU Memory (VRAM) |
 | ----------- | ----------------- | --------------- | ---------- | ----------------- |
 | Llama3Model | Regular           | Mac Mini M4 CPU | 1          | -                 |
-| Llama3Model | Regular compiled  | Mac Mini M4 CPU | -          | -                 |
+| Llama3Model | Regular compiled  | Mac Mini M4 CPU | 1          | -                 |
 | Llama3Model | KV cache          | Mac Mini M4 CPU | 68         | -                 |
 | Llama3Model | KV cache compiled | Mac Mini M4 CPU | 86         | -                 |
 |             |                   |                 |            |                   |
 | Llama3Model | Regular           | Mac Mini M4 GPU | 15         | -                 |
-| Llama3Model | Regular compiled  | Mac Mini M4 GPU | -          | -                 |
+| Llama3Model | Regular compiled  | Mac Mini M4 GPU | Error      | -                 |
 | Llama3Model | KV cache          | Mac Mini M4 GPU | 62         | -                 |
-| Llama3Model | KV cache compiled | Mac Mini M4 GPU | -          | -                 |
+| Llama3Model | KV cache compiled | Mac Mini M4 GPU | Error      | -                 |
 |             |                   |                 |            |                   |
 | Llama3Model | Regular           | Nvidia A100 GPU | 42         | 2.91 GB           |
 | Llama3Model | Regular compiled  | Nvidia A100 GPU | 170        | 3.12 GB           |
 | Llama3Model | KV cache          | Nvidia A100 GPU | 58         | 2.87 GB           |
 | Llama3Model | KV cache compiled | Nvidia A100 GPU | 161        | 3.61 GB           |
 
-Note that all settings above have been tested to produce the same text outputs.
\ No newline at end of file
+Note that all settings above have been tested to produce the same text outputs.
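
The "Regular" and "KV cache" modes benchmarked in the table above differ in how much attention work each generation step performs: regular decoding re-processes the entire sequence per new token, while KV-cache decoding processes only the newest token after a one-time prompt prefill. A minimal toy sketch of that difference (hypothetical helper names, not the book's actual model code; work counters stand in for per-token attention cost):

```python
def generate_regular(prompt, n_new):
    """Re-encode the whole sequence each step (no cache)."""
    tokens = list(prompt)
    work = 0
    for _ in range(n_new):
        work += len(tokens)             # every step touches all tokens so far
        tokens.append(max(tokens) + 1)  # toy stand-in for "next token"
    return tokens, work

def generate_kv_cache(prompt, n_new):
    """Encode the prompt once (prefill), then one token of work per step."""
    tokens = list(prompt)
    work = len(tokens)                  # prefill: process prompt once
    for _ in range(n_new):
        work += 1                       # decode: only the new token
        tokens.append(max(tokens) + 1)
    return tokens, work
```

Both paths yield identical token sequences, mirroring the note that all settings produce the same text outputs; only the amount of work per step changes, which is where the tokens/sec gap in the table comes from.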