Mirror of https://github.com/rasbt/LLMs-from-scratch.git, synced 2026-04-10 12:33:42 +00:00
Update Llama 3 table for consistency with Qwen3
committed by GitHub
parent 58b30e2f7b
commit 28b5d4e8a6
@@ -241,18 +241,18 @@ Note that the peak memory usage is only listed for Nvidia CUDA devices, as it is
 | Model       | Mode              | Hardware        | Tokens/sec | GPU Memory (VRAM) |
 | ----------- | ----------------- | --------------- | ---------- | ----------------- |
 | Llama3Model | Regular           | Mac Mini M4 CPU | 1          | -                 |
-| Llama3Model | Regular compiled  | Mac Mini M4 CPU | -          | -                 |
+| Llama3Model | Regular compiled  | Mac Mini M4 CPU | 1          | -                 |
 | Llama3Model | KV cache          | Mac Mini M4 CPU | 68         | -                 |
 | Llama3Model | KV cache compiled | Mac Mini M4 CPU | 86         | -                 |
 |             |                   |                 |            |                   |
 | Llama3Model | Regular           | Mac Mini M4 GPU | 15         | -                 |
-| Llama3Model | Regular compiled  | Mac Mini M4 GPU | -          | -                 |
+| Llama3Model | Regular compiled  | Mac Mini M4 GPU | Error      | -                 |
 | Llama3Model | KV cache          | Mac Mini M4 GPU | 62         | -                 |
-| Llama3Model | KV cache compiled | Mac Mini M4 GPU | -          | -                 |
+| Llama3Model | KV cache compiled | Mac Mini M4 GPU | Error      | -                 |
 |             |                   |                 |            |                   |
 | Llama3Model | Regular           | Nvidia A100 GPU | 42         | 2.91 GB           |
 | Llama3Model | Regular compiled  | Nvidia A100 GPU | 170        | 3.12 GB           |
 | Llama3Model | KV cache          | Nvidia A100 GPU | 58         | 2.87 GB           |
 | Llama3Model | KV cache compiled | Nvidia A100 GPU | 161        | 3.61 GB           |
 Note that all settings above have been tested to produce the same text outputs.
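The Tokens/sec column in the table above is a throughput measurement. As a minimal sketch of how such a number can be obtained (the `generate_fn` callable and `tokens_per_sec` helper here are illustrative stand-ins, not the repository's actual API):

```python
import time

def tokens_per_sec(generate_fn, max_new_tokens=100):
    """Time one generation call and report tokens per second.

    generate_fn is a hypothetical stand-in for a generation routine
    (regular, torch.compile'd, or KV-cache variant of a model).
    """
    # Warm-up call: keeps one-time costs (e.g. torch.compile graph
    # capture) out of the timed region.
    generate_fn(1)
    start = time.perf_counter()
    generate_fn(max_new_tokens)
    elapsed = time.perf_counter() - start
    return max_new_tokens / elapsed
```

A warm-up pass matters especially for the "compiled" rows, since the first compiled call pays the graph-capture cost and would otherwise skew the average.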
Reference in New Issue
Block a user
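The hunk context notes that peak memory usage is only listed for Nvidia CUDA devices, which is why the VRAM column shows "-" for the Mac Mini M4 rows. A hedged sketch of how that column could be read out via PyTorch's CUDA memory statistics (`peak_vram_gb` is an illustrative helper, not from the repository):

```python
import torch

def peak_vram_gb():
    """Return peak allocated GPU memory in GB, or None off-CUDA.

    PyTorch only tracks these allocator statistics for CUDA devices,
    so non-CUDA hardware (e.g. Apple Silicon) reports nothing here.
    """
    if not torch.cuda.is_available():
        return None
    # Ensure queued kernels have run before reading the counter.
    torch.cuda.synchronize()
    return torch.cuda.max_memory_allocated() / 1024**3
```

Calling `torch.cuda.reset_peak_memory_stats()` before the generation run would scope the peak to that run alone rather than to everything since process start.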