GPU: NVIDIA GeForce GTX 1070 | VRAM: 8 GB | Architecture: Pascal (sm_61)
Prompt: "a turtle and a bird together in a forest" | Resolution: 512 × 512 | Replicates: 5 timed runs
Load time and generation time are each the median of 5 timed replicates, in seconds. Load time is measured from a warm cache (model already downloaded). VRAM and RAM are peak values in GB. OOM = out of memory during load or inference.
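
The timing method described above (median of 5 timed replicates) can be sketched as follows. This is a minimal illustration using only the standard library, not the script that produced these numbers: `time_replicates`, `median_seconds`, and `fake_generation` are hypothetical names, and the sleeping stand-in takes the place of a real pipeline call.

```python
import statistics
import time
from typing import Callable, List


def time_replicates(run: Callable[[], object], replicates: int = 5) -> List[float]:
    """Call `run` `replicates` times and return each wall-clock duration in seconds."""
    durations = []
    for _ in range(replicates):
        start = time.perf_counter()
        run()
        durations.append(time.perf_counter() - start)
    return durations


def median_seconds(durations: List[float]) -> float:
    """Median of the timed replicates, rounded to one decimal place as in the table."""
    return round(statistics.median(durations), 1)


# Hypothetical stand-in for one image generation; a real benchmark
# would invoke the loaded pipeline with the prompt here instead.
def fake_generation() -> None:
    time.sleep(0.01)


durations = time_replicates(fake_generation, replicates=5)
print(f"median of {len(durations)} runs: {statistics.median(durations):.3f} s")
```

The median is less sensitive than the mean to a single slow replicate, which matters when the first run after loading includes one-off setup cost.
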
| Model | Mode | Load time (s) | Gen time (s) | Peak GPU VRAM (GB) | Peak system RAM (GB) |
|---|---|---|---|---|---|
| SD 1.4 | GPU only | 5.4 | 16.6 | 3.44 | 3.0 |
| SD 1.4 | Model offload | 5.4 | 18.6 | 2.64 | 5.2 |
| SD 1.4 | Sequential offload | 4.8 | 40.5 | 1.43 | 7.0 |
| SD 2.1 Base | GPU only | 4.6 | 15.5 | 3.28 | 3.0 |
| SD 2.1 Base | Model offload | 4.6 | 17.2 | 2.09 | 5.2 |
| SD 2.1 Base | Sequential offload | 4.6 | 41.0 | 0.86 | 6.9 |
| SDXL | GPU only | 10.7 | OOM | OOM | — |
| SDXL | Model offload | 10.7 | 32.0 | 5.34 | 15.0 |
| SDXL | Sequential offload | 9.5 | 79.4 | 0.76 | 18.4 |
| SDXL Turbo | GPU only | 10.5 | OOM | OOM | — |
| SDXL Turbo | Model offload | 10.5 | 6.4 | 5.32 | 20.5 |
| SDXL Turbo | Sequential offload | 9.7 | 10.3 | 0.76 | 20.3 |
| SD 3.5 Medium | GPU only | OOM | OOM | OOM | — |
| SD 3.5 Medium | Model offload | 10.1 | OOM | OOM | — |
| SD 3.5 Medium | Sequential offload | 10.5 | 100.6 | 1.29 | 48.0 |
| SD 3.5 Large Turbo | GPU only | OOM | OOM | OOM | — |
| SD 3.5 Large Turbo | Model offload | 14.1 | OOM | OOM | — |
| SD 3.5 Large Turbo | Sequential offload | 14.6 | 42.8 | 1.04 | 64.6 |
| Kandinsky 2.2 | GPU only | OOM | OOM | OOM | — |
| Kandinsky 2.2 | Model offload | 9.8 | 23.9 | 5.37 | 62.6 |
| Kandinsky 2.2 | Sequential offload | 9.6 | 50.7 | 2.74 | 38.9 |
| FLUX.1 Schnell | GPU only | OOM | OOM | OOM | — |
| FLUX.1 Schnell | Model offload | 15.7 | OOM | OOM | — |
| FLUX.1 Schnell | Sequential offload | 16.5 | 56.6 | 0.87 | 85.6 |
Five replicates per model.
Raw data: benchmark_results.json
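
One way to read the table is to compare the two offload modes for each model where both completed. The sketch below copies generation time and peak VRAM from the table by hand and computes how much slower sequential offload is and how much VRAM it saves. The `RESULTS` layout and the `tradeoff` helper are illustrative only and do not reflect the structure of `benchmark_results.json`.

```python
# (generation time in s, peak GPU VRAM in GB), copied from the table above
# for the models where both model offload and sequential offload completed.
RESULTS = {
    "SD 1.4": {"model": (18.6, 2.64), "sequential": (40.5, 1.43)},
    "SD 2.1 Base": {"model": (17.2, 2.09), "sequential": (41.0, 0.86)},
    "SDXL": {"model": (32.0, 5.34), "sequential": (79.4, 0.76)},
    "SDXL Turbo": {"model": (6.4, 5.32), "sequential": (10.3, 0.76)},
    "Kandinsky 2.2": {"model": (23.9, 5.37), "sequential": (50.7, 2.74)},
}


def tradeoff(model_offload, sequential_offload):
    """Return (slowdown factor, VRAM saved in GB) of sequential vs. model offload."""
    (t_model, v_model), (t_seq, v_seq) = model_offload, sequential_offload
    return t_seq / t_model, v_model - v_seq


for name, modes in RESULTS.items():
    slowdown, saved = tradeoff(modes["model"], modes["sequential"])
    print(f"{name}: {slowdown:.2f}x slower, {saved:.2f} GB less peak VRAM")
```

The three models that only ran under sequential offload (SD 3.5 Medium, SD 3.5 Large Turbo, FLUX.1 Schnell) are left out because there is no model-offload figure to compare against.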