GPU: Tesla P100-PCIE | VRAM: 16 GB | Architecture: Pascal (sm_60)
Prompt: "a turtle and a bird together in a forest" | Resolution: 512 × 512 | Replicates: 5 timed runs
Load and generation times are medians of 5 timed replicates, in seconds. Load time is measured from a warm cache (model already downloaded). VRAM and RAM are peak values in GB. OOM = out of memory during load or inference.
| Model | Mode | Load time (s) | Gen time (s) | Peak GPU VRAM (GB) | Peak system RAM (GB) |
|---|---|---|---|---|---|
| SD 1.4 | GPU only | 5.4 | 8.6 | 3.44 | 3.2 |
| SD 1.4 | Model offload | 5.4 | 10.4 | 2.64 | 5.4 |
| SD 1.4 | Sequential offload | 5.4 | 37.8 | 1.43 | 6.9 |
| SD 2.1 Base | GPU only | 5.2 | 7.8 | 3.28 | 7.3 |
| SD 2.1 Base | Model offload | 5.2 | 9.5 | 2.09 | 7.5 |
| SD 2.1 Base | Sequential offload | 4.8 | 38.8 | 0.86 | 8.8 |
| SDXL | GPU only | 6.3 | 14.1 | 7.48 | 11.9 |
| SDXL | Model offload | 6.3 | 18.2 | 5.34 | 13.7 |
| SDXL | Sequential offload | 5.4 | 74.9 | 0.76 | 18.6 |
| SDXL Turbo | GPU only | 4.6 | 1.3 | 7.48 | 19.1 |
| SDXL Turbo | Model offload | 4.6 | 5.5 | 5.32 | 19.1 |
| SDXL Turbo | Sequential offload | 5.3 | 10.5 | 0.76 | 20.2 |
| SD 3.5 Medium | GPU only | OOM | OOM | OOM | — |
| SD 3.5 Medium | Model offload | 6.5 | 41.1 | 12.06 | 34.5 |
| SD 3.5 Medium | Sequential offload | 5.0 | 66.1 | 1.29 | 48.5 |
| SD 3.5 Large Turbo | GPU only | OOM | OOM | OOM | — |
| SD 3.5 Large Turbo | Model offload | 7.0 | OOM | OOM | — |
| SD 3.5 Large Turbo | Sequential offload | 6.8 | 33.6 | 1.04 | 72.1 |
| Kandinsky 2.2 | GPU only | 4.6 | 8.5 | 10.00 | 54.2 |
| Kandinsky 2.2 | Model offload | 4.6 | 14.8 | 5.37 | 29.6 |
| Kandinsky 2.2 | Sequential offload | 4.6 | 44.8 | 2.74 | 33.8 |
| PixArt XL 512 | GPU only | 3.7 | 9.0 | 12.91 | 32.6 |
| PixArt XL 512 | Model offload | 3.7 | 21.2 | 10.78 | 37.9 |
| PixArt XL 512 | Sequential offload | 2.5 | 22.4 | 0.87 | 62.6 |
| FLUX.1 Schnell | GPU only | OOM | OOM | OOM | — |
| FLUX.1 Schnell | Model offload | 7.9 | OOM | OOM | — |
| FLUX.1 Schnell | Sequential offload | 7.6 | 38.2 | 0.86 | 87.6 |
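
To make the speed versus memory trade-off in the table easier to read, the short sketch below computes, for two of the models, how much slower each offload mode is than GPU only and what fraction of peak VRAM it saves. The numbers are copied from the table; the names `results` and `tradeoff` are illustrative and not part of the benchmark code.

```python
# Values copied from the table: (generation time in s, peak GPU VRAM in GB).
results = {
    "SD 1.4": {
        "GPU only": (8.6, 3.44),
        "Model offload": (10.4, 2.64),
        "Sequential offload": (37.8, 1.43),
    },
    "SDXL": {
        "GPU only": (14.1, 7.48),
        "Model offload": (18.2, 5.34),
        "Sequential offload": (74.9, 0.76),
    },
}


def tradeoff(model, mode, baseline="GPU only"):
    """Return (slowdown factor, fraction of peak VRAM saved) relative to the baseline mode."""
    base_time, base_vram = results[model][baseline]
    time_s, vram_gb = results[model][mode]
    return time_s / base_time, 1 - vram_gb / base_vram


for model in results:
    for mode in ("Model offload", "Sequential offload"):
        slowdown, saved = tradeoff(model, mode)
        print(f"{model:7s} {mode:19s} {slowdown:4.1f}x slower, {saved:4.0%} less peak VRAM")
```

For SDXL, for example, sequential offload comes out roughly 5.3x slower than GPU only while using about 90% less peak VRAM; model offload is roughly 1.3x slower for about 29% less.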
Five replicates per model.
Raw data: benchmark_results.json
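
The "median of 5 timed replicates" protocol can be sketched with the standard library alone. This is a minimal illustration, not the script that produced the numbers above: `median_runtime` and `fake_generate` are hypothetical names, and the stand-in workload just sleeps instead of generating an image.

```python
import statistics
import time


def median_runtime(fn, replicates=5):
    """Call fn() `replicates` times and return the median wall-clock time in seconds."""
    durations = []
    for _ in range(replicates):
        start = time.perf_counter()
        fn()
        durations.append(time.perf_counter() - start)
    return statistics.median(durations)


# Hypothetical stand-in for one 512 x 512 generation. In a real run this would
# call the image pipeline with the benchmark prompt. Assumption: if the pipeline
# runs on PyTorch/CUDA, fn should wait for the GPU to finish (for example with
# torch.cuda.synchronize()) before returning, so queued GPU work is counted.
def fake_generate():
    time.sleep(0.01)


print(f"median generation time: {median_runtime(fake_generate):.3f} s")
```

The median is used rather than the mean so that a single slow replicate (for instance, a one-off cache miss) does not skew the reported time.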