Running SD3.5, which is rumored to be fast and high-quality, via the command line (supports AMD GPU/CPU)

The purpose
Build environment
1. stable-diffusion.cpp
2. Model
Execute
1. Option (Argument)
2. Execute time

The purpose

The goal is to run SD3.5-medium (stable-diffusion-3.5-medium) from the command line using stable-diffusion.cpp.

Rumor has it that it’s both fast and high-quality.

It can be run on AMD GPUs as well as on CPUs.

Build environment

stable-diffusion.cpp

Download the ZIP file that matches your environment from the following page.

Releases · leejet/stable-diffusion.cpp

Diffusion model(SD,Flux,Wan,Qwen Image,Z-Image,...) inference in pure C/C++ - leejet/stable-diffusion.cpp

If you want to run it on an AMD GPU, look for versions labeled Vulkan or ROCm.

(Generally, Vulkan should be fine. ROCm tends to support a more limited range of GPUs.)

For NVIDIA GPUs, look for versions labeled “CUDA”.

The AVX512, AVX2, AVX, and No-AVX versions are for running on the CPU. Check which AVX level your CPU supports before downloading. (I had assumed AMD CPUs did not support AVX, but it turns out they do. The easiest way to check your specific CPU is to ask an AI.)
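If you would rather check the AVX level yourself, here is a minimal Python sketch. It assumes a Linux machine, where the kernel lists CPU feature flags in /proc/cpuinfo; on Windows it only prints a hint. The helper names (pick_cpu_build, read_linux_cpu_flags) are made up for this example, and the returned labels follow the build names mentioned above, not necessarily the exact ZIP file names.

```python
import os
import platform


def pick_cpu_build(flags):
    """Return the most capable CPU build label for a set of CPU feature flags."""
    flags = {f.lower() for f in flags}
    if "avx512f" in flags:  # AVX-512 Foundation, as Linux reports it
        return "AVX512"
    if "avx2" in flags:
        return "AVX2"
    if "avx" in flags:
        return "AVX"
    return "No-AVX"


def read_linux_cpu_flags(path="/proc/cpuinfo"):
    """Read the feature flags of the first CPU in /proc/cpuinfo (Linux only)."""
    if not os.path.exists(path):
        return None
    with open(path, encoding="utf-8") as f:
        for line in f:
            if line.startswith("flags"):
                return set(line.split(":", 1)[1].split())
    return None


flags = read_linux_cpu_flags()
if flags is None:
    print("Could not read CPU flags on", platform.system(),
          "- look up the CPU model instead.")
else:
    print("Suggested CPU build:", pick_cpu_build(flags))
```

This only looks at the basic avx, avx2, and avx512f flags. If a build turns out to need more than that, falling back to the next lower version is the safe choice.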

Once you have extracted the downloaded file to a folder of your choice, the setup is complete.

Model

Please download a model from the following page. (The number after “Q” is the quantization level: the larger it is, the higher the image quality, but the larger the file and the longer generation takes.)

city96/stable-diffusion-3.5-medium-gguf at main

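To get a feel for what the number after “Q” means for file size, here is a rough Python sketch. It uses the block layouts of ggml's basic quantization formats (32 weights per block, plus a 2-byte scale) and assumes about 2.5 billion parameters for SD3.5-medium. Both the parameter count and the assumption that every tensor is quantized the same way are simplifications, so the actual file sizes on the page will differ.

```python
# Bytes used by one block of 32 weights in ggml's basic quantization formats.
BLOCK_BYTES = {"Q4_0": 18, "Q5_0": 22, "Q8_0": 34}
WEIGHTS_PER_BLOCK = 32

# Assumption: roughly 2.5 billion parameters in the diffusion model.
ASSUMED_PARAMS = 2.5e9


def bits_per_weight(quant):
    """Average number of bits stored per weight for a quantization type."""
    return BLOCK_BYTES[quant] * 8 / WEIGHTS_PER_BLOCK


def approx_size_gb(quant, params=ASSUMED_PARAMS):
    """Very rough file size in GB if every weight used this quantization."""
    return params * bits_per_weight(quant) / 8 / 1e9


for quant in BLOCK_BYTES:
    print(f"{quant}: {bits_per_weight(quant)} bits/weight, "
          f"about {approx_size_gb(quant):.1f} GB")
```

The point is simply that a higher Q number keeps more bits per weight, which is why those files are bigger and closer to the original model.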

Next, download one of the t5xxl_XXXXX.safetensors files, along with clip_g.safetensors and clip_l.safetensors (three files in total) from the following page.

Comfy-Org/stable-diffusion-3.5-fp8 at main


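Next, download the VAE, diffusion_pytorch_model.safetensors, from the following page. This is the file passed to --vae in the command below, so it should be the copy in the vae folder rather than the one in the transformer folder. Please note that you will need to log in to Hugging Face and agree to the terms of use in order to download it. (The UI for agreeing will appear when you log in and navigate to the “Model card” tab.)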

stabilityai/stable-diffusion-3.5-medium at main


Execute

Open the command line and navigate to the folder where you extracted stable-diffusion.cpp.

Run the following command. (Please replace the model path placeholder with the actual path of the model you are using.)

sd-cli.exe --diffusion-model [Model_Path] --clip_l [Path_to_clip_l.safetensors] --vae [Path_to_diffusion_pytorch_model.safetensors] --clip_g [Path_to_clip_g.safetensors] --t5xxl [Path_to_t5xxl_XXXX.safetensors] -H 512 -W 512 -p "a lovely cat" --cfg-scale 4.5 --sampling-method euler -v --clip-on-cpu

If a cat image is generated in the folder where you ran the command, the execution was successful.
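If you plan to run this more than once, it may be easier to build the command from a script. The following Python sketch assembles the same arguments as the command above and checks that the model files exist before launching. The function names (build_sd_command, missing_files, generate) are made up for this example, and the bracketed paths are the same placeholders as above, so replace them with your own.

```python
import os
import subprocess


def build_sd_command(exe, diffusion_model, clip_l, clip_g, t5xxl, vae,
                     prompt, width=512, height=512):
    """Assemble the sd-cli argument list used in the command above."""
    return [
        exe,
        "--diffusion-model", diffusion_model,
        "--clip_l", clip_l,
        "--vae", vae,
        "--clip_g", clip_g,
        "--t5xxl", t5xxl,
        "-H", str(height),
        "-W", str(width),
        "-p", prompt,
        "--cfg-scale", "4.5",
        "--sampling-method", "euler",
        "-v",
        "--clip-on-cpu",
    ]


def missing_files(paths):
    """Return the paths that do not point to an existing file."""
    return [p for p in paths if not os.path.isfile(p)]


def generate(model_files, prompt, exe="sd-cli.exe"):
    """Check the model files, then run sd-cli and wait for it to finish."""
    missing = missing_files(model_files.values())
    if missing:
        raise FileNotFoundError("Missing model files: " + ", ".join(missing))
    cmd = build_sd_command(exe, prompt=prompt, **model_files)
    subprocess.run(cmd, check=True)  # raises if sd-cli exits with an error


# Placeholders, exactly as in the command above; replace with real paths.
model_files = {
    "diffusion_model": "[Model_Path]",
    "clip_l": "[Path_to_clip_l.safetensors]",
    "clip_g": "[Path_to_clip_g.safetensors]",
    "t5xxl": "[Path_to_t5xxl_XXXX.safetensors]",
    "vae": "[Path_to_diffusion_pytorch_model.safetensors]",
}
print(build_sd_command("sd-cli.exe", prompt="a lovely cat", **model_files))
```

Calling generate(model_files, "a lovely cat") from the stable-diffusion.cpp folder would then start the actual generation. Passing the arguments as a list avoids quoting problems with prompts that contain spaces.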

Option (Argument)

The options are summarized on the following page.

stable-diffusion.cpp/examples/cli/README.md at master · leejet/stable-diffusion.cpp


Only the most commonly used ones are listed below.

Option	Description
-m	Path to the model
-p	Prompt
-s	Seed. Specify -1 to generate a random image. Note that if the seed is not specified, the same image will be generated every time.
-H	Image height
-W	Image width
--vae	Path to the VAE
--steps	Number of sampling steps (default: 20). Be careful, as some models perform better with lower values.
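As an illustration of how the seed and step options combine, here is a small Python sketch that prepares one command per seed so each run produces a different image. The helper name seed_variants and the output file names are made up for this example. It also uses -o for the output path, which is not in the list above, so please confirm it against the README before relying on it. The model arguments are omitted from the base command for brevity.

```python
def seed_variants(base_cmd, seeds, steps=20, prefix="cat"):
    """Return one argument list per seed, each with its own output file."""
    variants = []
    for seed in seeds:
        cmd = list(base_cmd)  # copy so the base command is not modified
        cmd += ["-s", str(seed),
                "--steps", str(steps),
                "-o", f"{prefix}_seed{seed}.png"]  # -o: check the README
        variants.append(cmd)
    return variants


# Abbreviated base command; add the model arguments from the Execute section.
base = ["sd-cli.exe", "-p", "a lovely cat", "-H", "512", "-W", "512"]

for cmd in seed_variants(base, seeds=[1, 2, 3]):
    print(cmd)
```

Using fixed seeds like this makes each image reproducible, whereas -s -1 gives a new random image on every run.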

Execute time

Image generation speeds are as follows. (This excludes model loading time and post-iteration processing.)

Model	Generation time (s)
stable-diffusion (Vulkan)	36
Qwen Image (Vulkan)	623
SD3.5-medium (Vulkan)	56
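To put these numbers side by side, the short Python sketch below expresses each time relative to SD3.5-medium. It only reuses the three measurements from the table. In these runs, SD3.5-medium took roughly 1.6 times as long as stable-diffusion, and Qwen Image took about 11 times as long as SD3.5-medium.

```python
# Generation times in seconds, copied from the table above.
timings_s = {
    "stable-diffusion (Vulkan)": 36,
    "Qwen Image (Vulkan)": 623,
    "SD3.5-medium (Vulkan)": 56,
}

baseline = timings_s["SD3.5-medium (Vulkan)"]

# Time of each model as a multiple of the SD3.5-medium time.
relative = {name: t / baseline for name, t in timings_s.items()}

for name, ratio in sorted(relative.items(), key=lambda item: item[1]):
    print(f"{name}: {timings_s[name]} s, {ratio:.2f}x the SD3.5-medium time")
```

These are single measurements on one machine, so treat the ratios as a rough guide only.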

On a personal note, just as the rumors suggested, the image quality felt higher than that of Stable Diffusion, and SD3.5-medium seemed to strike a better balance between speed and quality than the other two models I compared it to.