Stable Diffusion – The Love, Hate, and Drama of Installing it from Scratch

Chinese-language resources almost all recommend the Autumn Leaf one-click deployment package. But these are open-source Python projects, so deployment shouldn't be that complicated; let's try building everything from scratch.

I had been playing with AI-generated images and upgraded my graphics card specifically for it: an entry-level RTX 3060 12 GB. The seven-year-old 960 retired with honors.

The core of the setup is the PyTorch CUDA installation. I had already hit problems with it when writing Python game helper scripts (I had installed it locally before), and the same thing happened again: CUDA acceleration consistently failed to activate.

To Do

  1. Replan the article structure: introduce PyTorch first, then the version correspondence and how to check versions.
  2. How to create a new virtual environment locally from scratch and deploy PyTorch into it.
  3. Translate the original guide from scratch: https://stable-diffusion-art.com/install-windows/
  4. Organize reference materials

Steps

Step-by-step installation tutorials in Chinese are hard to find, but searching in English on Google turns up plenty of similar from-scratch guides. After a brief introduction, they have you install git, then explain why you need to install Python, and then you download the repository; after that, simply double-clicking the launch script does the trick.

https://github.com/AUTOMATIC1111/stable-diffusion-webui
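
As a rough sketch, and assuming git and Python are already installed, the whole setup comes down to a few commands (which folder you clone into is up to you, but keep the path simple, as discussed further below):

:: download the web UI repository
git clone https://github.com/AUTOMATIC1111/stable-diffusion-webui.git
cd stable-diffusion-webui
:: the first run downloads dependencies plus the actual SD code, then starts the web UI
webui-user.bat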

For detailed usage and Q&A, consult the issues and the wiki: https://github.com/AUTOMATIC1111/stable-diffusion-webui/wiki. Oddly, nobody explains what this repository actually is. The name isn't hard to parse: it's a graphical web control console that makes Stable Diffusion easier to use. During installation it downloads the official repository content, which is where the actual SD code comes from.

The repository also provides an installation and startup script that detects the current folder and whether a Python virtual environment already exists there; if one does, it uses the Python from that path by default.
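
On Windows, the file you normally edit is webui-user.bat. Trimmed to the variables relevant here (an empty value means the script falls back to its defaults), it looks roughly like this:

@echo off
:: full path to a specific python.exe; empty = use the python found on PATH
set PYTHON=
:: folder of the virtual environment; empty = a "venv" folder inside the repo
set VENV_DIR=
:: extra launch flags, e.g. --xformers (see the Xformers section below)
set COMMANDLINE_ARGS=
call webui.bat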

For beginners, I recommend checking out: https://stable-diffusion-art.com/install-windows/

PyTorch

https://pytorch.org/get-started/locally/

Here's what I really wanted to talk about today. Don't just follow the tutorial steps and run the script blindly. Python installing dependencies from a requirements file is the minor part; the core issue is that your GPU, driver version, and CUDA version need to match the PyTorch build. This relationship has been discussed at length online and is easy to find by searching.

Refer to: https://blog.csdn.net/weixin_40660408/article/details/129896700
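
A quick way to check your side of that equation is nvidia-smi: the CUDA version it prints in the top-right corner is the highest CUDA runtime your current driver supports, and the PyTorch build you choose should not exceed it. While you're at it, confirm which Python the shell will pick up:

:: driver version and the highest supported CUDA version
nvidia-smi
:: the Python interpreter and version that will be used
python --version
where python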

Create the virtual environment first, so you start from a clean, empty environment, and then run the official install command for PyTorch inside it.
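
A minimal sketch, assuming the Python version you want is first on your PATH and you are fine with a folder literally named venv:

:: create an empty virtual environment in .\venv
python -m venv venv
:: activate it (Windows cmd); from now on, python and pip point at this environment
venv\Scripts\activate.bat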

python -c "import torch; print(torch.version.cuda)"
python -c "import torch; print(torch.__version__, torch.cuda.is_available())"

The first command prints the CUDA version your PyTorch build was compiled against; the second prints the PyTorch version and whether CUDA is actually available (it should print True).
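
If the second command prints True, one more sanity check using the standard PyTorch API confirms which GPU it actually sees:

python -c "import torch; print(torch.cuda.get_device_name(0) if torch.cuda.is_available() else 'CUDA not available')"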

It's not recommended to get fancy here: just use the selector on the official page and copy the command over directly. Installing PyTorch with a plain pip install, without the PyTorch index URL, is likely to fail or leave CUDA unavailable.

pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118

Key point: don't use messy folder names (for example, paths with spaces or non-ASCII characters), as this can cause PyTorch to break. I spent a lot of time installing and reinstalling while trying to upgrade to version 2.0, because the official documentation said it would be faster. I hadn't used PyTorch much before and wasn't sure whether the Python version mattered; the official manual I read recommended 3.8, which conflicted slightly with the one-click package I had used earlier, which shipped 3.10. In the end I started over: a new folder, a new virtual environment, and making sure PyTorch installed successfully before anything else.

Then I moved the newly installed virtual environment into the web UI folder. At this point, running the script to install other dependencies didn’t cause any problems.

After moving it, you need to fix pip, because the environment's launcher scripts still contain absolute paths from the old location. Run: python -m pip install --upgrade --force-reinstall pip

It might seem a roundabout way of doing things, but I had spent a long time troubleshooting because the web UI couldn't correctly recognize my PyTorch. Installing PyTorch first and only then installing the other dependencies eliminated every other source of interference.
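
A minimal sketch of that sequence, with hypothetical paths (C:\sd-tmp\venv is the freshly built environment, C:\sd\stable-diffusion-webui is the web UI checkout):

:: move the working environment into the web UI folder
move C:\sd-tmp\venv C:\sd\stable-diffusion-webui\venv
cd C:\sd\stable-diffusion-webui
:: repair pip, whose launcher scripts still point at the old path
venv\Scripts\python.exe -m pip install --upgrade --force-reinstall pip
:: let the launch script install the remaining dependencies
webui-user.bat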

Xformers

It is recommended to enable this: it speeds up image generation and reduces VRAM usage. The side effect is that results are no longer fully deterministic, so the same set of parameters can produce slightly different images. My results in stable-diffusion-webui:

| Optimization | Ratio | Time taken | Torch active/reserved | Sys VRAM |
| --- | --- | --- | --- | --- |
| huggingface | 100.00% | 2m 57.03s | 7440/10058 MiB | 12288/12288 MiB (100.0%) |
| xformers | 51.02% | 1m 29.21s | 4547/7164 MiB | 9298/12288 MiB (75.67%) |
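
Enabling it is just a launch flag. As far as I can tell, the launcher installs the xformers package itself when it sees --xformers; older versions may need a manual pip install.

:: in webui-user.bat, add the flag to the launch arguments
set COMMANDLINE_ARGS=--xformers
call webui.bat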

Sample prompt and generation parameters:

((masterpiece)),((best quality)),((high detail)),((realistic,))
Industrial age city, deep canyons in the middle, Chinese architectural streets, bazaars, Bridges, (rainy days:1.2), (steampunk:0.8), Chinese architecture
Negative prompt: nsfw,((cowboy)),(((pubic))), ((((pubic_hair))))sketch, duplicate, ugly, huge eyes, text, logo, monochrome, worst face, (bad and mutated hands:1.3), (worst quality:2.0), (low quality:2.0), (blurry:2.0), horror, geometry, bad_prompt, (bad hands), (missing fingers), multiple limbs, bad anatomy, (interlocked fingers:1.2), Ugly Fingers, (extra digit and hands and fingers and legs and arms:1.4), crown braid, ((2girl)), (deformed fingers:1.2), (long fingers:1.2), succubus wings,horn,succubus horn,succubus hairstyle, (bad-artist-anime), bad-artist, bad hand, borrowed character, text focus, watermark, sample watermark, character watermark, lofter username, photo date watermark, movie poster, magazine cover, journal, cover, cover page, doujin cover, album cover, manga cover, brand name imitation, EasyNegative,Tights, silk stockings,shorts
Steps: 35, Sampler: DPM adaptive, CFG scale: 5.5, Seed: 2223996555, Size: 1088x1088, Model hash: 543bcbc212, Model: base_Anything-V3.0-pruned, Clip skip: 2, ENSD: 31337

Epilogue

We didn't recommend the one-click deployment package because it bakes in settings customized by its author that differ from the official out-of-the-box configuration. As a beginner you won't understand why those parameters were chosen, so it's generally best to start with the official version. As you use it more, take time to read the official documentation, and you'll gradually learn which parameters actually need adjusting.

GPU Selection

With the cryptocurrency mining boom over, GPU prices have come down somewhat. For entry-level buyers choosing between the 3060 and the 3060 Ti, the usual advice is the 12 GB 3060: the larger VRAM lets it generate higher-resolution images. Why would you want higher resolution? Because raising the resolution during generation produces clearer, more detailed images. If you only want to generate small images, 8 GB of VRAM is sufficient.

There is also the super-resolution upscaling option, which enhances detail and makes the image richer, and it too requires more VRAM.

Below is a summary table of the single-precision (FP32), half-precision (FP16), and double-precision (FP64) floating-point computing capabilities of NVIDIA GeForce GTX 970, GeForce RTX 3060 Ti, GeForce RTX 3060, GeForce RTX 3080, and GeForce RTX 3080 Ti:

| Graphics Card Model | Release Year | FP32 (TFLOPS) | FP16 (TFLOPS) | FP64 (TFLOPS) |
| --- | --- | --- | --- | --- |
| GeForce GTX 970 | 2014 | 3.49 | 87.2 | 0.109 |

The figures are excerpted from various GPU performance tests.

Updates

I had originally planned to revisit this every six months, refine the installation steps, and explain more of the basic concepts. However, I found that most people doing AI image generation are simply adjusting parameters on images shared by experts, or re-rendering existing images with small tweaks.

I had previously attempted a project that used AI to generate UI assets for a mini program, but after half a day of struggling, the results were worse than simply pulling resource images straight from the official mini program documentation.
