Stable Diffusion – A Saga of Installation Troubles

Chinese-language resources generally recommend 秋叶's one-click deployment package. Since these tools are built on open-source Python projects, deployment shouldn't be too complex; let's try starting from scratch.

After getting into AI-generated images, I went so far as to replace my graphics card, only for the whole endeavor to be gloriously shelved.

The new card's core is still sitting idle.

Pending

- Restructure the article: introduce PyTorch first, then version compatibility and how to check versions
- How to create a new virtual environment from scratch and deploy PyTorch locally
- Translate the draft "Stable Diffusion from scratch": https://stable-diffusion-art.com/install-windows/
- Organize reference materials

Steps

Searching in Chinese, you will hardly find a step-by-step installation tutorial; most guides just tell you to download the repository and double-click a script to run it.

https://github.com/AUTOMATIC1111/stable-diffusion-webui

For detailed usage instructions and FAQs, refer to the issues and the wiki: https://github.com/AUTOMATIC1111/stable-diffusion-webui/wiki

I don't know why nobody explains what this repository is for. It's not hard to tell from the name: it's a web interface console that makes Stable Diffusion more convenient to use. In fact, during installation it downloads the official repository content to obtain the actual stable-diffusion code.

The repository also includes an installation-and-startup script that automatically detects whether a Python virtual environment exists in the current folder; if one does, it uses that folder's python by default.
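For reference, on Windows the usual entry point is webui-user.bat. As of the version I installed, the default file is little more than a few variables (treat this as an approximation); leaving VENV_DIR empty makes the launcher use a venv folder next to the script:

@echo off

set PYTHON=
set GIT=
set VENV_DIR=
set COMMANDLINE_ARGS=

call webui.bat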

If you’re a complete beginner, it is recommended that you check out: https://stable-diffusion-art.com/install-windows/

PyTorch

https://pytorch.org/get-started/locally/

Here's the main thing I wanted to say today: don't just run the script right away following their instructions. Python pulling in dependencies through requirements files is the minor part; the key is that your graphics card driver version has to match the CUDA build of PyTorch. Many resources online explain this correspondence; you can easily find one.

Reference: https://blog.csdn.net/weixin_40660408/article/details/129896700
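A quick way to check what your driver supports: run nvidia-smi and look at the "CUDA Version" field in its header. It shows the highest CUDA runtime the installed driver can serve, and the PyTorch build you pick (e.g., cu118) should not be newer than that.

nvidia-smi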

Create an empty virtual environment first; then you can run the install command from the official website directly inside it.
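A minimal sketch on Windows, using a hypothetical folder name sd-venv (the install command itself is the one copied from the PyTorch site, shown further below):

python -m venv sd-venv
sd-venv\Scripts\activate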

python -c "import torch; print(torch.version.cuda)"
python -c "import torch; print(torch.__version__, torch.cuda.is_available())"

These two one-liners print the CUDA version your torch build was compiled against, the torch version, and whether CUDA is actually available.
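For a cu118 build of torch 2.0.x, the output looks roughly like this (exact version numbers will differ):

11.8
2.0.1+cu118 True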

Don't get fancy here. Just copy the command generated on the official page and run it as-is. Installing torch from PyPI with a plain pip install may fail or leave you with a build where CUDA never activates.

pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118

Important: ensure the folder path is clean (e.g., no spaces or Chinese characters in the path); otherwise PyTorch may not work properly

This took numerous installs and attempts at manually installing the official wheels. The goal was to upgrade to torch 2.0, since the official documentation says it is faster. But I wasn't familiar with the software and couldn't tell whether the Python version or other factors were affecting performance. The official manual I consulted recommends Python 3.8, which conflicted with a previous one-click deployment package that used 3.10. Ultimately I started over from scratch: create a new folder, set up a virtual environment, and make sure torch installs successfully.

Then move this prepared virtual environment into the web UI folder. Running the installation script again at this point should resolve most dependency issues.
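For example, assuming the hypothetical sd-venv from above and that the web UI expects a folder named venv:

move sd-venv stable-diffusion-webui\venv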

After moving, you need to run python -m pip install --upgrade --force-reinstall pip to fix pip.
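My understanding of why this is needed: the launcher scripts under venv\Scripts embed absolute paths from the old location, so after the move pip's own entry point breaks. Invoking the venv's python directly still works, and force-reinstalling pip regenerates those scripts:

cd stable-diffusion-webui
venv\Scripts\python.exe -m pip install --upgrade --force-reinstall pip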

It might seem odd, but I spent quite a while troubleshooting this. The script couldn't correctly detect my torch, so to eliminate all potential interference I decided to install and verify torch first, before installing any other dependencies.

Xformers

Recommended to enable: it speeds up image generation and reduces VRAM usage. Side effect: with the same set of parameters, the generated images are not exactly reproducible.

Reference: the stable-diffusion-webui wiki page on Xformers
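To turn it on when launching through webui-user.bat, add the flag to the command-line arguments (--xformers is the web UI's built-in switch for this):

set COMMANDLINE_ARGS=--xformers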

Hugging Face optimization

| Optimization ratio | Time taken | Torch active/reserved | Sys VRAM |
| --- | --- | --- | --- |
| 100.00% | 2m 57.03s | 7440/10058 MiB | 12288/12288 MiB (100.0%) |
| 51.02% | 1m 29.21s | 4547/7164 MiB | 9298/12288 MiB (75.67%) |

Test prompt and generation parameters for the comparison:
((masterpiece)),((best quality)),((high detial)),((realistic,))
Industrial age city, deep canyons in the middle,chinese architectural streets,bazaars, Bridges, (rainy days:1.2), (steampunk:0.8), chinese architecture
Negative prompt: nsfw,((cowboy)),(((pubic))), ((((pubic_hair))))sketch, duplicate, ugly, huge eyes, text, logo, monochrome, worst face, (bad and mutated hands:1.3), (worst quality:2.0), (low quality:2.0), (blurry:2.0), horror, geometry, bad_prompt, (bad hands), (missing fingers), multiple limbs, bad anatomy, (interlocked fingers:1.2), Ugly Fingers, (extra digit and hands and fingers and legs and arms:1.4), crown braid, ((2girl)), (deformed fingers:1.2), (long fingers:1.2),succubus wings,horn,succubus horn,succubus hairstyle, (bad-artist-anime), bad-artist, bad hand, borrowed character, text focus, watermark, sample watermark, character watermark, lofter username, photo date watermark, movie poster, magazine cover, journal, cover, cover page, doujin cover, album cover, manga cover, brand name imitation, EasyNegative,Tights, silk stockings,shorts
Steps: 35, Sampler: DPM adaptive, CFG scale: 5.5, Seed: 2223996555, Size: 1088x1088, Model hash: 543bcbc212, Model: base_Anything-V3.0-pruned, Clip skip: 2, ENSD: 31337

Afterword

On balance, it's better to recommend the one-click deployment package. That package does contain some of the author's custom settings, which differ from the official original version; as a beginner you might not understand them, which is why it's best to start from the official parameters. As you gain experience, refer to the official documentation to learn which parameters need adjusting.

Graphics card selection

After the cryptocurrency mining boom, graphics card prices have come down somewhat. For ordinary entry-level players, the VRAM on these cards is sufficient.

Also, the hires upscaling (高清放大) options re-render the image with richer detail, which demands more video memory.

Here’s a summary table of single-precision (FP32), half-precision (FP16), and double-precision (FP64) floating-point compute capabilities for NVIDIA GeForce GTX 970, GeForce RTX 3060 Ti, GeForce RTX 3060, GeForce RTX 3080, and GeForce RTX 3080 Ti

| Graphics Card Model | Release Year | FP32 Performance (TFLOPS) | FP16 Performance (TFLOPS) | FP64 Performance (TFLOPS) |
| --- | --- | --- | --- | --- |
| GeForce GTX 970 | 2014 | 3.49 | 87.2 | 0.109 |
| GeForce RTX 3060 Ti | 2020 | 16.2 | 32.4 | 0.51 |
| GeForce RTX 3060 | 2021 | 12.7 | 25.4 | 0.39 |
| GeForce RTX 3080 | 2020 | 29.8 | 58.9 | 0.93 |
| GeForce RTX 3080 Ti | 2021 | 34.8 | 68.7 | 1.36 |

Quoted from

Update

Six months on, I intended to revisit the installation steps and explain more of the basic concepts. But it turns out that for ordinary users, AI image generation mostly comes down to tweaking parameters on top of images shared by experts, or re-rendering existing images in a templated way.

We tried using AI to generate UI assets for a mini-program, but after all that effort the results were unsatisfactory; it was easier to just pull resources directly from the official mini-program.
