
Zero to Hero Stable Diffusion 3 Tutorial with Amazing SwarmUI SD Web UI that Utilizes ComfyUI



Do not skip any part of this tutorial to master how to use Stable Diffusion 3 (SD3) with SwarmUI, the most advanced open-source generative AI app. Automatic1111 SD Web UI and Fooocus do not support #SD3 yet, so I am starting to make tutorials for SwarmUI as well. #StableSwarmUI is officially developed by StabilityAI, and your mind will be blown after you watch this tutorial and learn its amazing features. StableSwarmUI uses #ComfyUI as the back end, so it has all the good features of ComfyUI and combines them with the easy-to-use features of the Automatic1111 #StableDiffusion Web UI. I really like SwarmUI and plan to make more tutorials for it.

🔗 The Public Post (no login or account required) Shown In The Video With The Links ➡️ https://www.patreon.com/posts/stableswarmui-3-106135985

00:00:00 Introduction to Stable Diffusion 3 (SD3) and SwarmUI, and what is covered in the tutorial

00:04:12 Architecture and features of SD3

00:05:05 What each of the different Stable Diffusion 3 model files means

00:06:26 How to download and install SwarmUI on Windows for SD3 and all other Stable Diffusion models

00:08:42 What kind of folder path you should use when installing SwarmUI

00:10:28 If you get an installation error, how to notice and fix it

00:11:49 Installation has been completed; now how to start using SwarmUI

00:12:29 Which settings I change before starting to use SwarmUI, and how to change your theme (dark, white, gray)

00:12:56 How to make SwarmUI save generated images as PNG

00:13:08 How to find the description of each setting and configuration

00:13:28 How to download the SD3 model and start using it on Windows

00:13:38 How to use model downloader utility of SwarmUI

00:14:17 How to set models folder paths and link your existing models folders in SwarmUI

00:14:35 Explanation of Root folder path in SwarmUI

00:14:52 Do we need to download the VAE of SD3?

00:15:25 The Generate and Models sections of SwarmUI for generating images, and how to select your base model

00:16:02 Setting up parameters and what they do when generating images

00:17:06 Which sampling method is best for SD3

00:17:22 Information about SD3 text encoders and their comparison

00:18:14 First time generating an image with SD3

00:19:36 How to regenerate the same image

00:20:17 How to see image generation speed, step speed, and more information

00:20:29 Stable Diffusion 3 it/s speed on an RTX 3090 Ti

00:20:39 How to see VRAM usage on Windows 10

00:22:08 Testing and comparing different text encoders for SD3

00:22:36 How to use the FP16 version of the T5-XXL text encoder instead of the default FP8 version

00:25:27 The image generation speed when using the best config for SD3

00:26:37 Why the VAE of SD3 is many times better than previous Stable Diffusion models: 4 vs 8 vs 16 vs 32 channel VAEs

00:27:40 How to and where to download best AI upscaler models

00:29:10 How to use refiner and upscaler models to improve and upscale generated images

00:29:21 How to restart and start SwarmUI

00:32:01 The folders where the generated images are saved

00:32:13 Image history feature of SwarmUI

00:33:10 Upscaled image comparison

00:34:01 How to download all upscaler models at once

00:34:34 Presets feature in depth

00:36:55 How to generate forever / infinite times

00:37:13 Non-tiled upscale caused issues

00:38:36 How to compare tiled vs non-tiled upscaling and decide which is best

00:39:05 275 SwarmUI presets (cloned from Fooocus) I prepared and the scripts I coded to prepare them and how to import those presets

00:42:10 Model browser feature

00:43:25 How to generate TensorRT engine for huge speed up

00:43:47 How to update SwarmUI

00:44:27 Prompt syntax and advanced features

00:45:35 How to use Wildcards (random prompts) feature

00:46:47 How to see full details / metadata of generated images

00:47:13 Full guide for extremely powerful grid image generation (like X/Y/Z plot)

00:47:35 How to add all the downloaded upscalers from the zip files

00:51:37 How to see what is happening in the server logs

00:53:04 How to continue grid generation process after interruption

00:54:32 How to open grid generation after it has been completed and how to use it

00:56:13 Example of tiled upscaling seaming problem

01:00:30 Full guide for image history

01:02:22 How to directly delete images and star them

01:03:20 How to use SD 1.5 and SDXL models and LoRAs

01:06:24 Which sampler method is best

01:06:43 How to use image to image

01:08:43 How to use edit image / inpainting

01:10:38 How to use amazing segmentation feature to automatically inpaint any part of images

01:15:55 How to use segmentation on existing images for inpainting and get perfect results with different seeds

01:18:19 More detailed information regarding upscaling and tiling and SD3

01:20:08 Seams: a full explanation and example, and how to fix them

01:21:09 How to use queue system

01:21:23 How to use multiple GPUs by adding more backends

01:24:38 Loading a model in low VRAM mode

01:25:10 How to fix color over-saturation

01:27:00 Best image generation configuration for SD3

01:27:44 How to apply upscale to your older generated images quickly via preset

01:28:39 Other amazing features of SwarmUI

01:28:49 CLIP tokenization and the rare token OHWX

Video Transcription

  • 00:00:00 Greetings everyone, in this massive tutorial I am going to show you how to install

  • 00:00:05 Stable Swarm UI and start using Stable Diffusion 3.

  • 00:00:09 Stable Swarm UI is officially developed by Stability AI and the developer is just amazing.

  • 00:00:15 I will show how to download and start using Stable Diffusion 3.

  • 00:00:20 I will show you advanced features of Stable Swarm UI, like segmentation and automatically inpainting any part

  • 00:00:27 with just prompting and automatically masking. I will show you very best configuration for how to

  • 00:00:33 use Stable Diffusion 3. I will show

  • 00:00:35 you Wildcard feature of Stable Swarm UI. I will show you how to use LoRAs with Stable Swarm UI.

  • 00:00:41 I will show you the amazing grid generator feature of the Stable Swarm UI. This is many times better

  • 00:00:46 than the X/Y/Z plot of Automatic1111 Web UI. You will be amazed after you see the grid feature

  • 00:00:54 of Stable Swarm UI. I will show you how to use model downloader to automatically download from

  • 00:01:00 CivitAI or from Hugging Face. I will show you how to use multiple GPUs at the

  • 00:01:05 same time, if you have them. I will show you the amazing image history feature of the Stable Swarm UI.

  • 00:01:11 It is just mind-blowing. You will see it. I will show you how to use the image-to-image feature

  • 00:01:15 of Stable Swarm UI from an init image. I will show you how to use the inpainting feature

  • 00:01:21 of Stable Swarm UI from an init image. I will show you the model browser feature of Stable

  • 00:01:26 Swarm UI. It is just amazing. You will see it. I will show you how

  • 00:01:30 to download very best upscaler models and use them in your workflow.

  • 00:01:35 Moreover, I will show you very best upscaling configuration for the Stable Diffusion 3.

  • 00:01:40 It is a little bit different than the Stable Diffusion XL or Stable Diffusion 1.5.

  • 00:01:45 I will show you advanced prompt syntax of the Stable Swarm UI. It is very powerful.

  • 00:01:50 I will give you some information regarding the model structure of the Stable Diffusion 3.

  • 00:01:56 Information regarding each one of the Stable Diffusion 3 model

  • 00:02:00 files and the text encoders. I will also do a comparison of them. I will show you how to

  • 00:02:06 contact the developer of the Stable Swarm UI, ask questions and how he fixes the Stable Swarm

  • 00:02:13 UI immediately. This tutorial will be on Windows. I am also going to produce tutorials for cloud

  • 00:02:19 services for those who don't have GPUs on their computer. However, you still need to watch

  • 00:02:24 this tutorial fully to learn how to use Stable Swarm UI. Moreover, the optimization of

  • 00:02:30 Stable Swarm UI is just mind-blowing. Let me show you before we begin. Currently, these are my VRAM

  • 00:02:36 usages. Because the models are loaded, let's generate images with the very best configuration

  • 00:02:41 of Stable Diffusion 3 by using both of the text encoders and see the VRAM usage. You see,

  • 00:02:47 it is running on GPUs with under 6 GB as well. So if you have a 6 GB or

  • 00:02:52 above GPU in your computer, you can use Stable Diffusion 3 with Stability AI developed Stable Swarm UI.

  • 00:02:59 It is just working amazing. If you have a better GPU, of course, it will be faster, it will use more VRAM,

  • 00:03:05 but it works even on GPUs with under 6 GB VRAM when the best configuration is used.

  • 00:03:12 Because at the backend, Stable Swarm uses ComfyUI and also I will show you how

  • 00:03:17 to get more information in the logs panel as well.

  • 00:03:20 As a final thing, I suggest you to not skip any part of this video. Every part

  • 00:03:26 of this video will be super important. I will show additional features as well,

  • 00:03:30 like the queue system. So do not skip. Even if you are going to use this

  • 00:03:36 application on cloud services, you still need to watch this video to learn how to

  • 00:03:40 use Stable Swarm UI. So I have prepared this amazing public post that all of the

  • 00:03:46 links of this tutorial will be in. This post will get updated. Moreover, this is a

  • 00:03:52 public post so that you can see the content of this post without even being

  • 00:03:58 a free member of my Patreon account.

  • 00:04:01 It is easier for me to manage. So I am making the post details here. Before we begin

  • 00:04:06 installing Stable Swarm UI, I want to show you some of the features of Stable Diffusion 3.

  • 00:04:12 So you see there is Stable Diffusion 3 official page link here. Let's open it. This is the link.

  • 00:04:17 This is on Hugging Face, Stable Diffusion 3 medium model. When we scroll down a little bit,

  • 00:04:23 we can see the model architecture of Stable Diffusion 3. What is different in Stable Diffusion 3? You see

  • 00:04:30 it uses 3 models: Clip-G, Clip-large and T5. The power of Stable Diffusion 3 comes from T5 XXL.

  • 00:04:41 It also has a better VAE. Moreover, the U-Net is now multiple MM-DiT blocks, which are Multi-Modal

  • 00:04:50 Diffusion transformer blocks. Moreover, when you click files and versions in this repository you

  • 00:04:56 will get to this page. Now this page is important to understand.

  • 00:05:00 You will see that there are 4 different safetensors files: the medium safetensors, the including-Clips safetensors,

  • 00:05:08 the including-Clips-and-T5 fp16 version safetensors, and the fp8 version safetensors files.

  • 00:05:16 So what are these files? The medium safetensors file is the raw model of the Stable Diffusion 3 medium model.

  • 00:05:24 What does it mean? It means that it only contains these MM-DiT blocks, Multi-Modal Diffusion transformer blocks and also VAE.

  • 00:05:34 So it doesn't include any of these text encoders.

  • 00:05:37 This version includes the Clips, which are the text encoders.

  • 00:05:43 When we go to the text encoders folder here, you will see each text encoder individually. Clip-G, Clip-Large, T5-XXL.

  • 00:05:52 So this is the model that contains both of the Clip-G and Clip-Large models.

  • 00:05:57 There is also the including-Clips-plus-T5-XXL fp16 version. That model includes these two plus T5-XXL fp16, and

  • 00:06:07 these are individual files. However, for this tutorial we are not going to

  • 00:06:11 download any of these. We are only going to download the SD3 medium safetensors file

  • 00:06:16 and the rest will be automatically downloaded by Swarm UI. I just wanted to

  • 00:06:20 give some extra information. So how are we going to install this amazing Stable

  • 00:06:25 Swarm UI? When you click this link you will get to the official repository

  • 00:06:30 page of Stable Swarm UI. Installation on Windows is so easy. And also I will make another tutorial

  • 00:06:37 for how to install and use on cloud services if you don't

  • 00:06:42 have a powerful GPU. So I will show you Massed Compute, RunPod and a free Kaggle account if it works,

  • 00:06:49 but it will be in another video.

  • 00:06:51 Still, you need to watch this tutorial to learn how to use the amazing Web UI application.

  • 00:06:57 So in this repository page, go to the installing on Windows section here. For this to work,

  • 00:07:05 we need to have installed Git and

  • 00:07:07 .NET 8 first. So click these links, you will see that Git for Windows. Just download it and then next,

  • 00:07:14 next, next, that is it, you don't need to do anything else. And downloading .NET version 8

  • 00:07:19 is important. You see there are different installers for Linux, macOS, Windows, and all.

  • 00:07:26 So which one do you need? For this, you need to install the Windows

  • 00:07:30 x64. So click this Windows x64 link here, you will get to this page and your download should

  • 00:07:38 have started. If it didn't start, you can click this direct link to download it, then open it

  • 00:07:45 from your downloads folder, you will get to this page, click install, it will ask you permission,

  • 00:07:50 click Yes, then it will install it and that's it. You don't need to have installed Python for

  • 00:07:55 Stable Swarm UI because Stable Swarm UI works with

  • 00:08:00 ComfyUI at backend and it installs an isolated Python version to work with it. So it installs a

  • 00:08:08 portable Python version. And you don't need to have system wide installed Python for Stable Swarm UI.

  • 00:08:15 Since I already had this, it reinstalled it. Close it, and we are ready after you have installed

  • 00:08:22 Git and .NET 8. How can you verify that Git is installed? Just open a CMD window and

  • 00:08:28 type Git and you should get a message like this. Okay, then what we are going to do,

  • 00:08:33 all you need to do is download this

  • 00:08:35 bat file, click here to download, then copy that bat file, right click and copy or you can cut,

  • 00:08:42 then move into the disk drive where you want to install, I will install it into my R drive,

  • 00:08:47 do not install it into downloads, music, documents, users, install it directly into a drive.

  • 00:08:54 Also do not install it onto cloud drives like OneDrive. Then in here,

  • 00:09:00 make a new folder where you want to install. When making a new folder, do not use space characters.

  • 00:09:06 So I am going to make the name as SwarmUI like this and enter inside that folder and paste the

  • 00:09:13 install Windows.bat file here. You see my name doesn't have any space characters, it is directly

  • 00:09:19 inside this drive folder. You should follow these instructions, then double click the bat file,

  • 00:09:25 you will likely get to this screen. Click more info and run anyway. Then it

  • 00:09:29 will clone and install everything automatically for you after this. You don't need to do anything else.

  • 00:09:36 It will start a webpage for installation like this. You see Stable Swarm UI installer. Click agree. Click customize settings.

  • 00:09:43 Click next. You can select your theme. I am going to use modern light for this tutorial. Click next.

  • 00:09:51 Then don't change anything here. Click next. Okay, in here we are going to use ComfyUI local. Click next.

  • 00:09:58 You can pick the models that you want.

  • 00:10:01 You don't need to download anything if you already have the models in another folder.

  • 00:10:06 But I am going to download Stable Diffusion XL 1.0 base model for this tutorial. Click next.

  • 00:10:12 And then, yes I am sure, install now.

  • 00:10:14 So it is going to install ComfyUI as a backend automatically into an isolated python environment.

  • 00:10:21 And these are the other settings. You can change all of the settings later.

  • 00:10:25 You can see the progress here. It is an amazing installer by the way.

  • 00:10:28 During the installation if you get this error it means that there is a problem with the remote server. So you

  • 00:10:36 can restart your computer, restart your internet modem, and restart the installation. This is

  • 00:10:42 an internet-related problem. So therefore, I have closed the installation and I am restarting

  • 00:10:49 the installation. So I also deleted the already generated folders. This time, I have enabled

  • 00:10:55 the Warp VPN. This is the VPN of Cloudflare, a free VPN.

  • 00:11:00 And my installation became many times faster right now.

  • 00:11:04 It was previously 600 kilobits per second, and now I am downloading with 36 megabytes per second.

  • 00:11:12 You can download Warp by just typing Warp into Google, and you will get to this page.

  • 00:11:18 You see, Warp. Just install it and start it

  • 00:11:20 if you get errors during the installation, and installing will become many times faster.

  • 00:11:27 So the installation is continuing.

  • 00:11:29 It already passed the step 1 and I had an error in this step 1 previously. Alright, now it is downloading

  • 00:11:36 the SDXL base model since we have picked it but if you already have the models, you don't need to

  • 00:11:42 re-download them. I will show you how to use your existing models as well. So the installation has

  • 00:11:48 been completed. This page automatically started and opened. You see it is saying that backends

  • 00:11:54 are still loading on the server and now it is gone. You can also see the

  • 00:12:00 progress in the CMD window. You see it started on port 7822, and this 'model not found' message is not

  • 00:12:08 important. Let's close it. So this is the interface to use the Swarm UI. I know it

  • 00:12:14 may be a little bit overwhelming at first, but don't worry. Also, it didn't obey our selected theme. First, let's make

  • 00:12:23 a few settings, then start downloading Stable Diffusion 3 and start using it. First of all, let's go

  • 00:12:29 to the user settings from here in this menu. I am going to change

  • 00:12:35 my theme. Okay, let me zoom in. Here. So you can change the themes from here. I am

  • 00:12:40 going modern light and that's it. Then you see there are so many other settings that

  • 00:12:47 you can read and change them. There is one other setting that I am going to

  • 00:12:51 change, which is the save format of generated images. I am going to use PNG because PNG is a lossless format.

  • 00:13:00 So it is best and you see it's already saving the metadata.
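
SwarmUI writes the generation parameters into the PNG's metadata text chunks, so you can read them back programmatically. Below is a minimal sketch using Pillow; the file path is hypothetical, and the exact metadata keys are whatever SwarmUI wrote, so print them all to see what is there.

```python
# Minimal sketch: read the generation metadata SwarmUI embeds in a saved PNG.
# Requires Pillow (pip install pillow). The path below is hypothetical.
from PIL import Image

img = Image.open("Output/local/raw/example.png")  # hypothetical path
for key, value in img.text.items():  # PNG text chunks (PngImageFile.text)
    print(f"{key}: {value[:200]}")   # truncate long JSON blobs for display
```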

  • 00:13:04 Also, you will notice that there are question mark icons here.

  • 00:13:08 So when you click these question mark icons, it will show you the description of what each thing does.

  • 00:13:15 You see each one is like this, so you can click them to see the information regarding each option.

  • 00:13:21 These are the only settings that I'm changing: PNG output and the theme. Okay, let's save.

  • 00:13:28 Then first of all we need to download Stable Diffusion 3. To download it, go to the utilities tab here

  • 00:13:35 and in this section, you will see model downloader. So this is a very convenient way of downloading

  • 00:13:40 models from Hugging Face or CivitAI. So return back to the Patreon post, and in here you see

  • 00:13:47 SD3 download model. Right-click and copy link address. Return back to the model downloader,

  • 00:13:52 paste it, and it says that URL appears to be a valid Hugging Face download link and which

  • 00:13:58 name you want to save it under. Let's say Stable Diffusion 3 medium, and click download, and it will start

  • 00:14:03 downloading that model into the correct folder. This is the base model. If you are downloading different

  • 00:14:09 stuff, you need to select them like LoRA, VAE, embedding, control net, and where are the models

  • 00:14:14 saved? Where should we save it? When you click server and server configuration, you

  • 00:14:21 will see the settings related to the file paths. There is a model root, like models;

  • 00:14:27 you can give your Automatic1111 Web UI model

  • 00:14:30 root as well. SD model folder: this is where the models are put. You see the root folder is

  • 00:14:37 the models folder here, so it includes all of the models. Then SDModelFolder is the Stable

  • 00:14:44 Diffusion folder where the safetensors files are downloaded. Then there is also VAE folder here.

  • 00:14:52 Currently, the Stable Diffusion 3 has an embedded VAE, so we don't need external VAE.

  • 00:14:57 And then there is also SD Embedding Folder, SD ControlNets Folder, upscale models folder, TensorRT folder.

  • 00:15:05 So you can set the folder paths from here.

  • 00:15:08 You can give your Automatic1111 Web UI model folder as well.

  • 00:15:12 But once you do that, change and save, then you should restart the Stable Swarm UI to

  • 00:15:19 not get any errors. Okay, let's return back to model downloader and it is done.

  • 00:15:24 Then let's return back to the generate. This is where we generate the images.

  • 00:15:29 How we use this interface. First of all, let's go to the models tab here. This will display

  • 00:15:36 our currently available models. You see Stable Diffusion XL base is here, then I will

  • 00:15:42 click this refresh button, and it will refresh the interface and you see Stable Diffusion 3

  • 00:15:47 also has arrived. So you need to pick your model from either here or from here. You

  • 00:15:53 see, the left model selection dropdown. Let's select the model from here. And when I click and select here, you see

  • 00:15:59 it is already selected from here as well. Then there are core parameters.

  • 00:16:04 Images and each core parameter has a description like this. How many images we want to generate.

  • 00:16:09 Let's generate 3 images. Do you want a static seed or not?

  • 00:16:13 When you set it as minus 1, it will generate a random seed every time. And the number of steps.

  • 00:16:21 I find that 40 steps is best. CFG scale.

  • 00:16:25 When you click this icon, you can see the effect of CFG scale. I prefer

  • 00:16:29 7, which is the default. Variation seed is used to slightly change the output of the image.

  • 00:16:37 You can see that, variation seed and variation seed strength. Currently, we don't need that.

  • 00:16:42 Currently, we don't need anything else. We just need to set the resolution because

  • 00:16:45 this model is 1024x1024 pixels by default like the SDXL base model.

  • 00:16:52 So I am going to set this aspect ratio custom from here to set any width and height.

  • 00:16:58 You can also set this as 1:1 and you see it's automatically 1024 by 1024. And there is also sampling.

  • 00:17:06 I find that UniPC is the best sampler for both SDXL and Stable Diffusion 3. For SD3, the normal scheduler is best.

  • 00:17:16 I have done a lot of testing before making this video. And SD3 text encoders. Now this is important.

  • 00:17:22 When you don't enable anything, it is going to use Clip text encoders,

  • 00:17:27 which I explained in the beginning. However, the true power comes from T5.

  • 00:17:32 So first of all, let's make a test with Clip only text encoder.

  • 00:17:37 Then I will show you the difference of other text encoders. Seamless tileable, you see this is what it does.

  • 00:17:43 So always click these icons to see what they do.

  • 00:17:46 Initial image, this is image to image of Automatic1111 Web UI, we don't need it yet.

  • 00:17:51 Refiner is used to improve image after it is generated, like upscaling.

  • 00:17:57 I will also show you that, so we don't need that right now. So all other options are here.

  • 00:18:02 We don't need any of them right now because we are doing a beginning test.

  • 00:18:06 So I have copied this prompt from CivitAI and pasted it here, then hit generate to start the generation.

  • 00:18:14 When you generate an image for the first time, it is going to download the Clip models into the correct

  • 00:18:20 folders because we didn't download them and we are using SD3 so it is downloading the Clip model

  • 00:18:26 and they are downloaded then it will show you that

  • 00:18:30 3 current generations, 2 running 1 queued because we are generating 3 images so we

  • 00:18:36 will get 3 images and where are these Clip models downloaded? They are downloaded inside here.

  • 00:18:41 Currently you need to have exactly the same naming for it to work, otherwise it will redownload the models. So

  • 00:18:47 this is the naming, not the same as uploaded in the Hugging Face folder. And the images are generated.

  • 00:18:55 So how do we see the images? You can click here to see the images; also, you see it is

  • 00:19:00 here or you can go to the image history and refresh and you can see every image here.

  • 00:19:06 So this Web UI image saving and reusing is extremely good. When I click here, it will

  • 00:19:12 show me generated image like this. This is a pretty good image, by the way. You see it's

  • 00:19:17 looking amazing and this is it. So this one is pretty good. This one is not very accurate

  • 00:19:23 and this one is like this. However, remember that this was only Clip only generation.

  • 00:19:30 So now we are going to test other ones. I am going to use the seed of this image.

  • 00:19:35 You see this one. So to regenerate this image, I will click reuse parameters here.

  • 00:19:41 So it is going to load everything of this image into the parameters.

  • 00:19:45 I am going to change the image count to 1. I am only going to generate 1 image.

  • 00:19:50 And you see the seed is set. So I can regenerate this image.

  • 00:19:54 When I click generate, it should regenerate the same image. Let's generate and see the result.

  • 00:20:00 By the way, currently on the CMD window, I am not able to see the generation speed.

  • 00:20:05 So it says it took 15 seconds to generate. And is it the same image? Let's refresh.

  • 00:20:10 Yes, this one and this one. So we regenerated. So how can I also see the image generation speed?

  • 00:20:17 Let's go to the server. And in here, go to the logs. And in here you see there is info,

  • 00:20:24 make it debug. And when you make it debug, it will also show you every step.

  • 00:20:29 And this is my it per second with Stable Diffusion 3.

  • 00:20:33 And how much VRAM it is using.

  • 00:20:35 To see the VRAM usage, I have opened a CMD window and I will type pip install nvitop.

  • 00:20:42 For this to work, you need to have Python 3.10 installed.

  • 00:20:45 Ok, it is installed. Then just type nvitop and it will show me the memory usage.
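
If you prefer to query VRAM programmatically instead of watching nvitop interactively, a minimal sketch using NVIDIA's NVML bindings (the same underlying data source nvitop reads) could look like this:

```python
# Minimal sketch: query GPU memory usage via NVML (pip install nvidia-ml-py),
# the same underlying source that tools like nvitop display interactively.
import pynvml

pynvml.nvmlInit()
handle = pynvml.nvmlDeviceGetHandleByIndex(0)   # first GPU
mem = pynvml.nvmlDeviceGetMemoryInfo(handle)    # bytes used / free / total
print(f"VRAM used: {mem.used / 2**30:.1f} / {mem.total / 2**30:.1f} GiB")
pynvml.nvmlShutdown()
```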

  • 00:20:51 So, currently I am using 8.7 GB. However, this is not what SD3 uses. To calculate it, I will

  • 00:21:00 restart the Stable Swarm UI. So let's close this. And this is my VRAM usage. You need to try to reduce

  • 00:21:08 your VRAM usage before starting to use the application; you can reduce this to as low as 500 megabytes on

  • 00:21:14 Windows 10. So we are using 2.2 gigabytes before starting the Stable Swarm UI. So let's go back to

  • 00:21:22 the folder, let's just launch Windows.bat file. This was our parent folder. And this is where the launch.bat file

  • 00:21:30 exists. And it reloaded the UI. Then go to the image history, refresh, and let's click

  • 00:21:36 this reuse parameters so everything is set and generate and let's see the VRAM

  • 00:21:43 usage. So it was 2.5 gigabytes. So currently we are using 9 gigabytes VRAM

  • 00:21:50 when generating the image. We should also see the peak. Okay the peak was 11.5

  • 00:21:57 gigabyte when it was decoding the latent image with the VAE. So it was around 11.5 gigabyte. So in total,

  • 00:22:05 it uses around 9 gigabyte VRAM. Now let's see the difference of the other text encoders. I am going to

  • 00:22:12 try only T5. T5 is extremely powerful. However, it is also going to use more VRAM on my machine.

  • 00:22:19 If your machine has a lower VRAM, I think it is going to load it onto the RAM memory

  • 00:22:25 instead of VRAM. So you still should be able to use it. Just try and see. Okay.

  • 00:22:29 Let's generate. This time it will download this text encoder model. By default it

  • 00:22:36 is downloading the fp8 version not fp16 version and it is using that. However you can also

  • 00:22:43 use the fp16 version. How can you use it? You can go to the files and versions that I have shown

  • 00:22:49 you in the beginning and download this text encoder model. This is fp16 version. After downloading

  • 00:22:56 it, move it into the models folder, inside

  • 00:23:00 clip and paste it here. Then you need to rename it exactly as this one. You see

  • 00:23:06 this is the rename that you need, otherwise it will not work. Currently

  • 00:23:09 this is a Stable Swarm UI requirement; this is how it works. However, you probably

  • 00:23:16 will not gain anything from it, because the FP8 version works as well as the FP16 version.
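
For reference, a scripted version of this download-and-rename step might look like the sketch below. The repo id, file path, and especially the target file name are assumptions; copy the exact name SwarmUI created for the FP8 file, and note the SD3 repository is gated, so you must accept the license on Hugging Face first.

```python
# Hypothetical sketch: fetch the FP16 T5-XXL text encoder and drop it into
# SwarmUI's clip models folder under the exact name SwarmUI expects.
# Requires: pip install huggingface_hub, plus accepting the SD3 license on HF.
import shutil
from huggingface_hub import hf_hub_download

src = hf_hub_download(
    repo_id="stabilityai/stable-diffusion-3-medium",   # assumed repo id
    filename="text_encoders/t5xxl_fp16.safetensors",   # assumed path in the repo
)
# Rename to exactly what SwarmUI looks for (copy the FP8 file's name),
# otherwise SwarmUI will re-download the model.
shutil.copy(src, r"R:\SwarmUI\Models\clip\t5xxl_enconly.safetensors")  # assumed name/path
```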

  • 00:23:22 So we are waiting for the text encoder to be automatically downloaded

  • 00:23:27 by the Stable Swarm UI. It is downloading pretty fast

  • 00:23:30 right now and let's watch the VRAM usage as well. Since Stable Swarm UI is using

  • 00:23:35 ComfyUI it has huge amount of performance tweaking, performance

  • 00:23:40 improvements so if your VRAM is not as high as mine it will probably load it

  • 00:23:45 into the RAM memory to make VRAM optimizations and it still should work.

  • 00:23:51 You should try it and this was the VRAM usage on my computer with only T5 and

  • 00:23:57 this is the result we got. So how can we compare?

  • 00:24:00 This is the result with T5 and this is the result with clip only version. So T5 versus clip only.

  • 00:24:09 So T5 only versus clip only images.

  • 00:24:14 Now let's regenerate it with the best configuration which is, let's go to here, which is Clip plus T5.

  • 00:24:23 So this is the best configuration and let's generate. By the way, we need to decide

  • 00:24:29 its accuracy; we need to look at the prompt and see how well it matches the prompt.

  • 00:24:36 So this is a very detailed prompt. When you're using T5, you can write a very detailed prompt

  • 00:24:42 because this text encoder is extremely powerful. And you see when I use both of them, I get

  • 00:24:48 even more amazing image like this. It is very detailed. Of course, there are some mistakes,

  • 00:24:54 but it is very powerful. I can generate several images and pick the very best one. So for

  • 00:24:59 example, let's generate 10 images with this configuration and see

  • 00:25:04 the result. I'm going to click here to set the seed random and let's generate 10 images. You see it

  • 00:25:09 says that 10 current generations, 2 running, 3 queued. And we can see the progress in the server,

  • 00:25:15 in the logs. Let's look at the speed of generation when we use the best text encoder. Let's make

  • 00:25:22 this debug to see them. Let's scroll down. So currently I am getting 2.5 it per second on

  • 00:25:30 RTX 3090 GPU, it is using 18GB VRAM currently on my system. I have 24GB with RTX 3090.

  • 00:25:40 And it is very good speed, 2.5 it per second. This is a very decent speed,

  • 00:25:45 considering that we are using a very powerful text encoder. Let's go back to the generate.
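
(For reference, at 2.5 it/s a 40-step generation works out to roughly 40 / 2.5 = 16 seconds of sampling per image, before the VAE decode.)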

  • 00:25:50 We can also see how it is being generated. And we can see the generated images already.

  • 00:25:56 Okay, these are very, very good, very high detail images. Wow, this one is also really good, looking really good.

  • 00:26:04 And since this is a new model, we need to figure out how to prompt it accurately.

  • 00:26:08 This is just a 2-billion-parameter model. Hopefully, Stability AI will also release the much more powerful 8-billion-

  • 00:26:16 parameter model. And hopefully it will be many times better than this model.

  • 00:26:21 And also there will be community fine-tuned models.

  • 00:26:25 They will be very powerful with these text encoders and with the VAE

  • 00:26:30 of the Stable Diffusion 3 because the VAE of the Stable Diffusion 3 is many times

  • 00:26:35 more powerful. What do I mean by that? In this research paper there is a VAE channels comparison. When

  • 00:26:42 you scroll down you will see that the 4-channel VAE is able to regenerate this image, the 8-channel

  • 00:26:49 is able to regenerate this image, and 16 channel VAE is able to regenerate this image and SD3

  • 00:26:57 Stable Diffusion 3 is using 16 channel

  • 00:26:59 VAE. So therefore, its VAE is extremely powerful. And if you wonder what a Stable Diffusion

  • 00:27:06 VAE is, this is the description of it. Just pause the video and read it if you want. Okay, our

  • 00:27:12 10 images are generated. Let's look at them. So this one, okay, this one is pretty cool.

  • 00:27:19 This one, this one, and these images. Now, how can we further improve this image? We

  • 00:27:27 change how we prompt and we can also

  • 00:27:30 upscale this image. I also have done some upscaling testing and now I will show you how to upscale your

  • 00:27:37 images to get even better quality results. So when you return back to this page you will see that

  • 00:27:43 there is Helaman upscalers with thumbnails. When you click here you will get to this page, Open

  • 00:27:50 Model DB; maybe you already know this website where the upscaler models are listed. You can download

  • 00:27:58 the upscaler that will be useful for you. And you can see which upscaler is working on which kind of

  • 00:28:05 images. You see there are so many different upscalers for each task. I'm going to use a new upscaler.

  • 00:28:12 You will see that there is download very best new upscalers released sorted by name. When

  • 00:28:18 we go to this link, this will open this guy's GitHub page and you will see the latest released

  • 00:28:25 upscalers. There is 4x real web photo. This is a very good upscaler. I will download this one.

  • 00:28:31 When you click that link, you will get to this page and there are some examples of upscaling like this.

  • 00:28:38 At the very bottom of the GitHub page, you will see this link: 4x real web photo version 4.pth.

  • 00:28:48 This is the file that we need. Just download it. There is also a safetensors version.

  • 00:28:53 I wonder if it is working. Let's also download it and we can test both of them.

  • 00:28:56 You can go to your downloads folder, cut, move

  • 00:29:00 into the models, inside StableSwarmUI into the upscale models and paste it there.

  • 00:29:07 You see, this is the folder path where you need to put it. Then in here we will use the refiner.

  • 00:29:14 We will enable it and we are going to change the refiner upscale method. Okay it is not visible yet

  • 00:29:21 so I'm going to restart to get them. Just close this CMD window go back to the StableSwarmUI

  • 00:29:28 launch Windows.bat file. Okay, it is reloaded very fast. And whichever image you want to upscale, for example,

  • 00:29:36 let's upscale maybe, let's see. Okay, let's say let's upscale this image. So click reuse

  • 00:29:44 parameters. So it did set every parameter. Then in here, I will use this. Okay, it also

  • 00:29:51 supports safetensors, very good. A safetensors file is a safe file; it cannot contain any malicious code.

  • 00:29:59 Refiner control percentage. Now this is very important. This is the same as the denoising that

  • 00:30:05 we have in Automatic 1111 Web UI. So this will decide how much change you want. I will make

  • 00:30:12 this 50% and you see there is refiner method post apply, step swap, step swap noisy. When

  • 00:30:19 you click this icon, you will see the difference between them. Refiner steps is how many steps you

  • 00:30:23 want to use, you can set it. So I will just set it as 40

  • 00:30:28 as well like this and you can change the refiner model.

  • 00:30:31 I am going to use the same model as the base model, but you can use any model for refinement

  • 00:30:36 process. What this does is, based on your refiner upscale method, it is going to use that new

  • 00:30:43 model during the upscale. Latent and pixel are different. You can test them and see the differences.

  • 00:30:50 And then there is also refiner do tiling. If you do tiling, it will make seaming. What does seaming mean?

  • 00:30:57 Seaming means that, let's say, it will generate multiple heads of this image.

  • 00:31:03 However, it will also reduce the VRAM usage.

  • 00:31:05 But with this upscaler, I am not going to use refiner to do tiling.

  • 00:31:09 And I am going to upscale 1.5 times. You see there is the upscale times.

  • 00:31:14 Let's make it 1.5 and let's generate. So currently, it is generating 10 images because it is set as 10

  • 00:31:22 last time. I am going to just cancel the operation by clicking this X icon.

  • 00:31:26 Then I will set the number of images to 1. How did I understand?

  • 00:31:30 Because it's shown here. Let's hit generate. You see one current generation, one running and now

  • 00:31:36 first it will generate base image, then it will use this upscaler to upscale. Since I am using the

  • 00:31:43 both of the CLIP and T5 text encoder it is using a lot of VRAM initially, then it will start

  • 00:31:49 upscaling process. Okay, it didn't start. Why? Because I didn't enable it. Don't forget to enable these

  • 00:31:57 features. Let's enable it and generate.

  • 00:32:00 And it will generate. So where are these images saved? They are saved inside

  • 00:32:04 output folder, inside local, inside raw, and the date. And we can see all the

  • 00:32:10 generated images are saved here. Also, there is image history. You can search,

  • 00:32:16 you can sort by name, date, reverse sort. You can see the depth like this. This is

  • 00:32:21 a very, very convenient way to use. Very, very good. There are also presets,

  • 00:32:26 wildcards. I will also show them in this tutorial. Okay, the upscaling process started.

  • 00:32:32 So it is using the same amount of VRAM, nothing new.

  • 00:32:35 Let's go to the server and see the speed of upscale. Let's go to the debug.

  • 00:32:41 By the way, when you use tiled upscaling, it is the same speed as the original because

  • 00:32:46 it tiles the image into 1024-pixel tiles and upscales each one.
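
(For example, a 1.5x upscale of a 1024x1024 image yields 1536x1536; at a 1024-pixel tile size that is processed as a handful of overlapping tiles, roughly a 2x2 layout here depending on the overlap the backend uses, which are then blended back together.)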

  • 00:32:51 However, it causes some issues when you have a higher rate of refiner control percentage.

  • 00:32:57 And let's refresh here and the image has arrived. Okay, wow, just wow. Look at this quality.

  • 00:33:04 So let's open this in a new tab and let's also open the original image.

  • 00:33:09 This was the original. Let's compare. So this was the original image and this is the upscaled image. You see?

  • 00:33:15 Now I will show you a comparison on imgsli.com. Let's make a new album. Then let's add the images.

  • 00:33:21 So this is the first image and this is the second image. Let's upload. I like this comparison website.

  • 00:33:28 Okay, let's make this full screen.

  • 00:33:30 Okay, so the left one was the original image, low resolution 1024 and the right one is the 1536 upscaled.

  • 00:33:39 You can see how much detail, sharpness, focus, clarity is added. This is just mind-blowing.

  • 00:33:46 This upscaler is also very good. There are so many other upscalers, you can try them.

  • 00:33:51 You see amazing, just amazing. You can see how much difference it makes.

  • 00:33:55 You see, it is just mind-blowingly amazing quality. It is just amazing.

  • 00:34:01 Then there is also a link here to download all of the upscaler models.

  • 00:34:06 When you go to this link on the Patreon post, you will get to this page.

  • 00:34:10 In here, this developer uploaded all of the models as a zip file.

  • 00:34:16 You see, there are two zip files because he had to split them under 2 gigabytes.

  • 00:34:22 So download both of them and extract them into the upscaler models folder

  • 00:34:27 and you will be able to use all of the upscaler models.

  • 00:34:30 of this guy. So what other features does this application have? One of the very best

  • 00:34:34 features of this application is presets. You can generate as many presets as you

  • 00:34:40 want and use them with one click. So when I click create new preset here, I can set

  • 00:34:45 the preset image like this and I can set what prompt I want. You see there is

  • 00:34:50 value. When you set the value, it is going to append the new prompt that you have, let's

  • 00:34:56 say this. So my new prompt will go here and this will be the rest of the prompt.

  • 00:35:03 Then there are other parameters that you can set. Currently it is going to use my set parameters here.

  • 00:35:09 So the sampling should be like here, you see. Then click save.

  • 00:35:14 Okay, we need to set a name, let's say 3D. And save, and it is set.

  • 00:35:20 Now I can change the parameters here like this. And when I click this, you see it says one

  • 00:35:27 and overriding zero parameters, because it didn't set all the parameters.

  • 00:35:33 So we need to select the parameters from here. Okay, these are the best parameters.

  • 00:35:38 And if you want the refiner step to be automatically applied, you can also select it.

  • 00:35:43 So let's also select it like this 0.5. Let's say refiner steps 40. Okay, and refiner upscale method.

  • 00:35:51 So you can pretty much set every parameter here, and I will make it upscale 1.5 times.

  • 00:35:58 You see like this refiner upscale and resolution is like this. Let's also set it to be perfect

  • 00:36:04 like this. I will also set the steps count and CFG scale. Okay, everything is set. Then let's just save

  • 00:36:11 it. Okay, it is set and saved. It also shows here and let's re-set it. Okay, it says

  • 00:36:18 overriding 13 parameters. You can see the overridden ones and let's just type a new prompt like a cat

  • 00:36:25 and generate. So it will put a cat into here value and the rest will be also applied.

  • 00:36:31 This is how the presets work. We are getting the cat, and it will also do refiner step

  • 00:36:37 because it is set in here. You can also scroll down with your mouse or here

  • 00:36:42 to see all the set parameters of this preset. And the first image is generated.

  • 00:36:49 You see, I am generating 10 images, so I am going to click here to cancel.

  • 00:36:53 There is also this arrow. When you click this arrow, you will see generate, generate forever,

  • 00:36:58 generate previews, interrupt current session, interrupt all sessions and other things.

  • 00:37:03 So you can also use generate forever to generate unlimited number of images.

  • 00:37:08 And this is the cat image we got. It is looking very good actually.

  • 00:37:13 And you will notice that there are some mistakes in the borders.

  • 00:37:18 This happens when you don't use tiled upscaling.

  • 00:37:22 However, when I do the tiled upscaling, it may cause some other problems. So let's say reuse parameters.

  • 00:37:30 So all the parameters of this is set and I will change the refiner do tiling here and generate.

  • 00:37:38 So it should fix the error at the borders, however, we may get seaming problem.

  • 00:37:44 And seaming problem is that it will repeat some of the subjects in the image.

  • 00:37:50 You see, it says that this can fix some visual artifacts from scaling but also introduce others, e.g., seams.

  • 00:37:57 Let's see if we will get seaming problem in this image.

  • 00:38:00 Okay, let's just wait. When doing tiled upscaling you will notice that it is split into

  • 00:38:06 the tiles like this and it is upscaling each tile then it will merge them all.

  • 00:38:12 So you can also see the progress here. Okay, it is done and still generating ten images so let's

  • 00:38:18 just stop it. And this is the image. I don't see a noticeable seaming in this image and you

  • 00:38:24 can see that those blurry lines at the borders are gone. So this

  • 00:38:29 fixes that issue. So let's make a comparison. Let's refresh the images history. This is

  • 00:38:35 without tiling. This is with tiling and this one is without tiling. Let's open them. So

  • 00:38:41 you see this is without tiling. We can notice these errors at the borders and this is with

  • 00:38:47 tiling and you see there are no such things. And when we compare both, they both look perfect.

  • 00:38:53 I think for this image with these settings, tiling worked better. So it is up to you to

  • 00:38:59 test and see in your case whether it is working well or not.

  • 00:39:02 Now I will show you another very cool feature. I have prepared 275 presets with their generated thumbnail images.

  • 00:39:12 So click here to go to that post and in this post, you will see that there are several things.

  • 00:39:18 First of all, I also have shared the styles file for Automatic1111 Web UI.

  • 00:39:24 Moreover, I am sharing the necessary files to prepare these presets.

  • 00:39:29 Let's download the zip file and then let's extract it into our downloads folder. Enter inside the extracted folder.

  • 00:39:38 You see there are python files. I will show you the content of those files as well.

  • 00:39:43 So this is the convert Fooocus csv to Swarm preset. This is how it is done.

  • 00:39:49 Then there is also convert Fooocus styles to the Automatic1111 Web UI style.

  • 00:39:54 This is how it is done. You just set the folder path, output csv file path.

  • 00:39:59 In here also we set the CSV file. It will be overwritten. And then there is also generate presets thumbnail.

  • 00:40:07 So this is how I generated thumbnails. For generating thumbnails, first of all, you need to generate all the thumbnails

  • 00:40:14 with the grid generation I will show. This is not mandatory.
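
The zip contains the author's actual scripts; as a simplified, hypothetical illustration of the idea, a conversion like the one below turns style rows into an importable presets JSON. The CSV column names and preset field names here are assumptions; use the downloaded scripts for the real conversion.

```python
# Hypothetical sketch of the conversion idea: Fooocus-style CSV rows -> a
# SwarmUI presets JSON for import. Column and field names are assumptions.
import csv
import json

presets = []
with open("fooocus_styles.csv", newline="", encoding="utf-8") as f:  # hypothetical input
    for row in csv.DictReader(f):  # assumed columns: name, prompt, negative_prompt
        presets.append({
            "title": row["name"],
            "description": "",
            "param_map": {  # assumed field layout of an exported SwarmUI preset
                "prompt": row["prompt"],
                "negativeprompt": row["negative_prompt"],
            },
        })

with open("swarmui_presets.json", "w", encoding="utf-8") as f:
    json.dump(presets, f, indent=2)
```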

  • 00:40:19 So, you see there is the Stable Swarm UI presets json file, which we are going to load.

  • 00:40:24 Go to presets menu and click import presets. Then from here, click choose file.

  • 00:40:29 Enter inside your downloaded file and select this presets json file and open it.

  • 00:40:34 Then you can also enable overwrite existing presets if you have the same preset name which I don't have now.

  • 00:40:40 Ok click import and you see all the presets of the amazing Fooocus are now available on the Stable Swarm UI.

  • 00:40:50 So this is how I prepared and generated them. For example, let's generate this image with the dark fantasy.

  • 00:40:57 Ok, to generate it with dark fantasy,

  • 00:41:00 I'm going to just type a cat and click here, and that style is applied and hit Generate.

  • 00:41:07 By the way, this doesn't have currently upscale enabled. Okay, it is generating the image.

  • 00:41:13 We are still at 10 images because we reused the parameters.

  • 00:41:17 Moreover, it also applies the settings that you have, which are not conflicting with the applied style.

  • 00:41:25 Also, it is applying the refiner step right now, because in these

  • 00:41:30 presets there are no such settings. Let me show you one of these presets. It has the prompt,

  • 00:41:36 it has the negative prompt, 40 steps, CFG scale 7, sampler is UniPC, scheduler normal, and it uses

  • 00:41:43 Clip plus T5 encoder, like this. You can also edit this JSON file before import and change the settings as you wish.
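
A minimal sketch of such an edit, assuming the file is a JSON list of presets with a "param_map" of parameters (inspect your file first; the key names are assumptions):

```python
# Hypothetical sketch: bump the step count of every preset before importing.
# "param_map" / "steps" are assumed key names -- check your exported JSON.
import json

with open("swarmui_presets.json", encoding="utf-8") as f:
    presets = json.load(f)

for preset in presets:
    preset.setdefault("param_map", {})["steps"] = 30  # assumed key names

with open("swarmui_presets.json", "w", encoding="utf-8") as f:
    json.dump(presets, f, indent=2)
```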

  • 00:41:52 Okay, the image is generating and getting upscaled, and we got the first image.

  • 00:41:58 You see with this style,

  • 00:42:00 I was able to generate this amazing image, as you are seeing right now. It is a

  • 00:42:04 very, very good quality. Okay, let's just

  • 00:42:06 cancel it. Now, what other things are there? Another great feature of Stable Swarm UI

  • 00:42:12 is the model browser. When you click the models tab, it will list all of

  • 00:42:17 the models. You see there is also

  • 00:42:20 categorical filtering as well. So when you click root here, it will show you all.

  • 00:42:25 When you click the folder name, it will show you under that folder. You can also see the folder structure

  • 00:42:30 here. You can also filter by name. There is also cards, small cards, big cards. So you can also

  • 00:42:36 change these as well. Thumbnails, small thumbnails. This applies to presets too, like big cards like

  • 00:42:44 this or small thumbnails or thumbnails. You can change both of them. And in image history, you

  • 00:42:50 can also change them to small thumbnails, big thumbnails, or big cards as well. So it is totally

  • 00:42:56 up to you to change these as well. It is very versatile. In the models tab you can also set an

  • 00:43:04 image from the edit metadata, and in here you can say use this image and save,

  • 00:43:10 so it will use that image as an image of the model.

  • 00:43:15 For some reason, it didn't save with the correct size. Yeah, I'm going to tell this to the developer,

  • 00:43:22 so hopefully it will be fixed. Also, when you click here, you see there is also create TensorRT engine

  • 00:43:29 and set as refiner. Hopefully, in another tutorial, I will show you more detailed stuff regarding

  • 00:43:36 these as well. Okay, after I reported this error, the developer already fixed it. So

  • 00:43:42 let's also try the fixed version. I'm going to update the application one more time. Let's

  • 00:43:47 go to the folder and run the update Windows .bat file. You see one file changed; he already fixed

  • 00:43:54 it and done and then restart the application. Okay, let's return back here.

  • 00:44:00 Click here, edit metadata and use image, save. Okay, it says server has updated,

  • 00:44:07 so we need to refresh it. Okay, I did refresh the page. Let's click edit metadata, use image.

  • 00:44:15 And yes, you see it is already fixed, amazing. The updates and the communication with the developer

  • 00:44:22 are just amazing with Stable Swarm UI. Now, what other things are there?

  • 00:44:27 First of all, I suggest you to read some of the documentation which is full prompting syntax. This is

  • 00:44:34 super important. When you click this link, you will see the advanced prompt syntax here. It explains everything: weighting

  • 00:44:42 of the prompts, alternating, from-to, randomization, wildcards, repeat, textual inversion embeddings,

  • 00:44:51 LoRAs, presets, automatic segmentation and refining. This

  • 00:44:55 is extremely useful to inpaint faces. This is like after detailer extension.
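
A few illustrative examples of the syntax categories listed above (the exact tags are approximations from the linked Prompt Syntax doc; verify them there before relying on them):

```
(epic castle:1.4), a photo of a cat
a cat, <wildcard:randomcolor>
photo of a man <segment:face> detailed face, sharp eyes
```

The first line weights part of the prompt, the second pulls a random line from a wildcard file, and the third auto-masks the face and refines just that region, like the After Detailer extension mentioned above.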

  • 00:45:01 You see clear transparency and break keywords. So this is extremely important and when

  • 00:45:06 you go to the app folder here, features, you will also see other usages like control net. When you

  • 00:45:12 read here, you will see the control net. Hopefully I will make control net tutorial as well. I don't

  • 00:45:16 want to put everything into one tutorial because it would be very, very long. There is also the presets readme

  • 00:45:23 video. So make sure to read them. There is also the docs app folder, and you will see even more readme files.

  • 00:45:31 You should read them to learn them. What other things can you do? Let's also try a

  • 00:45:36 wildcard. A wildcard simply puts in random prompts. You see, wildcards are lists of random

  • 00:45:42 prompt segments, one entry per line. Let's name it random color like this, and let's enter blue, red,

  • 00:45:51 yellow. Okay, let's save and refresh. Okay, random color has arrived.
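
In other words, a wildcard is just a plain text file (for example, randomcolor.txt in your SwarmUI wildcards folder; the exact folder depends on your install) with one entry per line:

```
blue
red
yellow
```

Referencing it in a prompt then picks one line at random on every generation.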

  • 00:45:58 So what we are going to type is a cat. So when I

  • 00:46:00 type this letter you see it shows all of the available wildcards so I'm going to use the

  • 00:46:06 wildcard that I have made, which is random color. Let's type it. Okay. Okay, I don't see random color

  • 00:46:13 here yet. Let's refresh; maybe I need to restart. Okay, I can't see it; maybe I need to click here.

  • 00:46:18 Yes. After I clicked this it appended it into the prompt like this: a cat and

  • 00:46:25 random color. You need to have different seeds to get random results each

  • 00:46:30 time so let's disable the refiner and let's generate 3 images with random seeding

  • 00:46:35 from here. Okay, everything set and generate and our 3 images are generated. Let's

  • 00:46:40 see the prompts used for each one of them. Okay, this one used original prompt a cat

  • 00:46:46 wildcard random color. To see the full details, click the images, and at the bottom we can

  • 00:46:51 see which is used. Okay, a cat blue. You see this was a blue randomization. Let's move.

  • 00:46:58 This was also a cat

  • 00:46:59 blue and this was a cat red. So this is how wildcards work. Just read the documentation

  • 00:47:07 and you will see. This interface, this web UI, is extremely advanced; there is so much

  • 00:47:12 stuff. So as a next step, I will show you a very powerful thing, which is inside tools.

  • 00:47:17 And in here, when you click here, you will see grid generator. This grid generator is

  • 00:47:23 equal to X/Y/Z plot of Automatic1111 Web UI. But this one is much more powerful, advanced and versatile.

  • 00:47:30 I will show you how to use it right now. Before showing you, what I want to do is,

  • 00:47:35 I want to put all of the upscalers into the upscalers folder.

  • 00:47:40 So, since we have downloaded them, let's cut them and move them into the upscale folder,

  • 00:47:47 which is inside Stable Swarm UI, models, upscale models here.

  • 00:47:52 Okay, you need to put both of the zip files like this from downloads and extract here,

  • 00:47:57 and it will extract all of the upscalers as a

  • 00:48:00 safetensors like this. You see. Let's sort them by size. Okay, I will use the

  • 00:48:06 biggest ones and I will do some testing to see which one performs better. Okay, there are just so

  • 00:48:11 many. Let's also click yes

  • 00:48:13 to all. Yeah, it is done. So now I need to restart the Stable Swarm UI. I will just

  • 00:48:19 do that: let's close it, let's return back and run the launch Windows .bat file, and it is started. Okay. So

  • 00:48:27 what should we test? Let's upscale this dragon image,

  • 00:48:30 re-use parameters so all parameters are set and let's go to the tools. Click here,

  • 00:48:36 select grid generator. Now there are three options of output type I suggest

  • 00:48:41 you to use web page. This is the best one. This gives you so many options. You can

  • 00:48:46 also just generate a grid image or just images as well. So when you use just

  • 00:48:51 images they will be just generated as an image and saved in the outputs folder.

  • 00:48:55 When you generate grid image it will just generate a grid image but when you

  • 00:49:00 generate a webpage, it can continue the generation if it gets interrupted for some reason, and

  • 00:49:08 you can filter so many different options to see. So you see output folder name will be given like this.

  • 00:49:13 I will just give a custom name myself. Let's say upscale testing like this.

  • 00:49:19 So it will be saved inside View/local/grids upscale_testing and there is continue on error

  • 00:49:24 so it will continue generation. Then when you click here you will see all

  • 00:49:30 of the options that you can select. You see there are so many options. So let's test several things

  • 00:49:35 to be able to compare. Let's test steps. Let's make it 20, and also 40. If you don't

  • 00:49:43 know how to type them, you can just click examples and it will fill with the examples.

  • 00:49:47 Let's just delete them. 20 and 40. Okay, then go to the refiner upscale, select it and click

  • 00:49:53 examples. You see these are the upscale resolutions. You don't need to set them if you set them here.

  • 00:49:59 So we don't need to set everything here. So what we are going to test,

  • 00:50:04 we are going to go Refiner, enable it. Refiner control percentage, actually I will test this too.

  • 00:50:10 And let's set the steps 40 from here, so it will use that.

  • 00:50:13 Refiner upscale 1.5, so it will use that. Okay, so what I am going to test is, Refiner upscale method.

  • 00:50:22 And when I hit backspace in here, it will show me all of the options.

  • 00:50:25 You see, all of the options are here. There are so many options, since we downloaded everything.

  • 00:50:30 So which options do we want to test? Let's go to Models, upscale models.

  • 00:50:35 And the biggest in size are, for example, let's see. Yeah, 4x RealWebPhoto. This is a very good model.

  • 00:50:42 Let's select it. Okay, 4x RealWebPhoto. Okay, here you see 4x RealWebPhoto version 4, like this.

  • 00:50:51 Let's also try 4x LSDIR. Okay, here. Let's put a comma, then 4x; you can also type LS.

  • 00:51:00 Okay, it lists all of them. I think this one. Yes.

  • 00:51:04 So I'm going to compare two methods and two different step counts. And I will also test refiner do tiling.

  • 00:51:11 Then, in here, I will click fill, so true and false are filled.

  • 00:51:15 Okay, so how many tests is it going to do? 2 multiplied by 2

  • 00:51:20 multiplied by 2, so 8 different tests, because I'm generating one image per combination. It

  • 00:51:25 will read all of the other parameters from whatever I set here (see the sketch below).
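Under the hood, a grid is just the Cartesian product of the axis values. A minimal sketch of that counting logic; the axis names and values here are hypothetical stand-ins for what was selected in the UI:

```python
from itertools import product

# Hypothetical axes mirroring the video: 2 step counts x 2 upscaler
# models x 2 tiling flags = 8 runs; all other parameters stay fixed.
axes = {
    "steps": [20, 40],
    "refiner_upscale_method": ["4x RealWebPhoto v4", "4x LSDIR"],
    "refiner_do_tiling": [True, False],
}

runs = [dict(zip(axes, combo)) for combo in product(*axes.values())]
print(len(runs))  # 8
for run in runs:
    print(run)
```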

  • 00:51:30 And just click generate grid, and you see: 9 current generations, 5 queued, waiting on model

  • 00:51:37 load. We can also go to Server, Logs, and Debug here, and we can see what is

  • 00:51:45 happening. This is running with ComfyUI at the back end, but it is saving us a huge amount of time and

  • 00:51:52 effort. We don't need to deal with those annoying ComfyUI nodes; it is handling

  • 00:52:00 everything for us automatically. So let's go back to the server. You can see what is happening here.

  • 00:52:06 ComfyUI back-end direct web socket request, 2.6 it per second. Let's go to generate.

  • 00:52:12 9 current generations, 2 running, 3 queued. And let's see the server. Okay, so it is generating images.

  • 00:52:19 Let's go to the grids inside here: inside output, inside local, inside grids. We can see the

  • 00:52:27 folder, upscale testing, and we can see the generated images. Currently none are generated, because upscaling will take more time.

  • 00:52:37 Okay. Okay, it says that, yeah, completed generation 1 of 10, refiner do tiling True.

  • 00:52:43 So these were the settings, and it is saved. Let's go to 201. You see, it shows the upscaler method name,

  • 00:52:51 and this was the upscaled image. Okay, the others are also getting generated.

  • 00:52:57 So we just need to wait, and it is running. Let's say something happened and you were interrupted.

  • 00:53:04 So how can you continue? To continue, first of all, I will just cancel the operation.

  • 00:53:10 You see, it is canceling. Yes, generation session interrupted.

  • 00:53:13 Then I will click load grid config, and it will show you the history.

  • 00:53:19 Currently I only have one history entry, of the grid upscale testing: load grid config.

  • 00:53:23 Then make sure that you have the same output folder name to continue. It says output will override

  • 00:53:29 the existing folder. Okay, that's fine. And hit generate grid. Now it should skip what it

  • 00:53:36 has generated and continue. We can see that in the CMD window. Okay, it says skipped 1

  • 00:53:42 file, because only one generation had been completed. So it will generate the remaining

  • 00:53:48 seven files. This is just amazing. I wish the Automatic1111 Web UI also

  • 00:53:54 had this grid generator. You can put many things here, like

  • 00:54:00 X/Y/Z and any others; we are not limited to testing only 3 axes.

  • 00:54:05 I can add as many as I want here, and I can compare all of them afterwards.

  • 00:54:09 I will show you; this is just the very best grid comparison tool available.

  • 00:54:15 Okay, so the grid generation has been completed. I had to restart the application once and continue,

  • 00:54:22 because VRAM was full, since we used very heavy upscaling methods.

  • 00:54:28 So how are we going to view this grid

  • 00:54:30 after it has been completed? Click here to open the folder, and you see it is already opened.

  • 00:54:37 Now, the interface may be a little bit hard to understand in the beginning. When you click

  • 00:54:43 advanced settings, you will see that there are amazing options to hide each one of the parameters

  • 00:54:50 as you wish, like here. Also, you can change which parameter will be displayed where. For example,

  • 00:54:57 when you set both of them to upscale method, it will show the upscale method here and here.

  • 00:55:02 Refiner do tiling will change the order. So you can play with these orders and

  • 00:55:08 display whichever of them you wish. So let's set them. So this true is

  • 00:55:15 tiled and this false is not tiled. The upscaler method is also displayed here. Let's see.

  • 00:55:23 Okay. So this is not what we want. We need to change it to the view we want.

  • 00:55:28 Okay, now it displays steps on the left. It displays false and true and the upscaling method. Okay, this

  • 00:55:36 is the upscaling method, 20 steps and 40 steps, and we have tiled True, and we have the other

  • 00:55:45 one. So this upscaler obviously didn't work as we wanted: model 4x

  • 00:55:50 RealWebPhoto version 4. Okay, this is weird,

  • 00:55:54 because this one is perfect, so we have an error somewhere. Okay, the error is the number of steps.

  • 00:55:59 20 steps resulted in this abomination, unfortunately, but 40 steps made this one, and this has tiled True.

  • 00:56:10 And now I will show you the effect of tiling set to true. You see, there are eyes and another head here.

  • 00:56:17 These are called seams. When you do tiled upscaling, you need to go with a lower denoise strength.

  • 00:56:25 The denoise strength is the refiner control percentage. You need to go lower

  • 00:56:30 to prevent it, something like 30% instead of 50%. However, when you don't do

  • 00:56:38 a tiled upscale, which is the second option here, you see false. This false is the tiled

  • 00:56:44 upscale here, so I can also disable it from here; you see false and true. So when

  • 00:56:50 tiled upscale is false, we won't get those seams. Actually, in this image the seams

  • 00:56:56 are much worse. You see, there are seams here, here, here, and here.

  • 00:57:01 And when tiled upscale is disabled, the seams are extremely reduced. However, now we can see the

  • 00:57:08 degradation in quality at the borders, because we upscaled to a huge resolution. But at the bottom, with 40 steps,

  • 00:57:18 it is looking many times better. 40 steps is a sweet spot, but you can go up to 100 steps;

  • 00:57:23 it works even better with more steps. So this is it. And then there is the third one,

  • 00:57:29 which is the other upscaling model, 4x LSDIR HAT. This is a very heavy model, by the way;

  • 00:57:37 it requires a huge amount of VRAM. This is the upscale with tiling,

  • 00:57:42 and you see there are a lot of seams and repetition of the subject: the head is also generated here

  • 00:57:47 and also here. And at the bottom, you see, there is also a head here because of the tiling.
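To make the seam issue concrete, here is a rough, purely illustrative sketch of what a tiled upscale does (this is not SwarmUI's actual implementation): the image is split into tiles, each tile is processed independently, and the results are pasted back. Without overlap and blending between tiles, and with a high denoise, the joins can show seams or repeated subjects.

```python
from PIL import Image

# Illustration only: split into tiles, upscale each tile independently,
# paste the results into a larger canvas. Hard tile borders with no
# overlap/blending are where seams and duplicated heads can appear.
def tiled_upscale(img: Image.Image, tile: int = 512, scale: int = 2) -> Image.Image:
    out = Image.new("RGB", (img.width * scale, img.height * scale))
    for y in range(0, img.height, tile):
        for x in range(0, img.width, tile):
            box = (x, y, min(x + tile, img.width), min(y + tile, img.height))
            piece = img.crop(box)
            # Stand-in for the real per-tile refine + model upscale:
            piece = piece.resize((piece.width * scale, piece.height * scale), Image.LANCZOS)
            out.paste(piece, (x * scale, y * scale))
    return out

result = tiled_upscale(Image.open("dragon.png").convert("RGB"))
result.save("dragon_tiled_2x.png")
```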

  • 00:57:53 When we don't do tiling, this is the result, and this is the result. So how can we actually

  • 00:57:59 compare them? To compare them, I'm going to remove the 20 steps. I'm going to remove the

  • 00:58:07 tiling true, and now I can compare the two models side by side and decide which one looks

  • 00:58:14 better. Actually, both of them look very good. So this is the first and this

  • 00:58:19 is the second image. First and second image. Okay, now it is more visible like this. You

  • 00:58:25 see. Okay, in the right one there is also another nose here,

  • 00:58:30 you see, and this is the one. So let's copy them and use imgsli to compare. Okay,

  • 00:58:36 let's click new album.

  • 00:58:37 And can we paste an image? No. So what can we do? So this is the model 4x RealWebPhoto.

  • 00:58:45 Let's save this

  • 00:58:47 as model 4x RealWebPhoto, and let's save this one as model 4x LSDIR HAT. Okay,

  • 00:58:56 like this; let's add the images and upload. I really like this

  • 00:59:00 website; it is really cool, and we can see now. Okay, now the left one is 4x LSDIR. This is

  • 00:59:07 very heavy on VRAM, and the right one is 4x RealWebPhoto. And I can say that, yeah, LSDIR added

  • 00:59:15 more details. I can see that it is sharper. You can see it on the scales, for example; let

  • 00:59:20 me show you. The scales look more detailed. The eyes also look more detailed. Let's

  • 00:59:26 look up close. Yes, yes, it added

  • 00:59:29 more details, that's for certain. It is sharper, more focused. However, it also

  • 00:59:35 added some extra nose here, but I think it is looking very, very good. Yeah, this

  • 00:59:41 upscaler is very, very good, but it requires a lot of VRAM. So if you don't

  • 00:59:45 have 24 gigabytes of VRAM, you probably won't be able to use it, but it is just

  • 00:59:49 amazing, amazing quality, you see. So this is how you can use the grid. You can use it

  • 00:59:55 for anything. I have actually done even bigger tests. Let me show you; it was inside

  • 01:00:00 output, inside grids. This is another installation of mine. You see, there is a lot of testing.

  • 01:00:05 Okay, this one is 140 megabytes. And you see, I have tested a lot of different options here:

  • 01:00:11 the sampler, steps, CFG value. So I can set them differently, from prompt,

  • 01:00:18 text encoder, CFG, and steps. You can display any one of them as you wish. This is a really,

  • 01:00:24 really powerful tool. Just play with it to understand it better. Okay, another very useful

  • 01:00:30 feature of Stable Swarm UI is the image history. In the image history there

  • 01:00:35 is already a folder structure, like raw outputs. Images can be categorized by

  • 01:00:41 folder names, like the grid outputs with the grid testing names, from here you see upscale

  • 01:00:46 testing, or the raw outputs. Moreover, you can also filter them by a prompt that

  • 01:00:52 you used. Let's say cat, and it will filter all the images whose prompt contains the word cat.

  • 01:01:00 Actually, it is doing not only a whole-word search but a substring match,

  • 01:01:04 so "delicate" is also displayed. You see, "cat" is here, inside "delicate".
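In other words, the filter behaves like a plain substring match over each image's stored prompt. A tiny illustrative sketch (the prompts here are made up):

```python
# Toy version of the history filter: a plain substring match over each
# image's stored prompt, which is why "cat" also matches "delicate".
images = {
    "img1.png": "a delicate vase on a table",
    "img2.png": "a cat, blue",
    "img3.png": "a dragon over a city",
}

query = "cat"
matches = [name for name, prompt in images.items() if query in prompt]
print(matches)  # ['img1.png', 'img2.png']
```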

  • 01:01:10 So this is how useful it is. If you remember a word from among your generations,

  • 01:01:17 you can find them very quickly by filtering. You can also sort them by date or by name.

  • 01:01:23 You can also reverse the sort. You can click here and display all the contents of that folder.

  • 01:01:29 You can search within that folder.

  • 01:01:31 You can go to another folder and search in there. This is just amazing, and very easy to use.

  • 01:01:38 The Automatic1111 Web UI is missing this feature, but you see, with Stable Swarm UI,

  • 01:01:43 you can quickly find your generated images, reuse their parameters, and regenerate them.

  • 01:01:50 Just click and see them. The image history is just an amazing feature of

  • 01:01:53 Stable Swarm UI. And if you are wondering how Stable Swarm UI is

  • 01:01:57 able to do that: when you go to the

  • 01:02:00 output, go to local, go to raw, you will see that each folder has database files

  • 01:02:06 at the end: image metadata and image metadata log. So it is using these files to power its

  • 01:02:14 image history management. But this is just amazing. You can also directly delete images

  • 01:02:20 from the gallery; you see, there is an icon on every image. When you click this icon, you

  • 01:02:26 will see the star, open in folder, download, and delete

  • 01:02:30 options. So when you click delete, it will directly delete the image. This is

  • 01:02:34 very convenient to use as well. Moreover, you can star them too. So let's say I want

  • 01:02:41 to star this; I can star that image, and your starred images will appear here. You see,

  • 01:02:46 when I click it, I am able to quickly find my starred image. This is amazing. You

  • 01:02:52 can also directly star images from here as well. For Stable Diffusion 3, the 2x upscale is not working, so

  • 01:03:00 use the way that I have shown for upscaling. Also, you see, the more option is available

  • 01:03:06 here as well, as well as this icon here for each image. Okay, now, as a next step, I will

  • 01:03:14 show you how to use SDXL models and LoRAs in this amazing Web UI. Okay, for example,

  • 01:03:21 let's try Pixel Art XL on the base SDXL model. Let's go to utilities and the model downloader.

  • 01:03:29 Let's just paste the link. Okay, it says model. So we need to go to the full link like this

  • 01:03:36 and just paste it. Okay, it works. Yeah. So this is it. You see, it automatically recognized

  • 01:03:41 that it is a LoRA model, and it just downloads it. Sometimes these downloads may be behind a login. In

  • 01:03:49 that case, you have to download the file manually. After you download it, you just need to

  • 01:03:53 put it inside Models, inside LoRA, here. You see, this is where you need to put them.

  • 01:03:59 So it is getting downloaded, and done. Let's go to the generate tab. Let's turn

  • 01:04:04 off the refiner and let's set

  • 01:04:07 this to one. So I'm going to use the SDXL base model. You can select it from here or from here.

  • 01:04:13 Let's select SDXL base and say: a photo of a realistic image of a snake with

  • 01:04:20 dragon horns. A simple thing. Then let's go to the LoRAs tab, refresh, it's already here, and just

  • 01:04:26 click it, and you see this LoRA is now

  • 01:04:30 selected. You can also put it in the prompt by typing lora and the name of the LoRA.

  • 01:04:37 When I hit the space character, it is automatically recognized. So you can

  • 01:04:41 use this as the LoRA activation, or you can use the LoRA activation from here,

  • 01:04:47 which I prefer. Okay, it is activated, and then, if you want to set a strength, you

  • 01:04:52 see the strength of the LoRA is set here, like this. So let's make the strength

  • 01:04:57 1; it is enabled, and let's

  • 01:05:00 just generate. Yes, oh, by the way, the grid generator is still on. So just cancel it, go to

  • 01:05:05 Tools, and disable the grid generator from here too. Now it generates. Okay, now let's

  • 01:05:11 see the generation. We can see that it is loading the model right now, because we switched to the

  • 01:05:16 official SDXL base model, which we downloaded in the beginning, and it is generated. Okay, this is the

  • 01:05:23 image we got: a photorealistic image. Let's change the word photorealistic, because it is affecting the style. Let's try like

  • 01:05:30 this, and we can see the pixel art LoRA here. Okay, it applied the strength. It

  • 01:05:35 says LoRA 0 is here. So you can apply multiple LoRAs, and the weight, the power, of this LoRA

  • 01:05:41 was 1, and you see

  • 01:05:43 it has turned into pixel art. How can I be sure? I can use the same seed from here,

  • 01:05:49 disable the LoRA, and regenerate the image. This time I should get a

  • 01:05:56 different image, not pixel art, because the LoRA will not be applied. And this is the image without the

  • 01:06:02 LoRA. You see: a completely different image.

  • 01:06:05 So I can enable the LoRA again, and this is how it works. You can also

  • 01:06:10 enable multiple embeddings. There are also ControlNets, but for ControlNet I

  • 01:06:15 am planning another tutorial. The usage is the same. You

  • 01:06:19 can use SD 1.5 based models as well; you just need to change the resolution.

  • 01:06:24 I prefer using UniPC with all models. It works best in my opinion.

  • 01:06:30 You can always test samplers with the grid, and there is also an image to image tab. So you can also use

  • 01:06:35 image to image. For example, let's use image to image. Okay, let's convert one of the images

  • 01:06:41 into this pixel art style.

  • 01:06:42 Okay, let's convert this image. So, you see, there is use as init. When I click it,

  • 01:06:47 it will use this as

  • 01:06:49 an image-to-image input. This is 1536 × 1536. So let's convert this into pixel art. This is the denoising

  • 01:06:57 strength; let's make it 60 percent. Okay,

  • 01:07:00 it's fine. You see, there are other options as well: mask blur, mask shrink grow. You can click these to

  • 01:07:06 read about them. These are the default settings. Okay, let's make a test; I didn't do much testing with this.

  • 01:07:13 Generate. Now it will use this as the image-to-image input image and generate it (the sketch below shows what the strength means).
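A note on what that percentage does: init-image creativity (denoise strength) effectively decides how much of the sampling schedule actually runs on top of the noised input. A rough conceptual sketch, not SwarmUI's actual code:

```python
# With strength s and N sampler steps, img2img roughly noises the init
# image to level s and then denoises over the last s * N steps, so
# s = 0 returns the input unchanged and s = 1 ignores it completely.
def img2img_steps(total_steps: int, strength: float) -> list[int]:
    start = int(round(total_steps * (1.0 - strength)))
    return list(range(start, total_steps))

print(img2img_steps(40, 0.6))  # the last 24 of 40 steps are denoised
```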

  • 01:07:19 You can also use edit image. This is the equivalent of inpainting in the Automatic1111 Web UI. Okay, we got it. So

  • 01:07:26 this is how it turned out. We can increase the strength; let's make it 70%.

  • 01:07:31 And let's go to the server to see the speed of the image generation with the SDXL model.

  • 01:07:36 You see, it is 3.5 it per second, which is actually equal to my Automatic1111 Web UI.

  • 01:07:43 And yes, this is the image. We can even change it further, but it will become

  • 01:07:49 really, really different in that case. So let's try 80% perhaps. Maybe it is

  • 01:07:53 because the resolution is very big. No, it is actually resizing the resolution. Let's see. Yes, the resolution is

  • 01:08:00 resized. Let's try again with 80%. Okay, this is the image we got. It is becoming more

  • 01:08:06 like pixel art. We can also try this other stuff. Let's try this, for example. Let's

  • 01:08:11 generate. Okay, it didn't make any difference. There is unsampler prompt. Let's see what it

  • 01:08:16 does. This is powerful for controlled image editing. Yeah, it is not very important. Let's

  • 01:08:22 see mask behavior; maybe simple latent will do better. Okay, so this is how you use image to

  • 01:08:30 image. Let's try 90%, and yes, you see, now it is much more like pixel art. Of

  • 01:08:36 course, the image has changed, but yes, it is working. So this is how you use LoRAs.
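As mentioned above, you can also activate a LoRA by typing it directly into the prompt. As a rough illustration of that mechanism, here is a hypothetical parser for tags of the form `<lora:name:weight>`; the exact tag syntax SwarmUI accepts is documented in its prompt syntax page, and this sketch only shows the idea:

```python
import re

# Hypothetical prompt-tag parser: pull out <lora:name> or
# <lora:name:weight> tags (weight defaulting to 1.0) and return the
# cleaned prompt plus the list of LoRAs to apply.
def extract_loras(prompt: str) -> tuple[str, list[tuple[str, float]]]:
    loras: list[tuple[str, float]] = []

    def strip_tag(match: re.Match) -> str:
        name, _, weight = match.group(1).partition(":")
        loras.append((name, float(weight) if weight else 1.0))
        return ""

    clean = re.sub(r"<lora:([^>]+)>", strip_tag, prompt).strip()
    return clean, loras

print(extract_loras("a snake with dragon horns <lora:pixel-art-xl:1.0>"))
# -> ('a snake with dragon horns', [('pixel-art-xl', 1.0)])
```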

  • 01:08:43 When you click edit image, this is the inpainting screen. Now, the

  • 01:08:47 inpainting is the weakest part of Stable Swarm UI, but it is being

  • 01:08:52 developed and improved at an amazing speed. There is a programming genius behind this Web UI, and he is working

  • 01:09:00 relentlessly. Actually, he didn't sleep for something like 30 hours after the SD3 release. So this is the

  • 01:09:06 inpainting. Mask the area wherever you want a change. You see, there is a radius and an opacity.

  • 01:09:12 I'm going to set the radius like this, and you can also see the opacity of the masking like this. The

  • 01:09:17 opacity defines how visible the mask is, from here, and the radius defines the radius of the masking. You

  • 01:09:25 see, the mask is generated here. You can just delete this layer and make another

  • 01:09:30 mask. The opacity doesn't matter, but the radius matters for how much you are masking.

  • 01:09:35 So let's make it like this, and let's see. I'm going to mask this area, and

  • 01:09:41 I'm going to use the prompt viper snake tail. The LoRA is enabled. Now, how much

  • 01:09:47 change do I want? Let's set it to 70%. I disabled all of these options. This

  • 01:09:53 matters: if you change them, it changes how the image is inpainted. I'm going to use the default

  • 01:10:00 settings and hit generate. Let's see what we are going to get. By the way, I am now sorting

  • 01:10:06 the generations by date, so I will see the last one in here. This is inpainting,

  • 01:10:12 and the new image has arrived here. When I click it, you see, this is the new image.

  • 01:10:18 And which was the original image? Let's see. OK, yes, this was the original image,

  • 01:10:25 and this is the inpainted image. Of course, this wasn't a big inpaint,

  • 01:10:30 but this is how you inpaint parts. You can also use the segmentation feature to inpaint. It is amazing;

  • 01:10:38 it can actually mask automatically. Let's try a better image for a segmentation test. Let's say this. Yes.

  • 01:10:45 Let's click reuse parameters. Actually: a cat, blue, plus segment eyes, and let's make the eyes a different color.

  • 01:10:54 Yellow cat eyes. Okay, and everything is set. Let's generate. Okay, the

  • 01:11:00 pixel art LoRA is also still selected. Let's disable it, because it is going to load Stable

  • 01:11:04 Diffusion 3, and let's generate. Okay, it is generating the original image; then

  • 01:11:09 it is supposed to inpaint the eyes with segmentation. Yes, the first image is generated.

  • 01:11:16 You see, it segmented the eyes, and it is inpainting. Okay, the inpainting is

  • 01:11:21 happening here. Oh, it is generating three images, not inpainting. Let's see. Okay, it did segment the eyes,

  • 01:11:30 but I don't see any difference. Weird. What can we do? Let's read the documentation again.

  • 01:11:36 It says this is like restore faces. Okay, it says that there are creativity and threshold parameters.

  • 01:11:41 Yeah, let's try the creativity and threshold parameters as well. Maybe we need to provide them,

  • 01:11:46 like this: 80% and 0.5. This is for the segmenting. Let's set the number of images. Okay, let's generate.

  • 01:11:55 Now it will mask. Yes. Okay, now I can see it is changing the eyes. You see.

  • 01:12:01 Okay, we typed yellow, so it is generating the same color. So let's make this not yellow but blue cat eyes.

  • 01:12:08 Let's generate again. You see, it skipped the initial image generation, because it was already

  • 01:12:13 generated, and it went directly to the eyes. Okay. Yeah. It didn't change much. Let's make the strength 1.

  • 01:12:20 Let's see the difference. Okay. Yeah. It is regenerating. Weird; the prompt is not working as we expected.

  • 01:12:27 Let's look at the documentation one more time.

  • 01:12:30 Okay, the segment text is here. Weird: it draws the eyes, but it is not working as expected.

  • 01:12:38 Okay, the reason this didn't work is that there was a bug. I reported this bug in their official

  • 01:12:46 channel, and the developer already fixed it and pushed it to the repository. That is why it is super

  • 01:12:54 important for you to update the application before you use it. So how

  • 01:13:00 are we going to update this application? First of all, close the CMD window that you have,

  • 01:13:05 then go back to your installation folder, enter it, and you will see the update Windows .bat

  • 01:13:12 file here. Double-click it, and it will update the application. You see, the update is happening

  • 01:13:19 right now. It is pulling all the new changes, and the build succeeded. So the update has been completed,

  • 01:13:25 and it will automatically close the window. Then let's restart the application

  • 01:13:30 from here. So this bug is fixed. Let's reuse parameters from here. And which new feature

  • 01:13:38 has arrived? In the regional prompting section there is a new parameter, which is segment threshold max.

  • 01:13:45 You see: the maximum mask match value of the segment before clamping. I will enable this and make it

  • 01:13:50 0.60. Then I will regenerate the image, and let's see what we are going to get.

  • 01:13:58 It is loading the model, since we

  • 01:14:00 restarted the application. There is also segment mask grow: you see, you can also grow the mask of

  • 01:14:05 the segmentation. There is also segment mask blur. And save segment mask lets you see the

  • 01:14:12 segmented mask, to understand how your segmentation is working. I will actually do that in the next

  • 01:14:18 generation. Okay, now it is inpainting, and now, yes, I can see that it is changing the eye color

  • 01:14:26 as expected. Not perfectly, but it is changing, unlike

  • 01:14:30 before. Actually, I spent a lot of time on this yesterday, and you see, now it is like

  • 01:14:35 this. So we probably need to do a few more things. Let's also save the segmentation mask to see

  • 01:14:42 what kind of mask we are getting, let's also reduce this threshold value to 30 percent, and

  • 01:14:49 let's change the eye color to red eyes, and generate. Okay, now we can see the segmentation

  • 01:14:56 mask. It is looking accurate; you see, both of the eyes are

  • 01:15:00 masked when we reduced the threshold to 30 percent. This is amazing. You know, in the Automatic1111

  • 01:15:07 Web UI we cannot do this directly, and yes, the eye colors are changed, as you are seeing. Let's

  • 01:15:13 say glowing red cat eyes. Another one. This is just mind-blowingly amazing, and we can see how it is

  • 01:15:20 changing. By the way, once I added red cat eyes, you see, it changed

  • 01:15:25 too much, so I need to mask only the eyes and

  • 01:15:30 change my prompt. So probably I need to use something like glowing red eyes and nothing else.

  • 01:15:36 Maybe we can change this value to something like 0.5 and try again. So play with these

  • 01:15:42 values until you get a satisfactory result. This is how you segment a certain part with just the

  • 01:15:49 prompt and change

  • 01:15:52 that part. Moreover, what we can do is use this in image to image or inpainting. Let's

  • 01:15:59 use it in inpainting. Okay, so I click this image, I say edit image, and in here

  • 01:16:05 I will reduce the editing power to zero. You see, the init image is selected. I will make

  • 01:16:11 the init image creativity zero, so only the eyes will change. Okay, let's say like

  • 01:16:18 this. Why am I doing this? Because this way I can generate the eyes with different seeds, and

  • 01:16:24 this should work, by the way. Let's see if it will look natural. Okay.

  • 01:16:30 It is getting generated, and yes, you see, it changed only the eyes, according to my prompt.

  • 01:16:37 This is how you can also inpaint your existing images quickly, with just prompting.

  • 01:16:44 This is just a mind-blowingly amazing feature.

  • 01:16:46 So I can just make the seed random, generate multiple images, and pick the very best one. Let's see.

  • 01:16:53 So what you need to play with are these values. You see: segment threshold max.

  • 01:16:58 You can also try segment mask grow and segment mask blur and see what works best (the sketch below shows what these knobs mean conceptually).
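My reading of those three knobs, as a conceptual sketch with made-up values (an illustration, not SwarmUI's actual code): threshold turns the model's soft match map into a binary mask, grow dilates that mask outward, and blur feathers its edges.

```python
import numpy as np
from scipy import ndimage

match = np.random.rand(64, 64)   # stand-in for a soft "eyes" match map

threshold = 0.30                 # segment threshold: lower = larger mask
mask = match >= threshold

grow_pixels = 4                  # segment mask grow: expand the region
mask = ndimage.binary_dilation(mask, iterations=grow_pixels)

blur_sigma = 2.0                 # segment mask blur: feather the edges
soft_mask = ndimage.gaussian_filter(mask.astype(float), sigma=blur_sigma)
print(soft_mask.shape, float(soft_mask.max()))
```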

  • 01:17:04 There is also a segmentation model you can select here. The segmentation model will be used when you use this feature.

  • 01:17:10 Currently, we have no segmentation model. When you click here, you can see: optionally specify a distinct

  • 01:17:16 model, like a YOLO-style model. There is also a regional object inpainting model.

  • 01:17:21 You know, there are a lot of things, and you can join the Discord channel of Swarm

  • 01:17:27 UI and ask there. The community is amazing; the developer is amazing. We are getting different images right now.

  • 01:17:34 When I refresh here, I can see the generated images. The masks are also saved, you see,

  • 01:17:40 because we set save segmentation mask. So these are the images. Yes, it is not perfect;

  • 01:17:45 I need to play with the parameters more, but now it is working.

  • 01:17:49 Okay, this one is looking perhaps the best among them. Okay, there is another one getting generated right now.

  • 01:17:55 The beauty of using edit image inpainting is that we can generate different images with different seeds;

  • 01:18:03 therefore, we can get a better image. Okay, this one is looking... yes, very cool.

  • 01:18:08 You see, so this is how you can inpaint.

  • 01:18:11 Another very crucial piece of information that I am going to show you regarding upscaling is the use of tiling.

  • 01:18:18 I asked the developer of Swarm UI about this, and the Stable Diffusion 3 architecture is currently

  • 01:18:25 incapable of upscaling images the way we do with

  • 01:18:30 Stable Diffusion 1 or Stable Diffusion XL. That is, it is not able to generate images at a higher

  • 01:18:36 resolution than it was trained on. What does this mean? You see, this is a raw

  • 01:18:41 image that was generated with these parameters. The image resolution is 1344 × 768, and when I

  • 01:18:51 upscale it without using tiling, what happens? Let me show you. I'm going to upscale it to 1.5

  • 01:19:00 times with a 50% denoise (refiner control percentage), I will not do tiling, and I will use this

  • 01:19:08 refiner upscale method. I think this is the very best upscale method: 4xLSDIRplusC. It is from

  • 01:19:16 the repository that I showed you. Let's generate this image this way, and I will

  • 01:19:20 show you the effect of upscaling this image with Stable Diffusion 3 without tiling, and what

  • 01:19:26 comes of it. Okay, we've forgotten the refiner, so

  • 01:19:29 enable it and generate. Don't forget to enable it; this is a common mistake. So this

  • 01:19:34 was the original image, and this is the upscaled image.

  • 01:19:37 You see, at the corners of the image, it is now blurred. This is because Stable

  • 01:19:43 Diffusion 3 is not able to generate images bigger than

  • 01:19:48 its trained resolution. You will notice this error if you don't

  • 01:19:53 do tiling. And what happens when we do tiling? Let's use tiling;

  • 01:19:57 we may get seams. And what do seams mean?

  • 01:20:00 I don't know if we will get them in this image, but I have an image here that I

  • 01:20:05 can show you, and when you look carefully, you will notice there is another head

  • 01:20:11 here. You see, the head is here, but there is also another head here, and there is

  • 01:20:15 another, deformed head appearing here. These are called seams. This happens

  • 01:20:21 when you use a refiner control percentage around 50%. So how are we going to fix this?

  • 01:20:30 I will also show you that after this is completed. You see, when you do tiling, it splits the image

  • 01:20:36 into tiles, upscales each part individually, then merges all of the parts, and the image

  • 01:20:42 is generated. Now I don't see that previous error. And in this image, what problem do we have?

  • 01:20:49 You see, the text is broken, like here. In this image we didn't get as much as in the other image,

  • 01:20:56 but this is the error. So what you can do is reduce the

  • 01:21:00 refiner control percentage to something like 30% or 35%. Let's try 35%, and let's also try 30%. Okay. You see, you

  • 01:21:09 are able to queue different generations, so you can just change the parameters and hit generate, and

  • 01:21:16 they will get added to the queue like this. So this is another amazing feature. Currently 2 running,

  • 01:21:21 1 queued. Moreover, if you have multiple GPUs, you can go to Server, Backends, and in here you can

  • 01:21:28 define multiple GPUs.

  • 01:21:30 You see, currently GPU id 0 is used, so what you need to do is add another

  • 01:21:36 backend. So I will add another ComfyUI self-starting backend. You see, this was the original, and I am

  • 01:21:44 going to copy the parameters here. Okay: no extra arguments, auto update, enable

  • 01:21:50 previews, GPU id 1. Yes, it is looking good; save. So it is going to start another backend

  • 01:21:58 on my second GPU, which I should be able to see here. Let's also check the CMD window;

  • 01:22:05 okay, it says that it has started. So now my second GPU should also be used.

  • 01:22:10 It is currently being used by the video recording, but I think it will be available for the next generation.

  • 01:22:17 Let's go back to Generate. Okay, all three images are generated. Let's look at them.

  • 01:22:22 So the error is still visible here, here, and here as well. Was it visible in the original image?

  • 01:22:29 That also matters, so we need to check that too. Let's go to the top and refresh. And

  • 01:22:35 let's see. Okay, I think, yeah, this was the original image. So it was also visible in the original image;

  • 01:22:41 it is not something new. But you can see how the images are generated and how

  • 01:22:46 much effect the denoise has. You see, I think this one is looking the best. And this one

  • 01:22:52 has... let's see the parameters. Okay, this one has refiner control

  • 01:22:59 percentage 30%. So this was the lowest denoise strength; refiner control percentage is the

  • 01:23:07 Automatic1111 denoise equivalent. This way you can experiment. Since we

  • 01:23:12 started another instance, now let's see if it will run on both of them. Okay, let's generate

  • 01:23:18 at 25% and let's generate at 40%. So we have two current generations and two running. That

  • 01:23:25 means it will run on both of the GPUs. Okay. No, it didn't start yet.

  • 01:23:31 I think it is loading onto the second GPU right now. Yeah, probably. Let's just see what happens.

  • 01:23:37 Maybe we need to restart Stable Swarm UI entirely.

  • 01:23:40 I am not sure, but this is the way to set up multiple GPUs.

  • 01:23:45 By the way, it will not combine the power of the two GPUs;

  • 01:23:49 it will just queue each generation on one GPU or the other. So let's see the view logs.
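Before we check the logs, a toy model of that scheduling (purely illustrative, not SwarmUI's code): the second backend adds throughput, not per-image speed.

```python
from queue import Empty, Queue
from threading import Thread

# Two GPU-bound backends pull jobs from one shared queue: each image
# still renders on a single GPU, but two images run in parallel.
jobs: Queue = Queue()
for i in range(4):
    jobs.put(f"generation-{i}")

def backend(gpu_id: int) -> None:
    while True:
        try:
            job = jobs.get_nowait()
        except Empty:
            return
        print(f"GPU {gpu_id} running {job}")

workers = [Thread(target=backend, args=(gpu,)) for gpu in (0, 1)]
for w in workers:
    w.start()
for w in workers:
    w.join()
```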

  • 01:23:56 Okay, I guess the logs for this one are not ready. Okay, it says self-starting running backend one. Yes, now

  • 01:24:03 I can see it; it started, and these are the second

  • 01:24:05 GPU's logs. Let's go to the backend. Okay, this one is generated. Okay, let's make another try with,

  • 01:24:12 for example, this image. Let's reuse parameters. It is using tiling and 30 percent. Let's make this

  • 01:24:20 35 percent, generate, and let's make it 40 percent and generate. Now 4 generations are running and

  • 01:24:28 14 are queued, because it is generating 20 images. And let's go to the server backend. Okay, let's update it. Yes,

  • 01:24:36 now it says loading one new model, loading in low VRAM mode. Why? Because my second GPU has

  • 01:24:42 lower VRAM. And now,

  • 01:24:43 yes, I can see it. So it queued operations on both of the GPUs, and

  • 01:24:50 it is generating on both of them. I am going to cancel, because I don't want my video

  • 01:24:55 frames to be dropped, since I am running the video recording on the second GPU. Okay, let's just

  • 01:25:00 cancel all the operations. Yes, all the operations are canceled. This is how you use multiple GPUs

  • 01:25:05 as well. One other very important thing is color saturation. You see, these colors are extremely

  • 01:25:13 saturated. So when you get such saturated colors, what can you do? You can reduce the CFG

  • 01:25:21 scale. Okay, so I clicked reuse parameters from here, and

  • 01:25:25 I am going to reduce the CFG scale to 5, and let's see the difference.

  • 01:25:30 Okay, let's hit generate. This is all Stable Diffusion 3 experimentation that I am doing

  • 01:25:35 right now. So this is for Stable Diffusion 3; it depends on each model. Moreover, when you reduce the

  • 01:25:41 CFG scale, the prompt following will also be reduced. So a higher CFG scale is better for following the prompt,

  • 01:25:51 but this is the trade-off we have. Okay, it is also using my second GPU right

  • 01:25:55 now, since I set dual GPUs from the server

  • 01:25:59 configuration, from the backends. So it is generating two images at the same time right now on

  • 01:26:06 both of my GPUs. Okay, the first image is generated. Let's refresh. You see, the color

  • 01:26:12 saturation is much reduced. However, the prompt following became much worse, so

  • 01:26:18 there is a trade-off between the two, but as you generate more images, you can still get very

  • 01:26:24 good images. You see, the other image, with a different seed, looks much better.

  • 01:26:30 I'm just waiting for it. So you can reduce the CFG scale, generate multiple images, and still get

  • 01:26:37 very accurate prompt-following images, because Stable Diffusion 3 has very powerful text encoders: CLIP + T5. Okay,

  • 01:26:49 the second image is also generated, and this is the image. Not the very best image,

  • 01:26:54 but it follows the prompt. You understand the logic. Let's cancel this. So what are the

  • 01:27:00 best settings for Stable Diffusion 3 that I have found? I think CFG scale 7 works fine,

  • 01:27:05 but in some cases it may oversaturate the image. I'm using 40 steps, and for sampling I am using

  • 01:27:13 UniPC, scheduler normal, text encoders CLIP + T5. For refiner control percentage, I use 30 percent. I

  • 01:27:21 use refiner steps 40, refiner method post apply, refiner upscale 1.5, and tiling,

  • 01:27:30 and as the upscaler model, I am using this one.

  • 01:27:34 But you can still compare all of them to see which one you like most (the settings are collected below).
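For reference, here are those suggested SD3 settings collected in one place, written as a plain Python dict. The key names are my shorthand, not SwarmUI's internal parameter names, and "this one" refers to the 4xLSDIRplusC upscaler mentioned earlier:

```python
# Suggested SD3 starting point from the video, as shorthand key/values.
sd3_settings = {
    "cfg_scale": 7,                      # drop toward 5 if colors oversaturate
    "steps": 40,
    "sampler": "UniPC",
    "scheduler": "normal",
    "text_encoders": "CLIP + T5",
    "refiner_control_percentage": 0.30,  # the denoise-strength equivalent
    "refiner_steps": 40,
    "refiner_method": "post apply",
    "refiner_upscale": 1.5,
    "refiner_do_tiling": True,
    "refiner_upscale_method": "4xLSDIRplusC",
}
```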

  • 01:27:39 Now I will show you one other amazing quick feature, which is applying an upscale preset to your existing images.

  • 01:27:48 So let me show you my preset. This is my upscale preset.

  • 01:27:51 It also has the initial image generation resolution. You can disable that, but it enables the refiner model upscale.

  • 01:28:00 So, since I have used the same generation resolution, what I'm going to do

  • 01:28:06 is this: I go to the image history, and these are my raw images. Let's

  • 01:28:11 say I liked this image and I want to upscale it. So I go to the presets,

  • 01:28:15 select my preset, and just hit generate, and it will regenerate this image and

  • 01:28:21 upscale it. So this is another very convenient way of using presets: to upscale

  • 01:28:27 your liked images quickly. You don't need

  • 01:28:30 to upscale everything; you can select the ones you like and then upscale them this way very

  • 01:28:35 easily. So we have shown a lot of stuff. There is also a LoRA extractor and pickle to safetensors; you see,

  • 01:28:41 you can convert models. And there is CLIP tokenization. This is also very useful, because some people use

  • 01:28:47 rare-looking tokens when doing training, like "my awesome model"; they type it like this. However, this is

  • 01:28:55 actually a lot of tokens. When you type it here, you

  • 01:28:59 will see that it is actually three tokens: my, awesome, model. Or they use some random

  • 01:29:04 string like this; you see, each one splits into different tokens.

  • 01:29:09 That is why we use a rare token such as OHWX.

  • 01:29:13 This is a rare token, ID 48993, and it's a single token.
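If you want to check token counts yourself outside the UI, here is a quick sketch using the transformers package's CLIP tokenizer (an assumption for illustration; the tokenizer utility in the UI does the same job interactively, and the specific ID below comes from the video):

```python
from transformers import CLIPTokenizer

tok = CLIPTokenizer.from_pretrained("openai/clip-vit-base-patch32")

for text in ["my awesome model", "ohwx"]:
    ids = tok(text, add_special_tokens=False)["input_ids"]
    print(f"{text!r} -> {len(ids)} token(s): {ids}")

# "my awesome model" splits into several tokens, while the video shows
# "ohwx" mapping to a single rare ID (48993 in the UI's tokenizer view).
```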

  • 01:29:20 You see, the letter "1" is token 272, and the letter "a" is token 320.

  • 01:29:26 Usually, the bigger the token ID,

  • 01:29:30 the rarer the token; that is expected. Okay, I think we have shown everything. This is it

  • 01:29:37 for today. In the Stable Swarm UI GitHub repository, you will find their

  • 01:29:42 Discord link. Just search for Discord, and this is their Discord link. You can join and

  • 01:29:48 chat with them. You see, this is their Discord. I am also active in their

  • 01:29:52 Discord. You should join our Discord channel as well. You will see the Discord

  • 01:29:56 channel here in this post. The link to this post will be

  • 01:30:00 in the description of the video. You can also go to our Patreon exclusive index and read

  • 01:30:05 our Patreon posts and our scripts. Please also star our repository: go to our repository

  • 01:30:12 from this link, star it, fork it, and watch it. You can also sponsor me. Hopefully,

  • 01:30:18 I will see you in future amazing videos. Ask me any question that you have, and I will try to answer

  • 01:30:23 them. This Web UI is just amazing. Hopefully, I will also look at the other

  • 01:30:30 features it has and make further tutorials for this amazing Web UI. You see, there is

  • 01:30:36 even regional prompting and other stuff. But for today, this is it. Hopefully, see you later.
