Zero to Hero Stable Diffusion 3 Tutorial with Amazing SwarmUI SD Web UI that Utilizes ComfyUI
Full tutorial link > https://www.youtube.com/watch?v=HKX8_F1Er_w
Do not skip any part of this tutorial to master how to use Stable Diffusion 3 (SD3) with SwarmUI, the most advanced open-source generative AI app. Automatic1111 SD Web UI and Fooocus do not support #SD3 yet, so I am starting to make tutorials for SwarmUI as well. #StableSwarmUI is officially developed by StabilityAI, and your mind will be blown after you watch this tutorial and learn its amazing features. StableSwarmUI uses #ComfyUI as the back end, so it has all the good features of ComfyUI while bringing you the easy-to-use features of the Automatic1111 #StableDiffusion Web UI with them. I really liked SwarmUI and am planning to do more tutorials for it.
🔗 The Public Post (no login or account required) Shown In The Video With The Links ➡️ https://www.patreon.com/posts/stableswarmui-3-106135985
00:00:00 Introduction to the Stable Diffusion 3 (SD3) and SwarmUI and what is in the tutorial
00:04:12 Architecture and features of SD3
00:05:05 What each of the different Stable Diffusion 3 model files means
00:06:26 How to download and install SwarmUI on Windows for SD3 and all other Stable Diffusion models
00:08:42 What kind of folder path you should use when installing SwarmUI
00:10:28 How to notice and fix errors if you get them during installation
00:11:49 Installation is complete; how to start using SwarmUI
00:12:29 Which settings I change before starting to use SwarmUI and how to change your theme (dark, white, gray)
00:12:56 How to make SwarmUI save generated images as PNG
00:13:08 How to find the description of each setting and configuration
00:13:28 How to download the SD3 model and start using it on Windows
00:13:38 How to use the model downloader utility of SwarmUI
00:14:17 How to set models folder paths and link your existing models folders in SwarmUI
00:14:35 Explanation of Root folder path in SwarmUI
00:14:52 Do we need to download the VAE of SD3?
00:15:25 The Generate and Models sections of SwarmUI, and how to select your base model
00:16:02 Setting up the image generation parameters and what they do
00:17:06 Which sampling method is best for SD3
00:17:22 Information about SD3 text encoders and their comparison
00:18:14 First time generating an image with SD3
00:19:36 How to regenerate the same image
00:20:17 How to see the image generation speed, step speed, and more information
00:20:29 Stable Diffusion 3 it/s speed on an RTX 3090 TI
00:20:39 How to see VRAM usage on Windows 10
00:22:08 Testing and comparing different text encoders for SD3
00:22:36 How to use the FP16 version of the T5-XXL text encoder instead of the default FP8 version
00:25:27 The image generation speed when using the best config for SD3
00:26:37 Why the VAE of SD3 is many times better than previous Stable Diffusion models: 4 vs 8 vs 16 vs 32 channel VAEs
00:27:40 How and where to download the best AI upscaler models
00:29:10 How to use refiner and upscaler models to improve and upscale generated images
00:29:21 How to restart and start SwarmUI
00:32:01 The folders where the generated images are saved
00:32:13 Image history feature of SwarmUI
00:33:10 Upscaled image comparison
00:34:01 How to download all upscaler models at once
00:34:34 Presets feature in depth
00:36:55 How to generate forever / infinite times
00:37:13 Issues caused by non-tiled upscaling
00:38:36 How to compare tiled vs non-tiled upscaling and decide which is best
00:39:05 275 SwarmUI presets (cloned from Fooocus) I prepared and the scripts I coded to prepare them and how to import those presets
00:42:10 Model browser feature
00:43:25 How to generate TensorRT engine for huge speed up
00:43:47 How to update SwarmUI
00:44:27 Prompt syntax and advanced features
00:45:35 How to use Wildcards (random prompts) feature
00:46:47 How to see full details / metadata of generated images
00:47:13 Full guide for extremely powerful grid image generation (like X/Y/Z plot)
00:47:35 How to add all the downloaded upscalers from the zip files
00:51:37 How to see what is happening in the server logs
00:53:04 How to continue grid generation process after interruption
00:54:32 How to open grid generation after it has been completed and how to use it
00:56:13 Example of tiled upscaling seaming problem
01:00:30 Full guide for image history
01:02:22 How to directly delete images and star them
01:03:20 How to use SD 1.5 and SDXL models and LoRAs
01:06:24 Which sampler method is best
01:06:43 How to use image to image
01:08:43 How to use edit image / inpainting
01:10:38 How to use amazing segmentation feature to automatically inpaint any part of images
01:15:55 How to use segmentation on existing images for inpainting and get perfect results with different seeds
01:18:19 More detailed information regarding upscaling and tiling and SD3
01:20:08 Seams: a full explanation and example, and how to fix them
01:21:09 How to use queue system
01:21:23 How to use multiple GPUs by adding more backends
01:24:38 Loading a model in low VRAM mode
01:25:10 How to fix color over-saturation
01:27:00 Best image generation configuration for SD3
01:27:44 How to apply upscale to your older generated images quickly via preset
01:28:39 Other amazing features of SwarmUI
01:28:49 CLIP tokenization and the rare token OHWX
00:00:00 Greetings everyone. In this massive tutorial I am going to show you how to install
00:00:05 Stable Swarm UI and start using Stable Diffusion 3.
00:00:09 Stable Swarm UI is officially developed by Stability AI and the developer is just amazing.
00:00:15 I will show how to download and start using Stable Diffusion 3.
00:00:20 I will show you advanced features of Stable Swarm UI, like segmentation and automatically inpainting any part of an image
00:00:27 with just prompting and automatic masking. I will show you the very best configuration for how to
00:00:33 use Stable Diffusion 3. I will show
00:00:35 you the Wildcard feature of Stable Swarm UI. I will show you how to use LoRAs with Stable Swarm UI.
00:00:41 I will show you the amazing grid generator feature of Stable Swarm UI. It is many times better
00:00:46 than the X/Y/Z plot of Automatic1111 Web UI. You will be amazed after you see the grid feature
00:00:54 of Stable Swarm UI. I will show you how to use the model downloader to automatically download from
00:01:00 CivitAI or from Hugging Face. I will show you how to use multiple GPUs at the same time
00:01:05 if you have them. I will show you the amazing image history feature of Stable Swarm UI.
00:01:11 It is just mind-blowing. You will see it. I will show you how to use the image-to-image feature
00:01:15 of Stable Swarm UI, starting from an init image. I will show you how to do inpainting
00:01:21 in Stable Swarm UI from an init image. I will show you the model browser feature of Stable
00:01:26 Swarm UI. It is just amazing. You will see it. I will show you how
00:01:30 to download the very best upscaler models and use them in your workflow.
00:01:35 Moreover, I will show you the very best upscaling configuration for Stable Diffusion 3.
00:01:40 It is a little bit different from Stable Diffusion XL or Stable Diffusion 1.5.
00:01:45 I will show you the advanced prompt syntax of Stable Swarm UI. It is very powerful.
00:01:50 I will give you some information regarding the model structure of Stable Diffusion 3:
00:01:56 information regarding each one of the Stable Diffusion 3 model
00:02:00 files and the text encoders. I will also do a comparison of them. I will show you how to
00:02:06 contact the developer of Stable Swarm UI, ask questions, and how he fixes Stable Swarm
00:02:13 UI immediately. This tutorial will be on Windows. I am also going to produce tutorials for cloud
00:02:19 services for those who don't have GPUs in their computers. However, you still need to watch
00:02:24 this tutorial fully to learn how to use Stable Swarm UI. Moreover, the optimization of
00:02:30 Stable Swarm UI is just mind-blowing. Let me show you before we begin. Currently, this is my VRAM
00:02:36 usage. Because the models are loaded, let's generate images with the very best configuration
00:02:41 of Stable Diffusion 3 by using both of the text encoders and see the VRAM usage. You see,
00:02:47 it runs on GPUs with under 6 GB of VRAM as well. So if you have a 6 GB or
00:02:52 better GPU in your computer, you can use Stable Diffusion 3 with the Stability AI developed Stable Swarm UI.
00:02:59 It just works amazingly. If you have a better GPU, of course, it will be faster and it will use more VRAM,
00:03:05 but it works even on GPUs with under 6 GB of VRAM when the best configuration is used,
00:03:12 because at the backend Stable Swarm uses ComfyUI. I will also show you how
00:03:17 to get more information in the logs panel.
00:03:20 As a final thing, I suggest you not skip any part of this video. Every part
00:03:26 of this video will be super important. I will show additional features as well,
00:03:30 like the queue system. So do not skip any part. Even if you are going to use this
00:03:36 application on cloud services, you still need to watch this video to learn how to
00:03:40 use Stable Swarm UI. So I have prepared this amazing public post where all of the
00:03:46 links for this tutorial will be. This post will get updated. Moreover, this is a
00:03:52 public post, so you can see its content without even being
00:03:58 a free member of my Patreon account.
00:04:01 It is easier for me to manage, so I am keeping the post details here. Before we begin
00:04:06 installing Stable Swarm UI, I want to show you some of the features of Stable Diffusion 3.
00:04:12 So you see, there is the Stable Diffusion 3 official page link here. Let's open it. This is the link.
00:04:17 This is the Stable Diffusion 3 medium model on Hugging Face. When we scroll down a little bit,
00:04:23 we can see the model architecture of Stable Diffusion 3. What is different with Stable Diffusion 3? You see,
00:04:30 it uses 3 text encoders: CLIP-G, CLIP-Large, and T5. The power of Stable Diffusion 3 comes from T5-XXL.
00:04:41 It also has a better VAE. Moreover, the U-Net is now replaced by multiple MM-DiT blocks, which are Multimodal
00:04:50 Diffusion Transformer blocks. Moreover, when you click files and versions in this repository, you
00:04:56 will get to this page. Now, this page is important to understand.
00:05:00 You will see that there are 4 different safetensors files: the medium safetensors, the including-Clips safetensors,
00:05:08 the including-Clips-and-T5 fp16 version safetensors, and the fp8 version safetensors files.
00:05:16 So what are these files? The medium safetensors file is the raw model of the Stable Diffusion 3 medium model.
00:05:24 What does it mean? It means that it only contains the MM-DiT blocks, the Multimodal Diffusion Transformer blocks, and also the VAE.
00:05:34 So it doesn't include any of the text encoders.
00:05:37 The including-Clips version also includes the CLIPs, which are text encoders.
00:05:43 When we go to the text encoders folder here, you will see each text encoder individually: CLIP-G, CLIP-Large, T5-XXL.
00:05:52 So that is the model that contains both the CLIP-G and CLIP-Large models.
00:05:57 There is also an including-Clips-plus-T5-XXL fp16 version. That model includes those two plus T5-XXL fp16, and
00:06:07 these are the individual files. However, for this tutorial we are not going to
00:06:11 download any of these. We are only going to download the SD3 medium safetensors file,
00:06:16 and the rest will be automatically downloaded by Swarm UI. I just wanted to
00:06:20 give you some extra information. So how are we going to install this amazing Stable
00:06:25 Swarm UI? When you click this link, you will get to the official repository
00:06:30 page of Stable Swarm UI. Installation on Windows is so easy. I will also make another tutorial
00:06:37 on how to install and use it on cloud services if you don't
00:06:42 have a powerful GPU. I will show you Massed Compute, RunPod, and a free Kaggle account if it works,
00:06:49 but that will be in another video.
00:06:51 Still, you need to watch this tutorial to learn how to use this amazing Web UI application.
00:06:57 So, on this repository page, go to the installing on Windows section here. For this to work,
00:07:05 we need to have Git and
00:07:07 .NET 8 installed first. So click these links. You will see Git for Windows. Just download it and then next,
00:07:14 next, next, that is it; you don't need to do anything else. And downloading .NET version 8
00:07:19 is important. You see, there are different installers for Linux, macOS, Windows, and all.
00:07:26 The one you need for this is the Windows
00:07:30 x64 installer. So click this Windows x64 link here, you will get to this page, and your download should
00:07:38 have started. If it didn't start, you can click this direct link to download it, then open it
00:07:45 from your downloads folder. You will get to this page; click install, it will ask you for permission,
00:07:50 click Yes, then it will install, and that's it. You don't need to have Python installed for
00:07:55 Stable Swarm UI, because Stable Swarm UI works with
00:08:00 ComfyUI at the backend and installs an isolated Python version to work with it. So it installs a
00:08:08 portable Python version, and you don't need a system-wide Python installation for Stable Swarm UI.
00:08:15 Since I already had it installed, it reinstalled; close, and we are ready once you have installed
00:08:22 Git and .NET 8. To verify that Git is installed, just open a CMD window and
00:08:28 type git; you should get a message like this.
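If you prefer to verify both prerequisites from a script instead of typing the commands manually, here is a minimal sketch of my own (not part of SwarmUI); `git --version` and `dotnet --list-sdks` are the standard version probes for these tools:

```python
import shutil
import subprocess

# Check that both SwarmUI prerequisites are reachable from PATH.
for tool, probe in (("git", ["git", "--version"]), ("dotnet", ["dotnet", "--list-sdks"])):
    if shutil.which(tool) is None:
        print(f"{tool} not found - install it before running the installer bat file")
    else:
        result = subprocess.run(probe, capture_output=True, text=True)
        print(result.stdout.strip())
```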
00:08:33 Okay, then all you need to do is download this
00:08:35 bat file. Click here to download it, then copy that bat file (right-click and copy, or cut),
00:08:42 then move into the disk drive where you want to install. I will install it into my R drive.
00:08:47 Do not install it into downloads, music, documents, or users; install it directly into a drive.
00:08:54 Also do not install it onto cloud drives like OneDrive. Then, in here,
00:09:00 make a new folder where you want to install. When making the new folder, do not use space characters.
00:09:06 So I am going to name it SwarmUI like this, enter that folder, and paste the
00:09:13 install Windows.bat file here. You see, my folder name doesn't have any space characters, and it is directly
00:09:19 inside this drive. You should follow these instructions, then double-click the bat file.
00:09:25 You will likely get to this screen. Click more info and run anyway. Then it
00:09:29 will clone and install everything automatically for you. You don't need to do anything else.
00:09:36 It will start an installation webpage like this. You see, the Stable Swarm UI installer. Click agree. Click customize settings.
00:09:43 Click next. You can select your theme. I am going to use modern light for this tutorial. Click next.
00:09:51 Then don't change anything here. Click next. Okay, in here we are going to use ComfyUI local. Click next.
00:09:58 You can pick the models that you want.
00:10:01 You don't need to download anything if you already have the models in another folder.
00:10:06 But I am going to download the Stable Diffusion XL 1.0 base model for this tutorial. Click next.
00:10:12 And then, yes I am sure, install now.
00:10:14 So it is going to install ComfyUI as a backend automatically, into an isolated Python environment.
00:10:21 And these are the other settings. You can change all of the settings later.
00:10:25 You can see the progress here. It is an amazing installer, by the way.
00:10:28 During the installation, if you get this error, it means that there is a problem with the remote server. So you
00:10:36 can restart your computer, reset your internet modem, and restart the installation. This is
00:10:42 an internet-related problem. So therefore, I have closed the installation and I am restarting
00:10:49 it. I also deleted the already generated folders. This time, I have enabled
00:10:55 Warp VPN. This is the VPN of Cloudflare, a free VPN,
00:11:00 and my installation became many times faster right now.
00:11:04 It was previously 600 kilobits per second, and now I am downloading at 36 megabytes per second.
00:11:12 You can find Warp by just typing Warp into Google; you will get to this page.
00:11:18 You see, Warp. Just install it and start it
00:11:20 if you get errors during the installation, and the installation will become many times faster.
00:11:27 So the installation is continuing.
00:11:29 It has already passed step 1, and I had an error in step 1 previously. Alright, now it is downloading
00:11:36 the SDXL base model since we picked it, but if you already have the models, you don't need to
00:11:42 re-download them. I will show you how to use your existing models as well. So the installation has
00:11:48 been completed. This page automatically started and opened. You see, it is saying that backends
00:11:54 are still loading on the server, and now that message is gone. You can also see the
00:12:00 progress in the CMD window. You see, it started on port 7822, and this "model not found" message is not
00:12:08 important. Let's close it. So this is the interface of Swarm UI. I know it
00:12:14 may be a little bit overwhelming at first, but don't worry. Also, it didn't obey our selected theme. First, let's make
00:12:23 a few settings, then start downloading Stable Diffusion 3 and start using it. First of all, let's go
00:12:29 to the users tab from this menu, into the user settings. I am going to change
00:12:35 my theme. Okay, let me zoom in. Here. So you can change the themes from here. I am
00:12:40 going with modern light, and that's it. Then you see, there are so many other settings that
00:12:47 you can read and change. There is one other setting that I am going to
00:12:51 change, which is the save format of generated images. I am going to use PNG because PNG is lossless.
00:13:00 So it is best, and you see, it's already saving the metadata.
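Since the generation metadata is embedded in the saved PNG files, you can also read it back outside the UI. A minimal sketch, assuming a hypothetical image path; the exact metadata keys depend on the SwarmUI version:

```python
from PIL import Image  # pip install pillow

# SwarmUI stores the generation parameters in the PNG's metadata chunks.
img = Image.open(r"Output\local\raw\2024-06-14\example.png")  # hypothetical path
for key, value in img.info.items():
    print(key, ":", str(value)[:200])  # truncate long values for readability
```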
00:13:04 Also, you will notice that there are question mark icons here.
00:13:08 When you click these question mark icons, they show you a description of what each thing does.
00:13:15 You see, each one is like this, so you can click them to see the information regarding each option.
00:13:21 These are the only settings I'm changing: PNG output and the theme. Okay, let's save.
00:13:28 Then, first of all, we need to download Stable Diffusion 3. To download it, go to the utilities tab here,
00:13:35 and in this section you will see the model downloader. This is a very convenient way of downloading
00:13:40 models from Hugging Face or CivitAI. So return back to the Patreon post, and in here you see
00:13:47 the SD3 model download link. Right-click and copy the link address. Return back to the model downloader,
00:13:52 paste it, and it says that the URL appears to be a valid Hugging Face download link and asks under which
00:13:58 name you want to save it. Let's say Stable Diffusion 3 medium, click download, and it will start
00:14:03 downloading that model into the correct folder. This is the base model.
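If you ever want to script the same download instead of using the UI, here is a hedged sketch using the huggingface_hub package. The repo and file names are as published by Stability AI; the repo is gated, so accept the license on Hugging Face and log in first (e.g. via `huggingface-cli login`), and the target folder below is an assumption based on the install path used in this video:

```python
from huggingface_hub import hf_hub_download  # pip install huggingface_hub

# Fetch the raw SD3 medium checkpoint into SwarmUI's Stable Diffusion models folder.
hf_hub_download(
    repo_id="stabilityai/stable-diffusion-3-medium",
    filename="sd3_medium.safetensors",
    local_dir=r"R:\SwarmUI\StableSwarmUI\Models\Stable-Diffusion",  # assumed path
)
```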
00:14:09 If you are downloading different kinds of files, you need to select the type, like LoRA, VAE, embedding, or ControlNet. And where are the models
00:14:14 saved? Where should we save them? When you click server and then server configuration, you
00:14:21 will see the settings related to the file paths. There is a model root, like models;
00:14:27 you can give your Automatic1111 Web UI model
00:14:30 root as well. SD model folder: this is where the models are put. You see, the root folder is
00:14:37 the models folder here, so it includes all of the models. Then SDModelFolder is the Stable
00:14:44 Diffusion folder where the safetensors files are downloaded. Then there is also the VAE folder here.
00:14:52 Currently, Stable Diffusion 3 has an embedded VAE, so we don't need an external VAE.
00:14:57 And then there are also the SD Embedding Folder, SD ControlNets Folder, upscale models folder, and TensorRT folder.
00:15:05 So you can set the folder paths from here.
00:15:08 You can give your Automatic1111 Web UI model folder as well.
00:15:12 But once you change and save that, you should restart Stable Swarm UI to
00:15:19 avoid errors. Okay, let's return back to the model downloader, and it is done.
00:15:24 Then let's return back to generate. This is where we generate the images.
00:15:29 How do we use this interface? First of all, let's go to the models tab here. This will display
00:15:36 our currently available models. You see, Stable Diffusion XL base is here. Then I will
00:15:42 click this refresh button; it will refresh the interface, and you see, Stable Diffusion 3
00:15:47 has also arrived. So you need to pick your model from either here or from here. You
00:15:53 see the left model selection dropdown. Let's select the model from here. And when I click and select here, you see,
00:15:59 it is already selected there as well. Then there are the core parameters.
00:16:04 Images: each core parameter has a description like this; this one sets how many images we want to generate.
00:16:09 Let's generate 3 images. Seed: do you want a static seed or not?
00:16:13 When you set it to -1, it will generate a random seed every time. And the number of steps:
00:16:21 I find that 40 steps is best. CFG scale:
00:16:25 when you click this icon, you can see the effect of CFG scale. I prefer
00:16:29 7, which is the default. Variation seed is used to slightly change the output of the image.
00:16:37 You can see that: variation seed and variation seed strength. Currently, we don't need that.
00:16:42 Currently, we don't need anything else. We just need to set the resolution, because
00:16:45 this model is 1024x1024 pixels by default, like the SDXL base model.
00:16:52 So I am going to set this aspect ratio to custom from here, to be able to set any width and height.
00:16:58 You can also set it to 1:1, and you see, it's automatically 1024 by 1024. And there is also sampling.
00:17:06 I find that UniPC is the best sampler for both SDXL and Stable Diffusion 3, and for SD3 the normal scheduler is best.
00:17:16 I did a lot of testing before making this video. And SD3 text encoders: now, this is important.
00:17:22 When you don't enable anything, it is going to use the CLIP text encoders,
00:17:27 which I explained in the beginning. However, the true power comes from T5.
00:17:32 So first of all, let's make a test with the CLIP-only text encoder.
00:17:37 Then I will show you the difference with the other text encoders. Seamless tileable: you see, this is what it does.
00:17:43 So always click these icons to see what they do.
00:17:46 Initial image: this is the image-to-image of Automatic1111 Web UI; we don't need it yet.
00:17:51 Refiner is used to improve the image after it is generated, like upscaling.
00:17:57 I will also show you that, so we don't need it right now. So, all the other options are here.
00:18:02 We don't need any of them right now, because we are doing an initial test.
00:18:06 So I have copied this prompt from CivitAI and pasted it here, then hit generate to start the generation.
00:18:14 When you generate an image for the first time, it is going to download the CLIP models into the correct
00:18:20 folders, because we didn't download them and we are using SD3, so it is downloading the CLIP models.
00:18:26 They are downloaded, and then it will show you
00:18:30 3 current generations, 2 running, 1 queued, because we are generating 3 images, so we
00:18:36 will get 3 images. And where are these CLIP models downloaded? They are downloaded inside here.
00:18:41 Currently, you need to have exactly the same naming for it to work; otherwise it will re-download the models. So
00:18:47 these are the names; they are not the same as those uploaded in the Hugging Face folder. And the images are generated.
00:18:55 So how do we see the images? You can click here to see the images; also, you see, it is
00:19:00 here, or you can go to the image history and refresh, and you can see every image here.
00:19:06 This Web UI's image saving and reusing is extremely good. When I click here, it will
00:19:12 show me the generated image like this. This is a pretty good image, by the way. You see, it's
00:19:17 looking amazing, and this is it. So this one is pretty good. This one is not very accurate,
00:19:23 and this one is like this. However, remember that this was a CLIP-only generation.
00:19:30 So now we are going to test the other ones. I am going to use the seed of this image.
00:19:35 You see, this one. So to regenerate this image, I will click reuse parameters here.
00:19:41 It is going to load everything from this image into the parameters.
00:19:45 I am going to change the image count to 1. I am only going to generate 1 image.
00:19:50 And you see, the seed is set. So I can regenerate this image.
00:19:54 When I click generate, it should regenerate the same image. Let's generate and see the result.
00:20:00 By the way, currently I am not able to see the generation speed in the CMD window.
00:20:05 So it says it took 15 seconds to generate. And is it the same image? Let's refresh.
00:20:10 Yes, this one and this one. So we regenerated it. So how can I also see the image generation speed?
00:20:17 Let's go to the server, and in here, go to the logs. And in here you see there is info;
00:20:24 make it debug. When you make it debug, it will also show you every step.
00:20:29 And this is my it/s (iterations per second) with Stable Diffusion 3.
00:20:33 And how much VRAM is it using?
00:20:35 To see the VRAM usage, I have opened a CMD window and I will type pip install nvitop.
00:20:42 For this to work, you need to have Python 3.10 installed.
00:20:45 Okay, it is installed. Then just type nvitop and it will show me the memory usage.
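If you would rather query VRAM from a script than watch nvitop's dashboard, here is a minimal alternative sketch using NVIDIA's management library bindings (my own addition, not something shown in the video):

```python
import pynvml  # pip install nvidia-ml-py

# Query used/total VRAM for GPU 0 through NVIDIA's management library.
pynvml.nvmlInit()
handle = pynvml.nvmlDeviceGetHandleByIndex(0)
mem = pynvml.nvmlDeviceGetMemoryInfo(handle)
print(f"VRAM used: {mem.used / 2**30:.1f} / {mem.total / 2**30:.1f} GiB")
pynvml.nvmlShutdown()
```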
00:20:51 So, currently I am using 8.7 GB. However, this is not what SD3 alone uses. To measure that, I will
00:21:00 restart Stable Swarm UI. So let's close this. And this is my VRAM usage; you should try to reduce
00:21:08 your VRAM usage before starting the application. You can reduce it to as low as 500 megabytes on
00:21:14 Windows 10. So we are using 2.2 gigabytes before starting Stable Swarm UI. Let's go back to
00:21:22 the folder and just run the launch Windows.bat file. This was our parent folder, and this is where the launch bat file
00:21:30 exists. It reloaded the UI; then go to the image history, refresh, and click
00:21:36 reuse parameters so everything is set; generate, and let's see the VRAM
00:21:43 usage. So it was 2.5 gigabytes before. Currently we are using 9 gigabytes of VRAM
00:21:50 while generating the image. We should also watch the peak. Okay, the peak was 11.5
00:21:57 gigabytes, when it was decoding the latent image with the VAE. So it was around 11.5 gigabytes at peak. Overall,
00:22:05 it uses around 9 gigabytes of VRAM. Now let's see the difference with the other text encoders. I am going to
00:22:12 try only T5. T5 is extremely powerful. However, it is also going to use more VRAM on my machine.
00:22:19 If your machine has less VRAM, I think it is going to load it into RAM
00:22:25 instead of VRAM, so you should still be able to use it. Just try and see. Okay.
00:22:29 Let's generate. This time it will download this text encoder model. By default it
00:22:36 downloads the fp8 version, not the fp16 version, and uses that. However, you can also
00:22:43 use the fp16 version. How can you use it? Go to the files and versions page that I showed
00:22:49 you in the beginning and download this text encoder model; this is the fp16 version. After downloading
00:22:56 it, move it into the models folder, inside
00:23:00 clip, and paste it there. Then you need to rename it exactly as this one. You see,
00:23:06 this is the rename that you need; otherwise it will not work. Currently,
00:23:09 this is a Stable Swarm UI requirement; this is how it works. However, you probably
00:23:16 will not gain anything from it, because the FP8 version works as well as the FP16 version.
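As a sketch of that manual move-and-rename step: the folder and the target file name below follow the common ComfyUI-style naming convention and are my assumption, so check the exact name of the fp8 file SwarmUI downloaded and mirror that convention for fp16:

```python
import shutil
from pathlib import Path

# Assumed paths and target name (ComfyUI-style naming) - verify both before running.
clip_dir = Path(r"R:\SwarmUI\StableSwarmUI\Models\clip")
src = Path.home() / "Downloads" / "t5xxl_fp16.safetensors"

# SwarmUI matches text encoders by exact file name, so the rename must be precise.
shutil.move(str(src), str(clip_dir / "t5xxl_fp16.safetensors"))
```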
00:23:22 So we are waiting for the text encoder to be automatically downloaded
00:23:27 by Stable Swarm UI. It is downloading pretty fast
00:23:30 right now. Let's watch the VRAM usage as well. Since Stable Swarm UI uses
00:23:35 ComfyUI, it has a huge amount of performance tweaking and performance
00:23:40 improvements, so if your VRAM is not as high as mine, it will probably load the model
00:23:45 into RAM to optimize VRAM usage, and it should still work.
00:23:51 You should try it. And this was the VRAM usage on my computer with only T5, and
00:23:57 this is the result we got. So how can we compare?
00:24:00 This is the result with T5 and this is the result with the CLIP-only version. So, T5 versus CLIP only.
00:24:09 So, T5-only versus CLIP-only images.
00:24:14 Now let's regenerate it with the best configuration, which is, let's go here, CLIP plus T5.
00:24:23 So this is the best configuration; let's generate. By the way, to judge
00:24:29 its accuracy, we need to look at the prompt and see how well the image matches the prompt.
00:24:36 So this is a very detailed prompt. When you're using T5, you can write a very detailed prompt,
00:24:42 because this text encoder is extremely powerful. And you see, when I use both of them, I get an
00:24:48 even more amazing image, like this. It is very detailed. Of course, there are some mistakes,
00:24:54 but it is very powerful. I can generate several images and pick the very best one. So, for
00:24:59 example, let's generate 10 images with this configuration and see
00:25:04 the result. I'm going to click here to set the seed to random, and let's generate 10 images. You see, it
00:25:09 says 10 current generations, 2 running, 3 queued. And we can see the progress in the server,
00:25:15 in the logs. Let's look at the generation speed when we use the best text encoder. Let's set
00:25:22 this to debug to see it. Let's scroll down. So currently I am getting 2.5 it/s on an
00:25:30 RTX 3090 GPU; it is using 18 GB of VRAM currently on my system. I have 24 GB with the RTX 3090.
00:25:40 And that is a very good speed, 2.5 it/s. This is a very decent speed,
00:25:45 considering that we are using a very powerful text encoder. Let's go back to generate.
00:25:50 We can also see how it is being generated. And we can see the generated images already.
00:25:56 Okay, these are very, very good, very high-detail images. Wow, this one is also really good, looking really good.
00:26:04 And since this is a new model, we need to figure out how to prompt it accurately.
00:26:08 This is just a 2 billion parameter model. Hopefully, Stability AI will also release the much more powerful 8 billion
00:26:16 parameter model, and it will hopefully be many times better than this model.
00:26:21 And there will also be community fine-tuned models.
00:26:25 They will be very powerful with these text encoders and with the VAE
00:26:30 of Stable Diffusion 3, because the VAE of Stable Diffusion 3 is many times more
00:26:35 powerful. What do I mean by that? In the research paper there is a VAE channels comparison. When
00:26:42 you scroll down, you will see that a 4-channel VAE is able to regenerate this image, an 8-channel VAE
00:26:49 is able to regenerate this image, and a 16-channel VAE is able to regenerate this image, and SD3,
00:26:57 Stable Diffusion 3, is using a 16-channel
00:26:59 VAE. So therefore, its VAE is extremely powerful. And if you wonder what a Stable Diffusion
00:27:06 VAE is, this is the description of it. Just pause the video and read it if you want. Okay, our
00:27:12 10 images are generated. Let's look at them. So this one, okay, this one is pretty cool.
00:27:19 This one, this one, and these images. Now, how can we further improve this image? We can
00:27:27 change how we prompt, and we can also
00:27:30 upscale this image. I have also done some upscaling testing, and now I will show you how to upscale your
00:27:37 images to get even better quality results. So when you return back to this page, you will see that
00:27:43 there are Helaman upscalers with thumbnails. When you click here, you will get to this page, OpenModelDB.
00:27:50 Maybe you already know this website, where the upscaler models are listed. You can download
00:27:58 the upscaler that will be useful for you. And you can see which upscaler works on which kind of
00:28:05 images. You see, there are so many different upscalers for each task. I'm going to use a new upscaler.
00:28:12 You will see that there is "download very best new upscalers released sorted by name". When
00:28:18 we go to this link, it will open this developer's GitHub page, and you will see the latest released
00:28:25 upscalers. There is 4x real web photo. This is a very good upscaler. I will download this one.
00:28:31 When you click that link, you will get to this page, and there are some examples of upscaling, like this.
00:28:38 At the very bottom of the GitHub page, you will see this link: 4x real web photo version 4.pth.
00:28:48 This is the file that we need. Just download it. There is also a safetensors version.
00:28:53 I wonder if it works. Let's also download it, and we can test both of them.
00:28:56 Go to your downloads folder, cut, and move it
00:29:00 into the models folder inside StableSwarmUI, into upscale models, and paste it there.
00:29:07 You see, this is the folder path where you need to put it. Then, in here, we will use the refiner.
00:29:14 We will enable it, and we are going to change the refiner upscale method. Okay, it is not visible yet,
00:29:21 so I'm going to restart to get it. Just close this CMD window, go back to StableSwarmUI, and run the
00:29:28 launch Windows.bat file. Okay, it reloaded very fast. And whichever image you want to upscale, for example,
00:29:36 let's upscale, maybe, let's see. Okay, let's say we upscale this image. So click reuse
00:29:44 parameters. It reset every parameter. Then, in here, I will use this one. Okay, it also
00:29:51 supports safetensors, very good. This safetensors file is a safe file; it cannot contain any malicious code.
00:29:59 Refiner control percentage: now, this is very important. This is the same as the denoising strength that
00:30:05 we have in Automatic1111 Web UI. So this decides how much change you want. I will make
00:30:12 this 50%, and you see, there are refiner methods: post apply, step swap, and step swap noisy. When
00:30:19 you click this icon, you will see the difference between them. Refiner steps is how many steps you
00:30:23 want to use; you can set it. So I will just set it to 40
00:30:28 as well, like this, and you can change the refiner model.
00:30:31 I am going to use the same model as the base model, but you can use any model for the refinement
00:30:36 process. What this does is: based on your refiner upscale method, it is going to use that new
00:30:43 model during the upscale. Latent and pixel are different; you can test them and see the differences.
00:30:50 And then there is also refiner do tiling. If you do tiling, it can cause seaming. What does seaming mean?
00:30:57 Seaming means that, let's say, it may generate multiple heads in this image.
00:31:03 However, it will also reduce the VRAM usage.
00:31:05 But with this upscaler, I am not going to use refiner do tiling.
00:31:09 And I am going to upscale 1.5 times. You see, there is the upscale multiplier.
00:31:14 Let's make it 1.5 and generate. So currently, it is generating 10 images, because it was set to 10
00:31:22 last time. I am going to just cancel the operation by clicking this X icon.
00:31:26 Then I will set the number of images to 1. How did I notice?
00:31:30 Because it's shown here. Let's hit generate. You see, one current generation, one running, and now
00:31:36 first it will generate the base image, then it will use this upscaler to upscale. Since I am using
00:31:43 both the CLIP and T5 text encoders, it is using a lot of VRAM initially; then it will start the
00:31:49 upscaling process. Okay, it didn't start. Why? Because I didn't enable it. Don't forget to enable these
00:31:57 features. Let's enable it and generate.
00:32:00 And it will generate. So where are these images saved? They are saved inside the
00:32:04 output folder, inside local, inside raw, and then the date. And we can see all the
00:32:10 generated images saved here. There is also the image history. You can search,
00:32:16 you can sort by name or date, and reverse sort. You can see the depth like this. This is
00:32:21 a very, very convenient way to work. Very, very good. There are also presets and
00:32:26 wildcards. I will also show them in this tutorial. Okay, the upscaling process started.
00:32:32 So it is using the same amount of VRAM, nothing new.
00:32:35 Let's go to the server and see the speed of the upscale. Let's go to debug.
00:32:41 By the way, when you use tiled upscaling, it runs at the same speed as the original, because
00:32:46 it splits the image into 1024-pixel tiles and upscales them.
00:32:51 However, it causes some issues when you have a higher refiner control percentage.
00:32:57 And let's refresh here, and the image has arrived. Okay, wow, just wow. Look at this quality.
00:33:04 So let's open this in a new tab and let's also open the original image.
00:33:09 This was the original. Let's compare. So this was the original image and this is the upscaled image. You see?
00:33:15 Now I will show you a comparison on imgsli.com. Let's make a new album. Then let's add the images.
00:33:21 So this is the first image and this is the second image. Let's upload. I like this comparison website.
00:33:28 Okay, let's make this full screen.
00:33:30 Okay, so the left one was the original image, low resolution 1024, and the right one is the 1536 upscale.
00:33:39 You can see how much detail, sharpness, focus, and clarity is added. This is just mind-blowing.
00:33:46 This upscaler is also very good. There are so many other upscalers; you can try them.
00:33:51 You see, amazing, just amazing. You can see how much difference it makes.
00:33:55 It is just mind-blowingly amazing quality.
00:34:01 Then there is also a link here to download all of the upscaler models.
00:34:06 When you go to this link on the Patreon post, you will get to this page.
00:34:10 Here, this developer uploaded all of the models as zip files.
00:34:16 You see, there are two zip files, because he had to split them to stay under 2 gigabytes each.
00:34:22 So download both of them and extract them into the upscaler models folder,
00:34:27 and you will be able to use all of this developer's upscaler models.
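A small sketch of that download-and-extract step; the folder path and archive names are assumptions based on the install location used in this video:

```python
import zipfile
from pathlib import Path

# Assumed folders - adjust to your own install location and actual download names.
upscale_dir = Path(r"R:\SwarmUI\StableSwarmUI\Models\upscale_models")
downloads = Path.home() / "Downloads"

# Extract both split archives straight into SwarmUI's upscaler models folder.
for archive in sorted(downloads.glob("upscalers_part*.zip")):  # hypothetical names
    with zipfile.ZipFile(archive) as zf:
        zf.extractall(upscale_dir)
```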
00:34:30 So what other features does this application have? One of the very best
00:34:34 features of this application is presets. You can create as many presets as you
00:34:40 want and use them with one click. So when I click create new preset here, I can set
00:34:45 the preset image like this, and I can set what prompt I want. You see, there is
00:34:50 "value". When you set value, it is going to append the new prompt that you type, let's
00:34:56 say, here. So my new prompt will go here, and this will be the rest of the prompt.
00:35:03 Then there are other parameters that you can set. Currently it is going to use my parameters set here.
00:35:09 So the sampling should be as here, you see. Then click save.
00:35:14 Okay, we need to set a name; let's say 3D. And save, and it is set.
00:35:20 Now I can change the parameters here, like this. And when I click this, you see, it says one preset
00:35:27 and overriding zero parameters, because it didn't save all the parameters.
00:35:33 So we need to select the parameters from here. Okay, these are the best parameters.
00:35:38 And if you want the refiner step to be automatically applied, you can also select it.
00:35:43 So let's also select it, like this: 0.5. Let's say refiner steps 40. Okay, and the refiner upscale method.
00:35:51 So you can set pretty much every parameter here, and I will make it upscale 1.5 times.
00:35:58 You see, like this: refiner upscale. And the resolution is like this; let's also set it to be perfect,
00:36:04 like this. I will also set the steps count and CFG scale. Okay, everything is set. Then let's just save
00:36:11 it. Okay, it is set and saved. It also shows here. Let's re-select it. Okay, it says
00:36:18 overriding 13 parameters. You can see the overridden ones. Let's just type a new prompt, like "a cat",
00:36:25 and generate. So it will put "a cat" into the value here, and the rest will also be applied.
00:36:31 This is how the presets work. We are getting the cat, and it will also do the refiner step,
00:36:37 because it is set in here. You can also scroll down with your mouse, or here,
00:36:42 to see all the set parameters of this preset. And the first image is generated.
00:36:49 You see, I am generating 10 images, so I am going to click here to cancel.
00:36:53 There is also this arrow. When you click this arrow, you will see generate, generate forever,
00:36:58 generate previews, interrupt current session, interrupt all sessions, and other things.
00:37:03 So you can use generate forever to generate an unlimited number of images.
00:37:08 And this is the cat image we got. It is looking very good, actually.
00:37:13 But you will notice that there are some mistakes at the borders.
00:37:18 This happens when you don't use tiled upscaling.
00:37:22 However, when I do tiled upscaling, it may cause some other problems. So let's click reuse parameters.
00:37:30 All the parameters of this image are set, and I will enable refiner do tiling here and generate.
00:37:38 So it should fix the errors at the borders; however, we may get a seaming problem.
00:37:44 The seaming problem is that it may repeat some of the subjects in the image.
00:37:50 You see, it says that this can fix some visual artifacts from scaling but also introduce others, e.g., seams.
00:37:57 Let's see if we get a seaming problem in this image.
00:38:00 Okay, let's just wait. When doing tiled upscaling, you will notice that the image is split into
00:38:06 tiles like this, and it upscales each tile and then merges them all.
00:38:12 You can also see the progress here. Okay, it is done, and it is still generating ten images, so let's
00:38:18 just stop it. And this is the image. I don't see noticeable seaming in this image, and you
00:38:24 can see that those blurry lines at the borders are gone. So this
00:38:29 fixes that issue. So let's make a comparison. Let's refresh the image history. This is
00:38:35 without tiling, this is with tiling, and this one is without tiling. Let's open them. So
00:38:41 you see, this is without tiling; we can notice these errors at the borders. And this is with
00:38:47 tiling, and you see, there are no such things. And when we compare both, they both look perfect.
00:38:53 I think for this image, with these settings, tiling worked better. So it is up to you to
00:38:59 test and see whether it works well in your case.
00:39:02 Now I will show you another very cool feature. I have prepared 275 presets with their generated thumbnail images.
00:39:12 So click here to go to that post, and in this post you will see that there are several things.
00:39:18 First of all, I have also shared the styles file for Automatic1111 Web UI.
00:39:24 Moreover, I am sharing the necessary files to prepare these presets.
00:39:29 Let's download the zip file and then extract it into our downloads folder. Enter the extracted folder.
00:39:38 You see, there are Python files. I will show you the content of those files as well.
00:39:43 So this is convert Fooocus csv to Swarm preset. This is how it is done.
00:39:49 Then there is also convert Fooocus styles to the Automatic1111 Web UI styles.
00:39:54 This is how it is done. You just set the folder path and the output csv file path.
00:39:59 Here we also set the CSV file; it will be overwritten. And then there is also generate presets thumbnails.
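To give a rough idea of what such a conversion script does, here is a minimal sketch; the CSV column names are hypothetical and the preset fields only approximate SwarmUI's import format, so compare against the real scripts in the downloaded zip:

```python
import csv
import json

presets = []
# Hypothetical CSV layout: one Fooocus style per row with name/prompt/negative_prompt.
with open("fooocus_styles.csv", newline="", encoding="utf-8") as f:
    for row in csv.DictReader(f):
        presets.append({
            "title": row["name"],
            "param_map": {
                "prompt": row["prompt"],  # Fooocus styles embed a slot for the user prompt
                "negativeprompt": row["negative_prompt"],
            },
        })

# Write the combined preset list that SwarmUI's import presets dialog can load.
with open("sd3_presets.json", "w", encoding="utf-8") as f:
    json.dump(presets, f, indent=2)
```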
00:40:07 So this is how I generated the thumbnails. To generate thumbnails, first of all, you need to generate all the thumbnail images
00:40:14 with the grid generation feature I will show. This is not mandatory.
00:40:19 So, you see, there is the Stable Swarm UI presets json file, which we are going to load.
00:40:24 Go to the presets menu and click import presets. Then, from here, click choose file.
00:40:29 Enter your downloads folder, select this presets json file, and open it.
00:40:34 Then you can also enable overwrite existing presets if you have the same preset name, which I don't have now.
00:40:40 Okay, click import, and you see, all the presets of the amazing Fooocus are now available in Stable Swarm UI.
00:40:50 So this is how I prepared and generated them. For example, let's generate this image with dark fantasy.
00:40:57 Okay, to generate it with dark fantasy,
00:41:00 I'm going to just type "a cat" and click here, and that style is applied; hit generate.
00:41:07 By the way, this preset doesn't currently have upscaling enabled. Okay, it is generating the image.
00:41:13 We are still at 10 images, because we reused the parameters.
00:41:17 Moreover, it also applies the settings that you have which do not conflict with the applied style.
00:41:25 Also, it is applying the refiner step right now, because in these
00:41:30 presets there are no such settings. Let me show you one of these presets. It has the prompt,
00:41:36 it has the negative prompt, 40 steps, CFG scale 7, the sampler is UniPC, the scheduler is normal, and it uses the
00:41:43 CLIP plus T5 encoder, like this. You can also edit this JSON file before import and change the
00:41:52 settings as you wish.
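For orientation, a single entry of that presets JSON looks roughly like this, written here as a Python dict. The field names follow my reading of SwarmUI's preset format and the values shown on screen, so compare with an entry from the actual file before relying on them:

```python
# One preset entry, roughly as shown on screen; field names are my approximation.
dark_fantasy_preset = {
    "title": "Dark Fantasy",
    "param_map": {
        "prompt": "{value}, dark fantasy style ...",  # {value} receives your typed prompt
        "negativeprompt": "...",
        "steps": 40,
        "cfgscale": 7,
        "sampler": "uni_pc",
        "scheduler": "normal",
    },
}
```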
00:41:58 Okay, the image is generating and getting upscaled, and we got the first image. You see, with this style,
00:42:00 I was able to generate this amazing image, as you are seeing right now. It is
00:42:04 very, very good quality. Okay, let's just
00:42:06 cancel it. Now, what other things are there? Another great feature
00:42:12 is the model browser. When you click the models tab, it will list all of
00:42:17 the models for you. You see, there is also
00:42:20 categorical filtering. So when you click root here, it will show you all of them.
00:42:25 When you click a folder name, it will show you the models under that folder. You can also see the folder structure
00:42:30 here. You can also filter by name. There are also cards, small cards, and big cards, so you can
00:42:36 change these as well. Thumbnails, small thumbnails. This applies to presets too, like big cards like
00:42:44 this, or small thumbnails, or thumbnails. You can change both of them. And in the image history, you
00:42:50 can also change them to small thumbnails, big thumbnails, or big cards. So it is totally
00:42:56 up to you to change these. It is very versatile. In the models tab you can also set the
00:43:04 image from edit metadata, and in here you can say use this image and save,
00:43:10 so it will use that image as the image of the model.
00:43:15 For some reason, it didn't save with an accurate size. Yeah, I'm going to tell the developer,
00:43:22 so hopefully it will be fixed. Also, when you click here, you see, there are also create TensorRT engine
00:43:29 and set as refiner. Hopefully, in another tutorial, I will show you more detailed stuff regarding
00:43:36 these as well. Okay, after I reported this error, the developer fixed it already. So
00:43:42 let's also try the fixed version. I'm going to update the application one more time. Let's
00:43:47 go to the folder and run the update Windows.bat file. You see, one file changed; he already fixed
00:43:54 it, and done. Then restart the application. Okay, let's return back here.
00:44:00 Click here, edit metadata and use image, save. Okay, it says the server has updated,
00:44:07 so we need to refresh it. Okay, I refreshed the page. Let's click edit metadata, use image.
00:44:15 And yes, you see, it is already fixed, amazing. The updates and the communication with the developer
00:44:22 are just amazing with Stable Swarm UI. Now, what other things are there?
00:44:27 First of all, I suggest you read some of the documentation, particularly the full prompting syntax. This is
00:44:34 super important. When you click this link, you will see the advanced prompt syntax here. It explains everything: weighting
00:44:42 of prompts, alternating, from-to, randomization, wildcards, repeat, textual inversion embeddings,
00:44:51 LoRAs, presets, and automatic segmentation and refining. This
00:44:55 is extremely useful for inpainting faces; it is like the After Detailer extension.
00:45:01 You see, there are also clear transparency and break keywords. So this is extremely important, and when
00:45:06 you go to the app folder here, under features, you will also see other usages, like ControlNet.
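As a taste of that syntax, a few example prompts are sketched below. They are reconstructed from the feature list above, so verify the exact forms against SwarmUI's Prompt Syntax documentation:

```python
# Example SwarmUI prompt snippets; check the official doc for the exact forms.
prompts = [
    "a photo of a (red:1.5) car",                    # weighting: emphasize "red"
    "a cat <wildcard:random_color>",                  # random line from a wildcard file
    "portrait photo <lora:myLora:0.8>",               # apply a LoRA at 0.8 strength
    "a knight <segment:face> highly detailed face",   # auto-segment and refine the face
]
```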
00:45:12 When you read it, you will see the ControlNet documentation. Hopefully I will make a ControlNet tutorial as well. I don't
00:45:16 want to put everything into one tutorial, because it would be very, very long. There are also the presets readme
00:45:23 and video. So make sure to read them. There is also the docs app folder, where you will see even more readme files.
00:45:31 You should read them to learn. What other things can you do? Let's also try the
00:45:36 wildcard feature. A wildcard simply inserts random prompts. You see, wildcards are lists of random
00:45:42 prompt segments, one entry per line. Let's create one called random color, like this, and let's add blue, red, and
00:45:51 yellow. Okay, let's save and refresh. Okay, random color has arrived.
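Equivalently, you can create a wildcard as a plain text file with one option per line. A minimal sketch; the wildcards folder location is my assumption, so check the server path settings for the real one:

```python
from pathlib import Path

# Assumed wildcards folder - check Server > Server Configuration for the real path.
wildcards_dir = Path(r"R:\SwarmUI\StableSwarmUI\Data\Wildcards")
wildcards_dir.mkdir(parents=True, exist_ok=True)

# One candidate value per line; referenced in prompts as <wildcard:random_color>.
(wildcards_dir / "random_color.txt").write_text("blue\nred\nyellow\n", encoding="utf-8")
```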
00:45:58 So what we are going to do is type "a cat". When I
00:46:00 type this letter, you see, it shows all of the available wildcards, so I'm going to use the
00:46:06 wildcard that I have made, which is random color; let's type it. Okay. Okay, I don't see random color
00:46:13 here yet; let's refresh. Maybe I need to restart. Okay, I can't see it; maybe I need to click here.
00:46:18 Yes. After I clicked this, it appended it to the prompt like this: "a cat" and the
00:46:25 random color wildcard. You need random seeds to get a different value each
00:46:30 time, so let's disable the refiner and generate 3 images with random seeding
00:46:35 from here. Okay, everything is set; generate, and our 3 images are generated. Let's
00:46:40 see the prompts used for each one of them. Okay, this one used the original prompt, "a cat"
00:46:46 with the random color wildcard. To see the full details, click the images, and at the bottom we can
00:46:51 see what was used. Okay, "a cat blue". You see, this one was randomized to blue. Let's move on.
00:46:58 This was also a cat
00:46:59 blue, and this was a cat red. So this is how wildcards work. Just read the documentation
00:47:07 and you will see. This interface, this web UI, is extremely advanced; there is so much
00:47:12 stuff. So as a next step, I will show you a very powerful thing, which is inside tools.
00:47:17 In here, when you click here, you will see the grid generator. This grid generator is
00:47:23 the equivalent of the X/Y/Z plot of Automatic1111 Web UI, but this one is much more powerful, advanced, and versatile.
00:47:30 I will show you how to use it right now. Before showing you, what I want to do is
00:47:35 put all of the upscalers into the upscalers folder.
00:47:40 So, since we have downloaded them, let's cut them and move them into the upscale folder,
00:47:47 which is inside Stable Swarm UI, models, upscale models, here.
00:47:52 Okay, you need to put both of the zip files here from downloads, like this, and extract them,
00:47:57 and it will extract all of the upscalers as
00:48:00 safetensors files, like this. You see. Let's sort them by size. Okay, I will use the
00:48:06 biggest ones and do some testing to see which one performs better. Okay, there are just so
00:48:11 many. Let's also click yes
00:48:13 to all. Yeah, it is done. So now I need to restart Stable Swarm UI. I will just
00:48:19 do that: let's close it, return back, and run the launch Windows.bat file, and it has started. Okay. So
00:48:27 what should we test? Let's upscale this dragon image:
00:48:30 reuse parameters, so all parameters are set, and let's go to tools. Click here and
00:48:36 select grid generator. Now, there are three options for the output type. I suggest
00:48:41 you use web page. This is the best one. It gives you so many options. You can
00:48:46 also generate just a grid image, or just images. When you use just
00:48:51 images, they will simply be generated as images and saved in the outputs folder.
00:48:55 When you generate a grid image, it will just generate a grid image, but when you
00:49:00 generate a webpage, it can continue the generation if it gets interrupted for some reason, and
00:49:08 you can filter between so many different options to view. You see, the output folder name is given like this.
00:49:13 I will just give it a custom name myself. Let's say upscale testing, like this.
00:49:19 So it will be saved inside View/local/grids/upscale_testing, and there is continue on error,
00:49:24 so it will continue the generation. Then, when you click here, you will see all
00:49:30 of the options that you can select. You see, there are so many options. So let's test several things
00:49:35 to be able to compare. Let's test steps. Let's make it 20, then a comma, and 40. If you don't
00:49:43 know how to type them, you can just click examples and it will fill them in with the examples.
00:49:47 Let's just delete those: 20 and 40. Okay, then go to refiner upscale, select it, and click
00:49:53 examples. You see, these are the upscale resolutions. You don't need to set them here if you set them in the main parameters.
00:49:59 So we don't need to set everything here. So what are we going to test?
00:50:04 We are going to go to Refiner and enable it. Refiner control percentage, actually, I will test this too.
00:50:10 And let's set the steps to 40 from here, so it will use that.
00:50:13 Refiner upscale 1.5, so it will use that. Okay, so what I am going to test is the refiner upscale method.
00:50:22 When I hit backspace in here, it will show me all of the options.
00:50:25 You see, all of the options are here. There are so many options, since we downloaded everything.
00:50:30 So which options do we want to test? Let's go to models, upscale models.
00:50:35 And the biggest ones are, for example, let's see. Yeah, 4x real web photo. This is a very good model.
00:50:42 Let's select it. Okay, 4x real web photo. Okay, here you see 4x real web photo version 4, like this.
00:50:51 Let's try 4x LSD. Okay, here. Let's put a comma, then 4x; you can also type LS.
00:51:00 Okay, it lists all of them. I think it is this one. Yes.
00:51:04 So I'm going to compare two methods and two different step counts, and I will also test refiner do tiling.
00:51:11 Then, in here, I will click fill, so true and false are filled in.
00:51:15 Okay, so how many tests is it going to do? 2 multiplied by 2
00:51:20 multiplied by 2: 8 different tests, because I'm going to generate one image per combination. It
00:51:25 will read all of the other parameters from whatever I set here.
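The grid generator simply takes the Cartesian product of the axis values, which is why two values on each of three axes gives 2 x 2 x 2 = 8 runs. A tiny sketch of the same enumeration (the model names are placeholders for the two selected upscalers):

```python
from itertools import product

steps = [20, 40]
upscalers = ["4xRealWebPhoto_v4", "4xLSD"]  # placeholder names for the two models
do_tiling = [True, False]

# Each combination becomes one cell of the grid: 2 * 2 * 2 = 8 generations.
for combo in product(steps, upscalers, do_tiling):
    print(combo)
```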
-
00:51:30 And just click generate grid and you see 9 current generations 5 queued waiting on model
-
00:51:37 load we can also go to the server, logs and debug here and we can see what is
-
00:51:45 happening. This is at the back end running with the ComfyUI, but this is saving us huge amount of time,
-
00:51:52 effort. We don't need to use ComfyUI those annoying these nodes, so it is handling
-
00:52:00 everything for us automatically. So let's go back to the server. You can see the what is happening here.
-
00:52:06 ComfyUI back-end direct web socket request. 2.6 it per second. Let's go to the generate.
-
00:52:12 9 current generations, 2 running, 3 queued. And let's see the server. Okay, so it is generating images.
-
00:52:19 Let's go to the grids inside here. And inside output, inside local, inside grids. We can see the
-
00:52:27 folder, upscale testing, and we can see the generated images. Currently none is generated because upscaling will take more time.
-
00:52:37 Okay. Okay, it says that, yeah, completed generation 1 of 10, refiner do tiling True.
-
00:52:43 So these were the settings and it is saved. Let's go to 201. You see, it shows upscaler method name
-
00:52:51 and this was the upscaled image. Okay, the others are also getting generated.
-
00:52:57 So we just need to wait and it is running. Let's say something happened and you are interrupted.
-
00:53:04 So how you can continue? To continue, first of all, I will just cancel the operation.
-
00:53:10 You see it is canceling. Yes, generation session interrupted.
-
00:53:13 Then I will click load grid config and it will show you the history.
-
00:53:19 Currently I only have one history of grid upscale testing, load grid config.
-
00:53:23 Then make sure that you have the same output folder name to continue. It says output will override
-
00:53:29 existing folder. Okay, it's fine. And hit generate grid. Now it should skip what it
-
00:53:36 has generated and continue. We can see that on the CMD window. Okay, it says it skipped one
-
00:53:42 file because only one generation had been completed. So it will generate the remaining
-
00:53:48 seven files. This is just amazing. I wish Automatic1111 Web UI also
-
00:53:54 had this grid generator. You can put many axes here, like
-
00:54:00 X/Y/Z and any others. We are not limited to testing only 3.
-
00:54:05 I can add as many as here and I can compare all of them after it.
-
00:54:09 I will show you; this is just the very best grid comparison tool available.
-
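The resume behavior shown above (skip any grid cell whose file already exists, generate the rest) is simple to picture. A minimal sketch, assuming one output file per cell named after its parameters; SwarmUI's real naming scheme may differ:

```python
# Sketch: resume an interrupted grid by skipping cells that already have output.
from itertools import product
from pathlib import Path

out_dir = Path("Output/local/grids/upscale_testing")  # folder name from the video
out_dir.mkdir(parents=True, exist_ok=True)

for steps, tiled in product([20, 40], [True, False]):
    target = out_dir / f"steps-{steps}_tiled-{tiled}.png"
    if target.exists():               # finished before the interruption
        print("skipped", target.name)
        continue
    target.write_bytes(b"")           # stand-in for the real image generation
    print("generated", target.name)
```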
00:54:15 Okay, so the grid generation has been completed. I had to restart the application one time and continue
-
00:54:22 because VRAM was full, since we have used very heavy upscaling methods.
-
00:54:28 So how are we going to see this grid
-
00:54:30 after it has been completed? Click here to open the folder and you see it is already opened.
-
00:54:37 Now, the interface may be a little bit hard to understand in the beginning. When you click
-
00:54:43 advanced settings, you will see that there are amazing options to hide each one of the parameters
-
00:54:50 as you wish, like here. Also, you can change which parameter will be displayed where. For example,
-
00:54:57 when you set both of them to upscale method, it will show upscale method here and here.
-
00:55:02 Refiner do tiling, it will change the order. So you can play with these orders,
-
00:55:08 display different ones of them as you wish. So let's set them. So this true is
-
00:55:15 tiled and this false is not tiled. The upscaler method is also displayed here. Let's see.
-
00:55:23 Okay. So this is not what we want. We need to change it to the version we want.
-
00:55:28 Okay, now it displays steps on the left. It displays false and true and the upscaling method. Okay, this
-
00:55:36 is the upscaling method, 20 steps and 40 steps, and we have tiling True, and we have the other
-
00:55:45 one. So this upscaler obviously didn't work as we wanted. Model 4x
-
00:55:50 real web photo version 4. Okay, this is weird,
-
00:55:54 because this one is perfect, so we have an error somewhere. Okay, the error is the number of steps.
-
00:55:59 Okay, 20 steps resulted in this abomination, unfortunately, but 40 steps made this one, and this has tiling True.
-
00:56:10 And now I will show you the tiling True effect. You see there are eyes and another head here.
-
00:56:17 So this is called seaming. When you do tiled upscaling, you need to go with a lower denoise strength.
-
00:56:25 The denoise strength is the refiner control percentage. You need to go lower
-
00:56:30 to prevent it, like 30% instead of 50%. However, when you don't do a
-
00:56:38 tiled upscale, which is the second option here, you see false. This false is the tiled
-
00:56:44 upscale here. So I can also disable it from here, you see false and true. So when
-
00:56:50 tiled upscale is false, we won't get that seaming. So in this image, actually the seaming
-
00:56:56 is much worse. You see, there is seaming here, here, here, and here.
-
00:57:01 And when the tiled upscale is disabled, the seaming is extremely reduced. However, now we can see the
-
00:57:08 degradation in quality at the borders because we upscaled to a huge resolution. But in the bottom, with 40 steps,
-
00:57:18 it is looking many times better. 40 steps is a very good spot. You can go to 100 steps.
-
00:57:23 It works even better with more steps. So this is it. And then there is this third one,
-
00:57:29 which is another upscaling model, 4x LSDIR HAT. This is a very heavy model, by the way;
-
00:57:37 it requires a huge amount of VRAM. This is the upscale with tiling,
-
00:57:42 and you see there is a lot of seaming, repetition of the subject; the head is also generated here
-
00:57:47 and also here. And in the bottom you see there is also a head here because of the tiling.
-
00:57:53 When we don't do tiling, this is the result and this is the result. So how can we actually
-
00:57:59 compare them? To compare them I'm going to remove the 20 steps. I'm going to remove the
-
00:58:07 tiling true and now I can compare side by side two models and decide which one is looking
-
00:58:14 better. Actually, both of them are looking very good. So this is the first and this
-
00:58:19 is the second image. First and second image. Okay, now it is more visible like this. You
-
00:58:25 see. Okay, in the right one there is also another nose here,
-
00:58:30 you see, and this is the one. So let's copy and use imgsli to compare. Okay,
-
00:58:36 let's click new album.
-
00:58:37 And can we paste an image? No, so what can we do? So this is the model 4x real web photo.
-
00:58:45 Let's save this
-
00:58:47 as model 4x real web photo and let's save this one as model 4x LSDIR HAT. Okay,
-
00:58:56 like this. Let's add the images and upload. I really like this
-
00:59:00 website, it is really cool, and we can see now. Okay, now the left one is 4X LSDIR. This
-
00:59:07 uses very heavy VRAM, and the right one is 4X real web photo. And I can say that yeah, LSDIR added
-
00:59:15 more details. I can see that it is sharper. You can see on the scales, for example; let
-
00:59:20 me show you. The scales are looking more detailed. The eyes are also looking more detailed. Let's
-
00:59:26 see up close. Yes, yes, it added
-
00:59:29 more details, that's for certain. It is sharper, more focused. However, it also
-
00:59:35 added some other nose here, but I think it is looking very, very good. Yeah, this
-
00:59:41 upscaler is very, very good, but it requires a lot of VRAM. So if you don't
-
00:59:45 have 24 gigabytes of VRAM, you probably won't be able to use it, but it is just
-
00:59:49 amazing, amazing quality, you see. So this is how you can use the grid. You can use this
-
00:59:55 for anything. I have actually done even bigger tests. Let me show you; it was inside
-
01:00:00 output, inside grids. This is another installation of mine. You see there are a lot of tests.
-
01:00:05 Okay, this one is 140 megabytes. And you see I have tested a lot of different options here.
-
01:00:11 I have tested the sampler, steps, CFG value. So I can see them differently: prompt,
-
01:00:18 text encoder, CFG, steps. You can display any one of them as you wish. This is a really,
-
01:00:24 really powerful tool. Just play with it to understand it better. Okay, another very useful
-
01:00:30 feature of the Stable Swarm UI is the image history. In the image history there
-
01:00:35 is already a folder structure, like raw outputs. They can be categorized by the
-
01:00:41 folder names, like grid outputs with the grid testing names; from here you see upscale
-
01:00:46 testing, or the raw outputs. Moreover, you can also filter them by a prompt that
-
01:00:52 you used. Let's say cat, and it will filter all the images that contain the word cat.
-
01:01:00 Actually, it is doing not only a whole-word search but a substring search.
-
01:01:04 So delicate is also displayed; you see, cat is here and the "cat" inside delicate is here.
-
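As an illustration of that substring matching, here is a rough sketch of filtering saved images by prompt text. It assumes generation parameters are embedded in each PNG's text chunks (SwarmUI writes metadata into saved images and keeps separate database files, covered below; the folder path and storage details here are assumptions):

```python
# Sketch: substring search over prompt metadata embedded in saved PNGs.
from pathlib import Path
from PIL import Image

def find_images(folder: str, query: str):
    """Yield paths of images whose embedded text metadata contains `query`."""
    q = query.lower()
    for path in Path(folder).rglob("*.png"):
        info = Image.open(path).info  # PNG text chunks (prompt, parameters, ...)
        blob = " ".join(str(v) for v in info.values()).lower()
        if q in blob:                 # substring match: "cat" also hits "delicate"
            yield path

for p in find_images("Output/local/raw", "cat"):
    print(p)
```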
01:01:10 So this is how useful it is. If you remember a word from among your generations,
-
01:01:17 you can find them very quickly with filtering. You can also sort them by date or by name.
-
01:01:23 You can also reverse sort. You can click here and display all the contents of that folder.
-
01:01:29 You can search among that folder.
-
01:01:31 You can go to another folder and search in there. This is just amazing. This is very easy to use.
-
01:01:38 Automatic1111 Web UI is missing this feature, but you see with Stable Swarm UI,
-
01:01:43 you can quickly find your generated images, reuse parameters and regenerate them. This is just amazing.
-
01:01:50 Just click and see them. The image history is just an amazing feature of
-
01:01:53 Stable Swarm UI. And if you are wondering how Stable Swarm UI is
-
01:01:57 able to do that: when you go to the
-
01:02:00 output, go to the local, go to the raw, you will see that each folder has database files
-
01:02:06 at the end: image metadata and image metadata log. It is using these files for
-
01:02:14 image history management. This is just amazing. You can also directly delete images
-
01:02:20 from the gallery, you see there is an icon in every image. When you click this icon, you
-
01:02:26 will see that star, open in folder, download, or delete
-
01:02:30 option. So when you click delete, it will directly delete the image. This is
-
01:02:34 very convenient as well. Moreover, you can star them too. So let's say I want
-
01:02:41 to star this, so I star that image, and your starred images will appear here. You see,
-
01:02:46 when I click it, I am able to quickly find my starred image. This is amazing. You
-
01:02:52 can also directly star images from here as well. For Stable Diffusion 3, upscale 2x is not working, so
-
01:03:00 use the method that I have shown for upscaling. Also, you see the more option is available
-
01:03:06 here too, as well as this icon here for each image. Okay, now as a next step, I will
-
01:03:14 show you how to use SDXL models and LoRAs on this amazing Web UI. Okay, for example,
-
01:03:21 let's try pixel art XL on the base SDXL model. Let's go to the utilities and model downloader.
-
01:03:29 Let's just paste the link. Okay, it says model. So we need to just go to the full link like this
-
01:03:36 and just paste. Okay, it works. Yeah. So this is it. You see, it automatically recognized
-
01:03:41 it is a LoRA model and just downloads it. Sometimes these downloads may be behind a login. In
-
01:03:49 that case, you have to manually download it. So after you download it, you just need to
-
01:03:53 put it inside models, inside LoRA, in here. You see, this is where you need to put them.
-
01:03:59 So it is getting downloaded, and done. Let's go to the generate option. Let's turn
-
01:04:04 off the refiner and let's set
-
01:04:07 this to one. So I'm going to use the SDXL base model. So from here or from here, you
-
01:04:13 can select. Let's select SDXL base and let's say a photo of a realistic image of a snake with
-
01:04:20 dragon horns, a simple thing. Then let's go to the LoRAs, refresh, it's already here and just
-
01:04:26 click it and you see this LoRA is already
-
01:04:30 selected. You can also put it in here by typing lora and the name of the LoRA.
-
01:04:37 When I hit the space character, it automatically recognized it. So you can
-
01:04:41 use this as a LoRA activation, or you can use the LoRA activation from here,
-
01:04:47 which I prefer. Okay, it is activated and then if you want to set a strength, you
-
01:04:52 see this strength of the LoRA is set here like this. So let's make the strength
-
01:04:57 1 and it is enabled and let's
-
01:05:00 just generate. Yes, oh, by the way, it is in grid mode. So just cancel it, go to the
-
01:05:05 tools. You need to disable the grid generator from here too, like this. Now it will generate. Okay, now let's
-
01:05:11 see the generation. We can see that it is loading the model right now because we switched it to the
-
01:05:16 official SDXL base model, which we downloaded in the beginning, and it is generated. Okay, this is the
-
01:05:23 image we got. A photorealistic image. Let's change "photorealistic" in the prompt because it is affecting the result. Let's try like
-
01:05:30 this, and we can see the pixel art LoRA here. Okay, it applied the strength. It
-
01:05:35 says that LoRA 0 is here. So you can apply multiple LoRAs, and the weight, the power of the LoRA,
-
01:05:41 was 1, and you see
-
01:05:43 it is turned into pixel art. How can I be sure? I can use the same seed from here
-
01:05:49 and I can disable the LoRA and I can regenerate the image. This time I should get a
-
01:05:56 different image, not pixel art, because the LoRA will not be applied. And this is the image without the
-
01:06:02 LoRA. You see, completely different image.
-
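That fixed-seed on/off comparison is the standard way to verify a LoRA. For illustration only, here is roughly the same experiment written against the diffusers library rather than SwarmUI (the pixel art LoRA repo id is an assumption; substitute whatever LoRA file you downloaded):

```python
# Sketch: verify a LoRA's effect -- same seed, LoRA on vs. off.
import torch
from diffusers import StableDiffusionXLPipeline

pipe = StableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16
).to("cuda")

prompt = "a photo of a snake with dragon horns"
seed = 12345  # any fixed seed, so the only difference is the LoRA

pipe.load_lora_weights("nerijs/pixel-art-xl")  # assumed repo id for pixel art XL
with_lora = pipe(prompt, generator=torch.Generator("cuda").manual_seed(seed)).images[0]

pipe.unload_lora_weights()
without_lora = pipe(prompt, generator=torch.Generator("cuda").manual_seed(seed)).images[0]

with_lora.save("with_lora.png")        # pixelated style
without_lora.save("without_lora.png")  # plain SDXL output
```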
01:06:05 So I can enable the LoRA again, and this is how it works. You can also
-
01:06:10 enable multiple embeddings. There are also ControlNets, but for ControlNet, I
-
01:06:15 am planning another tutorial. The usage is the same. You
-
01:06:19 can use SD 1.5 based models as well. You just need to change the resolution.
-
01:06:24 I prefer using UniPC in all models. It is working best in my opinion.
-
01:06:30 You can always test them with the grid, and there is also an image to image tab. So you can also use
-
01:06:35 image to image. For example, let's use an image to image. Okay, let's convert one of the images
-
01:06:41 into this pixel art.
-
01:06:42 Okay, let's convert this image. So you see there is use as init. When I click it,
-
01:06:47 it will use this as
-
01:06:49 an image to image. This is 1536 by 1536. So let's convert this into pixel art. This is the denoising
-
01:06:57 strength. Let's make it 60 percent. Okay,
-
01:07:00 it's fine. You see there are other options as well: mask blur, mask shrink grow. You can click these to
-
01:07:06 read about them. These are the default settings. Okay, let's make a test. I didn't do much testing with this.
-
01:07:13 And generate. Now this will use this as the image to image input image and it will generate it.
-
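What the init image plus denoising strength does is easiest to see outside the UI. A hedged sketch in diffusers terms (SwarmUI's init image creativity plays the same role as `strength` here; the exact mapping is an assumption):

```python
# Sketch: image-to-image -- strength controls how much of the init image survives.
import torch
from diffusers import StableDiffusionXLImg2ImgPipeline
from diffusers.utils import load_image

pipe = StableDiffusionXLImg2ImgPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16
).to("cuda")

init = load_image("snake.png")  # the image you clicked "use as init" on
# strength 0.6 keeps much of the original; 0.9 rewrites almost everything
out = pipe("pixel art of a snake with dragon horns", image=init, strength=0.6).images[0]
out.save("pixel_art_snake.png")
```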
01:07:19 You can also use edit image. This is like the inpainting of the Automatic1111 Web UI. Okay, we got it. So
-
01:07:26 this is how it turned out. We can increase the strength. Let's make it 70%.
-
01:07:31 And let's go to server to see the speed of the image generation with SDXL model.
-
01:07:36 You see it is 3.5 it per second, which is equal to my Automatic1111 Web UI actually.
-
01:07:43 And yes, this is the image. We can even change it further, but it will become
-
01:07:49 really, really different in that case. So let's try 80% perhaps. Maybe it is
-
01:07:53 because the resolution is very big. No, it is actually resizing the resolution. Let's see. Yes, the resolution is
-
01:08:00 resized. Let's try again with the 80%. Okay, this is the image we got. It is becoming
-
01:08:06 more like pixel art. We can also try these other options. Let's try this, for example. Let's hit
-
01:08:11 generate. Okay, it didn't make any difference. There is unsampler prompt. Let's see what it
-
01:08:16 does. This is powerful for controlled image editing. Yeah, it is not very important. Let's
-
01:08:22 see mask behavior. Maybe simple latent will do better. Okay, so this is how you use image to
-
01:08:30 image. Let's try 90% and yes, you see now it is much more like a pixel art. Of
-
01:08:36 course, the image is changed, but yes, it is working. So this is how you use LoRAs.
-
01:08:43 When you click the edit image, this is the inpainting screen. Now, the
-
01:08:47 inpainting is the weakest part of the Stable Swarm UI, but it is getting
-
01:08:52 developed and improved at an amazing speed. There is a programmer genius behind this Web UI. He is working
-
01:09:00 relentlessly. Actually, he didn't sleep for like 30 hours after the SD3 release. So this is the
-
01:09:06 inpainting. Mask the area wherever you want to change. You see there is a radius and opacity.
-
01:09:12 I'm going to set the radius like this, and you can also see the opacity of the masking like this. So the
-
01:09:17 opacity defines how visible it is from here, and the radius defines the radius of the masking. You
-
01:09:25 see the mask is generated here. You can just delete this layer and make another
-
01:09:30 mask. Opacity doesn't matter, but the radius determines how much you are masking.
-
01:09:35 So let's make this like this and let's see. So I'm going to mask this area and
-
01:09:41 I'm going to use the prompt viper snake tail. The LoRA is enabled. Now how much
-
01:09:47 change do I want? Let's set it to 70%. I disabled all of these options. This
-
01:09:53 matters: if you change them, it changes how the image is inpainted. I'm going to use default
-
01:10:00 settings and hit generate. Let's see what we are going to get. By the way, I am now sorting
-
01:10:06 the generations by date, so I will see the last one in here. This is inpainting
-
01:10:12 and the new image has arrived here. When I click this, you see this is the new image.
-
01:10:18 And which was the original image? Let's see. OK, yes, this was the original image,
-
01:10:25 and this is the inpainted image. Of course, this wasn't a big inpaint,
-
01:10:30 but this is how you inpaint parts. You can also use the segmentation feature to inpaint. It is amazing.
-
01:10:38 You can automatically mask, actually. Let's try a better image for a segmentation test. Let's say this. Yes.
-
01:10:45 Let's say reuse parameters. Actually, a blue cat, and segment eyes, and let's make the eyes a different color:
-
01:10:54 yellow cat eyes. Okay, and everything is set. Let's generate. Okay, there is also
-
01:11:00 pixel art selected. Let's also disable it because it is going to load the Stable
-
01:11:04 Diffusion 3 and let's generate. Okay, it is generating the original image, then
-
01:11:09 it is supposed to inpaint the eyes with segmentation. Yes, first image generated.
-
01:11:16 You see it segmented the eyes and it is inpainting. Okay, the inpainting is
-
01:11:21 happening here. Oh, it is generating three images, not inpainting. Let's see. Okay, it did segment the eyes,
-
01:11:30 but I don't see any difference. Weird. What can we do? Let's read the documentation again.
-
01:11:36 It says this is like restore faces. Okay, it says that there is creativity and threshold.
-
01:11:41 Yeah, let's try the creativity and threshold parameters as well. Maybe we need to provide them.
-
01:11:46 So like this, 80% and 0.5. This is for segmenting it. Let's set the number of images. Okay, let's generate.
-
01:11:55 Now it will mask. Yes. Okay, now I can see it is changing the eye. You see.
-
01:12:01 Okay, we typed yellow, so it is generating the same. So let's make this not yellow but blue cat eyes.
-
01:12:08 Let's generate again. You see it did skip the initial image generation because it was already
-
01:12:13 generated and it directly went to the eyes. Okay. Yeah. It didn't change much. Let's make the strength 1.
-
01:12:20 Let's see the difference. Okay. Yeah. It is regenerating. Weird. The prompt is not working as we expected.
-
01:12:27 Let's look at the documentation one more time.
-
01:12:30 Okay, segment text is here. Weird, it draws the eyes but it is not working as expected.
-
01:12:38 Okay, the reason this didn't work is that there was a bug. I reported this bug in their official
-
01:12:46 channel and the developer already fixed it and pushed it to the repository. That is why it is super
-
01:12:54 important for you to update the application before every use. So how
-
01:13:00 are we going to update this application? First of all, close the CMD window that you have,
-
01:13:05 then go back to your installation folder, enter inside it, and you will see the update-windows.bat
-
01:13:12 file here; double click it and it will update the application. You see the update is happening
-
01:13:19 right now. It is pulling all the new changes, and the build succeeded. So the update has been completed
-
01:13:25 and it will automatically close the window. Then let's restart the application
-
01:13:30 from here. So this bug is fixed. Let's reuse parameters from here, and then, which new feature
-
01:13:38 has arrived? In the regional prompting there is a new parameter, which is segment threshold max.
-
01:13:45 You see the maximum mask match value of segment before clumping. I will enable this and I will make this
-
01:13:50 0.60. Then I will regenerate the image and let's see what we are going to get.
-
01:13:58 It is loading the model since we
-
01:14:00 restarted the application. There is also segment mask grow. You see you can also grow the mask of
-
01:14:05 the segmentation. There is also segmentation mask blur. Save segment mask so you can see the
-
01:14:12 segmented mask to understand how your segmentation is working. I will also actually do that in the next
-
01:14:18 generation. Okay, now it is inpainting, and now, yes, I can see that it is changing the eye color
-
01:14:26 as expected. Not perfectly, but it is changing, unlike
-
01:14:30 before. Actually, I spent a lot of time on this yesterday, and you see now it is like
-
01:14:35 this. So we probably need to do some more things. Let's also save the segmentation mask to see
-
01:14:42 what kind of mask we are getting, let's also reduce this threshold value to 30 percent, and
-
01:14:49 let's change the eye color to red eyes and generate. Okay, now we can see the segmentation
-
01:14:56 mask; it is looking accurate. You see both of the eyes are
-
01:15:00 masked when we reduce the threshold to 30 percent. This is amazing. You know, in Automatic1111
-
01:15:07 Web UI we cannot do this directly, and yes, the eye colors are changed as you are seeing. Let's
-
01:15:13 say glowing red cat eyes. Another one. This is just mind-blowingly amazing, and we can see how it is
-
01:15:20 changing. By the way, once I added red cat eyes, you see it changed
-
01:15:25 too much, so I only need to mask the eyes and
-
01:15:30 change my prompt. So probably I need to make it like glowing red eyes and nothing else.
-
01:15:36 Maybe we can change this value to like 0.5 and try again. So play with these
-
01:15:42 values until you get a satisfactory result. This is how you segment a certain part with just the
-
01:15:49 prompt and change
-
01:15:52 that part.
-
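Since the whole workflow is driven from the prompt, it may help to see the shape of such a prompt. A hedged sketch (the `<segment:...>` tag form is an assumption based on SwarmUI's prompt syntax documentation; verify it against the docs for your version):

```python
# Hedged sketch of a SwarmUI-style segment prompt (tag syntax is an assumption).
prompt = "a blue cat <segment:eyes> glowing red cat eyes"
# Text after the segment tag applies only to the automatically masked region;
# creativity/threshold parameters then control how strongly it is regenerated.
```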
01:15:59 Moreover, we can use this in image to image or in inpainting. Let's use it in the inpainting. Okay, so I click this image and I will say edit image, and in here
-
01:16:05 I will reduce the editing power to zero. You see, init image is selected. I will make
-
01:16:11 the init image creativity zero. So I will only change the eyes. Okay, let's say like
-
01:16:18 this. Why am I doing this? Because this way I can generate the eyes with different seeds, and
-
01:16:24 this should work, by the way. Let's see if it will look natural. Okay.
-
01:16:30 It is getting generated and yes, so you see it only changed the eyes according to my prompt.
-
01:16:37 This is how you can also inpaint quickly with just prompting your existing images.
-
01:16:44 This is just a mind-blowingly amazing feature.
-
01:16:46 So I can just make the seed random, generate multiple images, and pick the very best one. Let's see.
-
01:16:53 So what you need to play with is these values. You see segment threshold max.
-
01:16:58 You can also try this segment mask grow, segment mask blur and see which one is working best.
-
01:17:04 There is also a segmentation model you can select here. The segmentation model will be used when you use this.
-
01:17:10 Currently, we have no segmentation model. When you click here you can see: optionally specify a distinct
-
01:17:16 model, like a YOLO model. There is also a regional object inpainting model.
-
01:17:21 You know, there are a lot of things, and you can join the Discord channel of the Swarm
-
01:17:27 UI and ask them there. The community is amazing, the developer is amazing. We are getting different images right now.
-
01:17:34 When I refresh here, I can see the generated images. Also, the masks are saved, you see,
-
01:17:40 because we did set save segmentation mask. So these are the images. Yes, it is not perfect.
-
01:17:45 I need to play with the parameters more, but now it is working.
-
01:17:49 Okay, this one is looking perhaps the best among them. Okay, there is another one getting generated right now.
-
01:17:55 So the beauty of using edit image inpainting is that we can generate different images with different seeds.
-
01:18:03 Therefore, we can get a better image. Okay, this one is looking... yes, very cool looking.
-
01:18:08 You see, so this is how you can inpaint.
-
01:18:11 Another very crucial piece of information that I am going to show you regarding upscaling is the use of tiling.
-
01:18:18 I have asked the developer of the Swarm UI about this, and the Stable Diffusion 3 architecture is currently
-
01:18:25 incapable of upscaling images as we do with
-
01:18:30 Stable Diffusion 1 or Stable Diffusion XL. So it is not able to generate images at a higher
-
01:18:36 resolution than it was trained on. What does this mean? So you see this is a raw
-
01:18:41 image that has been generated with these parameters. The image resolution is 1344 by 768, and when I
-
01:18:51 upscale it without using tiling, what happens? Let me show you. So I'm going to upscale it to 1.5
-
01:19:00 times with 50% denoise (refiner control percentage), I will not do tiling, and I will use this
-
01:19:08 refiner upscale method. I think this is the very best upscale method, 4xLSDIRplusC. It is from
-
01:19:16 the repository that I have shown you. Let's generate this image this way, and I will
-
01:19:20 show you the effect of upscaling this image with Stable Diffusion 3 without tiling and what
-
01:19:26 the effect of it is. Okay, we've forgotten the refiner, so
-
01:19:29 enable it and generate. Don't forget to enable it. This is a common mistake. So this
-
01:19:34 was the original image and this is the upscaled image.
-
01:19:37 You see, at the corners of the image, it is now blurred. This is because Stable
-
01:19:43 Diffusion 3 is not able to generate images bigger than
-
01:19:48 its resolution, its trained resolution. You will notice this error if you don't
-
01:19:53 do tiling. And what happens when we do tiling? Let's use tiling:
-
01:19:57 we may get seams. And what do seams mean?
-
01:20:00 I don't know if we will get them in this image, but I have an image here that I
-
01:20:05 can show you, and when you look carefully you will notice there is another head
-
01:20:11 here. You see, the head is here, but there is also another head here; also, there is
-
01:20:15 another deformed head appearing here. This is called seams. This happens
-
01:20:21 when you use a refiner control percentage like 50%. So how are we going to fix this?
-
01:20:30 I will also show you that after this is completed. You see, when you do tiling, it is splitting the image
-
01:20:36 into tiles, upscaling each part individually, then merging all of the parts, and the image
-
01:20:42 is generated.
-
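The split-upscale-merge idea is easy to sketch with Pillow, using a plain Lanczos resize as a stand-in for the real upscaler model. Note that the naive version below pastes tiles edge to edge with no overlap or blending, which is exactly where seams come from; real tiled upscalers overlap and blend the tiles:

```python
# Sketch: tiled upscaling -- split, upscale each tile, merge.
from PIL import Image

def upscale_tiled(img: Image.Image, scale: int = 2, tile: int = 512) -> Image.Image:
    out = Image.new("RGB", (img.width * scale, img.height * scale))
    for y in range(0, img.height, tile):
        for x in range(0, img.width, tile):
            box = (x, y, min(x + tile, img.width), min(y + tile, img.height))
            part = img.crop(box)  # one tile fits in VRAM even when the full image doesn't
            part = part.resize((part.width * scale, part.height * scale), Image.LANCZOS)
            out.paste(part, (x * scale, y * scale))  # edge-to-edge paste: seam-prone
    return out

upscale_tiled(Image.open("raw.png").convert("RGB")).save("upscaled.png")
```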
01:20:49 Now I don't see the previous error. And in this image, what problem do we have? You see, the text is broken, like here. In this image we didn't get as much as in the other image,
-
01:20:56 but this is the error. So what you can do is reduce the
-
01:21:00 refiner control percentage to like 30% or 35%. Let's try 35% to see, and let's also try 30%. Okay. You see, you
-
01:21:09 are able to queue different generations, so you can just change the parameters and hit generate and
-
01:21:16 they will get added to the queue like this. So this is another amazing feature. Currently 2 running,
-
01:21:21 1 queued. Moreover, if you have multiple GPUs, you can go to the server, backends, and in here you can
-
01:21:28 define multiple GPUs.
-
01:21:30 You see, currently GPU id 0 is used, so what you need to do is add another
-
01:21:36 backend. So I will add another ComfyUI self-starting backend. You see, this was the original, and then I am
-
01:21:44 going to copy-paste the parameters here. Okay, there are no extra arguments; auto update, enable
-
01:21:50 previews, GPU id 1. Yes, it is looking good, and save. So it is going to start another backend
-
01:21:58 on my second GPU, which I should be able to see here. And also let's check out the CMD window.
-
01:22:05 Okay, it says that it is started. So now my second GPU should also be used.
-
01:22:10 It is already being used by the video recording and I think it will be available in the next generation.
-
01:22:17 Let's go back to the Generate. Okay all three images are generated. Let's look at them.
-
01:22:22 So the error is still visible, also visible, and visible here as well. Was it visible in the original image?
-
01:22:29 That also matters. So we need to check that too. Let's go to the top and refresh. And
-
01:22:35 let's see. Okay, I think, yeah, this was the original image. So it was also visible in the original image.
-
01:22:41 So it is not something new. But you can see that this is how the images are generated, how
-
01:22:46 much effect the denoise is making. You see, I think this one is looking the best. And this one
-
01:22:52 has, let's see the parameters. Okay, this one has refiner control
-
01:22:59 percentage 30%. So this was the lowest value of the Automatic1111 denoise strength
-
01:23:07 equivalent, which is refiner control percentage. This way you can experiment. Since we
-
01:23:12 started another instance, now let's see if it will run both of them. Okay, let's generate
-
01:23:18 this at 25% and let's generate at 40%. So we have two current generations and two running. That
-
01:23:25 means it will run on both of the GPUs. Okay. No, it didn't start yet.
-
01:23:31 I think it is loading onto the second GPU right now. Yeah, probably. Let's just see what happens.
-
01:23:37 Maybe we need to entirely restart the Stable Swarm UI.
-
01:23:40 I am not sure, but this is the way of having multiple GPUs.
-
01:23:45 By the way, it will not combine the power of two GPUs.
-
01:23:49 It will just queue each generation on each GPU. So let's see the view logs.
-
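Conceptually, each backend is an independent worker pulling jobs from one shared queue, so total throughput adds up but a single image never runs faster. A minimal sketch of that idea (not SwarmUI's real scheduler):

```python
# Sketch: two backends (GPUs) draining one shared generation queue.
import queue
import threading

jobs: "queue.Queue[str]" = queue.Queue()
for prompt in ["image 1", "image 2", "image 3", "image 4"]:
    jobs.put(prompt)

def worker(gpu_id: int) -> None:
    while True:
        try:
            prompt = jobs.get_nowait()
        except queue.Empty:
            return
        # Each backend generates independently; one image is never split across GPUs.
        print(f"GPU {gpu_id} generating: {prompt}")
        jobs.task_done()

threads = [threading.Thread(target=worker, args=(g,)) for g in (0, 1)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```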
01:23:56 Okay, I guess the logs of this are not ready. Okay, it says self-starting backend one is running. Yes, now
-
01:24:03 I can see, so it started; these are the second
-
01:24:05 GPU logs. Let's go to the backend. Okay, this one is generated. Okay, let's make another try to see,
-
01:24:12 for example, this image. Let's reuse parameters. It is using the tiling and 30 percent. Let's make this
-
01:24:20 35 percent, generate, and let's make it 40 percent and generate. Now 4 generations are running,
-
01:24:28 14 queued, because it is generating 20 images. And let's go to the server backend. Okay, let's update it. Yes,
-
01:24:36 now it says loading one new model, loading in low VRAM mode. Why? Because my second GPU has
-
01:24:42 lower VRAM. And now,
-
01:24:43 yes, I can see that. So it queued the operations on both of the GPUs, and
-
01:24:50 it is generating on both of them. I am going to cancel because I don't want my video
-
01:24:55 frames to be dropped, because I am doing the video recording on the second GPU. Okay, let's just
-
01:25:00 cancel all the operations. Yes, all the operations are canceled. This is how you use multiple GPUs
-
01:25:05 as well. Another very important thing is color saturation. You see this color is extremely
-
01:25:13 saturated. So when you get such saturated colors, what can you do? What you can do is reduce the CFG
-
01:25:21 scale. Okay, so I clicked reuse parameters from here, and
-
01:25:25 I am going to reduce the CFG scale to 5, and let's see the difference.
-
01:25:30 Okay, let's hit generate. These are all Stable Diffusion 3 experiments that I am doing
-
01:25:35 right now. So this is for Stable Diffusion 3; it depends on each model. Moreover, when you reduce the
-
01:25:41 CFG scale, its prompt following will also be reduced. So a higher CFG scale is better for following the prompt,
-
01:25:51 but this is the trade-off we have. Okay, it is also using my second GPU right
-
01:25:55 now since I did set dual GPUs from the server
-
01:25:59 configuration, from the backend. So it is generating two images at the same time right now on my
-
01:26:06 both of the GPUs. Okay, the first image is generated. Let's refresh. You see the color
-
01:26:12 saturation is much reduced. However, the prompt following became much worse, so
-
01:26:18 there is a trade-off between the two, but as you generate more images, you can still get very
-
01:26:24 good images. You see, the other image with the different seed looks much better.
-
01:26:30 I'm just waiting for it. So you can reduce the CFG scale, generate multiple images, and still get very
-
01:26:37 accurate prompt-following images, because Stable Diffusion 3 has very powerful text encoders: Clip + T5. Okay,
-
01:26:49 the second image is also generated, and this is the image. Not the very best image
-
01:26:54 according to the prompt, but you understand the logic. Let's cancel this. So what are the
-
01:27:00 best settings for Stable Diffusion 3 that I have found? I think CFG scale 7 is working fine,
-
01:27:05 but in some cases, it may oversaturate the image. I'm using 40 steps, and for sampling, I am using
-
01:27:13 UniPC, scheduler normal, text encoders Clip + T5, and for refiner control percentage, I use 30 percent. I
-
01:27:21 use refiner steps 40. I use refiner method post apply, refiner upscale 1.5, and tiling,
-
01:27:30 and as an upscaler model, I am using this one.
-
01:27:34 But you can still compare all of them to see which one you like most.
-
01:27:39 Now I will show you another amazing quick feature, which is applying an upscale preset to your existing images.
-
01:27:48 So let me show you my preset. This is my upscale preset.
-
01:27:51 It also has the initial image generation resolution. You can disable that, but it is enabling the refiner model upscale.
-
01:28:00 So since I have used the same generation resolution, what I'm going to do
-
01:28:06 is like this. I go to the image history, and these are my raw images. And let's
-
01:28:11 say I liked this image, and I want to upscale it. So I go to the presets and
-
01:28:15 select my preset and I just hit generate, and it will regenerate this image and
-
01:28:21 upscale it. So this is another very convenient way of using presets to upscale
-
01:28:27 your images quickly, the liked ones. You don't need
-
01:28:30 to upscale everything. You can select the ones you liked and then upscale them this way very
-
01:28:35 easily. So we have shown a lot of stuff. There are also a LoRA extractor and pickle to safetensors. You see,
-
01:28:41 you can convert models, and Clip tokenization. This is also very useful, because some people use
-
01:28:47 tokens they think are rare when doing training, like "my awesome model"; they type it like this. However, this is
-
01:28:55 actually a lot of tokens. So when you type it here, you
-
01:28:59 will see that this is actually three tokens: my, awesome, model. Or they are using some random
-
01:29:04 stuff like this. You see, each one is split into different tokens.
-
01:29:09 That is why we are using a rare token such as OHWX.
-
01:29:13 This is a rare token, ID 48993, and it's a single token.
-
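You can reproduce this check outside SwarmUI's Clip tokenization utility. A small sketch with Hugging Face transformers (same idea; the exact ids shown in the video come from CLIP's vocabulary):

```python
# Sketch: inspect CLIP tokenization to find rare single-token identifiers.
from transformers import CLIPTokenizer

tok = CLIPTokenizer.from_pretrained("openai/clip-vit-large-patch14")

print(tok.tokenize("my awesome model"))  # splits into three tokens
print(tok.tokenize("ohwx"))              # "ohwx" stays a single rare token
print(tok("ohwx")["input_ids"])          # its id, wrapped in start/end markers
```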
01:29:20 You see, the letter "1" is this token, 272, and the letter "a" is 320.
-
01:29:26 As you go bigger in the numbers, the token will usually be
-
01:29:30 rarer. That is expected. Okay, I think we have shown everything. Okay, this is it
-
01:29:37 for today. In the Stable Swarm UI Github repository, you will find their
-
01:29:42 Discord link. Just type Discord and this is their Discord link. You can join and
-
01:29:48 chat with them. You see this is their Discord. I am also active in their
-
01:29:52 Discord. You should also join our Discord channel as well. You will see the Discord
-
01:29:56 channel here in this post. The link of this post will be
-
01:30:00 in the description of the video. You can also go to our Patreon exclusive index and read
-
01:30:05 our Patreon posts, our scripts. Please also star our repository. So go to our repository
-
01:30:12 from this link, star it, fork it, and watch it. And you can also sponsor me. Hopefully,
-
01:30:18 see you in future amazing stories. Ask me any question that you have and I will try to answer
-
01:30:23 them. This Web UI is just amazing. Hopefully, I will also look at these other
-
01:30:30 features it has and I will make further tutorials for this amazing Web UI. You see, there is
-
01:30:36 also even regional prompting and other stuff. But for today, this is it. Hopefully, see you later.
