Generate Studio Quality Realistic Photos By Kohya LoRA Stable Diffusion Training Full Tutorial
Full tutorial link > https://www.youtube.com/watch?v=TpuDOsuKIBo
#Kohya SS web GUI DreamBooth #LoRA training full tutorial. You don't need technical knowledge to follow this tutorial. In this tutorial I explain how to generate professional photo-studio-quality portrait / self images for free with Stable Diffusion training.
Our Discord server
https://bit.ly/SECoursesDiscord
If I have been of assistance to you and you would like to show your support for my work, please consider becoming a patron 🥰
https://www.patreon.com/SECourses
Technology & Science: News, Tips, Tutorials, Tricks, Best Applications, Guides, Reviews
https://www.youtube.com/playlist?list=PL_pbwdIyffsnkay6X91BWb9rrfLATUMr3
Playlist of #StableDiffusion Tutorials, Automatic1111 and Google Colab Guides, DreamBooth, Textual Inversion / Embedding, LoRA, AI Upscaling, Pix2Pix, Img2Img
https://www.youtube.com/playlist?list=PL_pbwdIyffsmclLl0O144nQRnezKlNdx3
Gist file used in tutorial
How to install Python and Git tutorial
Master DreamBooth tutorial to learn rare tokens, instance prompt, class prompt and such
How to fix distant faces with inpainting
How to install and use Automatic1111 Web UI for Stable Diffusion
1 : https://youtu.be/AZg6vzWHOTA
2 : https://youtu.be/AZg6vzWHOTA
How many classification images perform best for DreamBooth training
How LoRA training actually works tutorial
Watch this tutorial to understand how tokens actually work
00:00:00 Introduction to Kohya LoRA Training and Studio Quality Realistic AI Photo Generation
00:02:40 How to download and install Kohya’s GUI to do Stable Diffusion training
00:05:04 How to install newer cuDNN dll files to increase training speed
00:06:43 How to upgrade a previously installed Kohya GUI to the latest version
00:07:02 How to start Kohya GUI via cmd
00:08:00 How to set DreamBooth LoRA training parameters correctly
00:08:10 How to use previously downloaded models to do Kohya LoRA training
00:08:35 How to download Realistic Vision V2 model
00:08:49 How to do training with Stable Diffusion 2.1 512px and 768px versions
00:09:44 Instance / activation and class prompt settings
00:10:18 What kind of training dataset you should use
00:11:46 Explanation of number of repeats in Kohya DreamBooth LoRA training
00:13:34 How to set best VAE file for better image generation quality
00:13:52 How to generate classification / regularization images via Automatic1111 Web UI
00:16:53 How to prepare captions for images and when you need image captions
00:17:48 What kind of regularization images I have used
00:18:04 How to set training folders
00:18:57 Best LoRA training settings for GPUs with a minimal amount of VRAM
00:21:47 How to save state of training and continue later
00:22:44 How to save and load Kohya Training settings
00:23:31 How to calculate the step count of 1 epoch when the repeat count is considered
00:24:41 How to decide how many epochs to train when the repeat count is considered
00:26:00 Explanation of command line parameters displayed during training
00:28:19 Caption extension changing
00:29:24 When we will get a checkpoint and where checkpoints will be saved
00:29:57 How to use generated LoRA safetensors files in SD Automatic1111 Web UI
00:30:45 How to activate LoRA in Stable Diffusion web UI
00:31:30 How to do x/y/z checkpoint comparison of LoRA checkpoints to find best model
00:33:29 How to improve face quality of generated images with high res fix
00:36:00 18 different training parameter experiments I have made and their results comparison
00:36:42 How to test 18 different LoRA checkpoints with x/y/z plot
00:39:18 How to properly set number of epochs and save checkpoints when reducing repeating count
00:40:36 How to use checkpoints of Kohya DyLora, LoCon, LyCORIS/LoCon, LoHa in Automatic1111 Web UI
00:42:12 How to install Torch 1.13 instead of 1.12 and newer xFormers compatible with this version
00:43:06 How to make Kohya scripts use your second GPU instead of your primary GPU
DreamBooth LoRA training is a method for fine-tuning text-to-image diffusion models such as Stable Diffusion so that they learn a new subject or style from a handful of example images. It is a combination of two techniques: DreamBooth and LoRA.
DreamBooth fine-tunes the model on a few images of a subject, binding that subject to a rare token (such as ohwx) used together with a class prompt (such as man). Regularization / classification images of the generic class are trained alongside the subject images, so the model keeps its prior knowledge of the class instead of overfitting to the subject.
LoRA (Low-Rank Adaptation) makes this fine-tuning cheap: instead of updating all of the model's weights, it freezes the base model and trains small low-rank matrices that are injected into the attention layers. The result is a small safetensors file (megabytes instead of gigabytes) that is applied on top of the base model at generation time.
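In math form (following the LoRA paper; the network alpha and network rank settings that appear later in this tutorial correspond to $\alpha$ and $r$ here), a frozen weight matrix $W$ is adapted as:

$$W' = W + \frac{\alpha}{r} B A, \qquad B \in \mathbb{R}^{d \times r},\quad A \in \mathbb{R}^{r \times k},\quad r \ll \min(d, k)$$

Only $A$ and $B$ are trained, which is also why the safetensors file size grows with the network rank $r$.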
00:00:00 Hello everyone.
00:00:01 I am thrilled to be here today to share with you an exciting tutorial on generating your
00:00:05 own high-quality photos.
00:00:07 The purpose of this tutorial is to provide you with an easy and accessible way to generate
00:00:12 photos that look like they were taken in a studio.
00:00:15 Best of all, it is completely free and open-source.
00:00:17 We all know that professional photo studios can be expensive, and sometimes the results
00:00:22 may not be what we expected.
00:00:24 By following this tutorial, you will no longer need to spend a fortune on studios.
00:00:29 Moreover, you don't need any technical knowledge to follow along, as I will guide you through
00:00:33 every step of the process.
00:00:35 We will be using Realistic Vision V2 as our model and the Kohya web graphical user interface
00:00:41 for training.
00:00:42 Specifically, we will be using Kohya LoRA to train our model.
00:00:45 For photo generation, we will use the Automatic1111 web UI.
00:00:49 But don't worry if you are not familiar with these terms.
00:00:52 I will provide clear explanations of each step and guide you through the process.
00:00:57 By following the correct parameters for LoRA training that I will show you, you will be
00:01:01 able to compose ultra-high-quality realistic photos of yourself, just like the ones you
00:01:06 are seeing on the screen right now.
00:01:08 But that is not all.
00:01:10 This tutorial is not just limited to realistic photo generation.
00:01:13 You can also use the technique that I will show you to generate stylized images and to train
00:01:17 art styles, animals, objects, and much more.
00:01:20 To make it easier for you to navigate this tutorial, I will add YouTube chapters so you
00:01:25 can jump to the specific sections that interest you the most.
00:01:29 Additionally, I will provide subtitles in most languages to make it accessible to as
00:01:33 many people as possible.
00:01:35 Lastly, I want to let you know that I have shared the prompts I used to generate these
00:01:40 images in a Gist file, which I will introduce in the following parts of the video.
00:01:45 Please check the description of the video for all of the links.
00:01:48 Thank you for joining me today, and let's get started on making some amazing photos.
00:01:53 To be able to follow this tutorial, you need to install three packages first.
00:01:58 The first one that you need to install is the Microsoft Visual C++ Redistributable.
00:02:03 The link is here.
00:02:05 I have also posted the links in a Gist file, and the link to this Gist file will be in
00:02:11 the description of the video.
00:02:12 The second thing that you need to install is Python 3.10.9.
00:02:17 The download link for Python is at the bottom of the Gist file as well.
00:02:22 And the final thing is Git for Windows.
00:02:25 If you don't know how to install and use Python and Git, I have an excellent tutorial
00:02:30 on my channel on how to install Python, set up virtual environments, and more.
00:02:35 Just watch that video and you will be able to follow this tutorial without any issues.
00:02:40 We will begin by cloning the GitHub repository onto our hard drive.
00:02:44 So copy the URL of the GitHub repository.
00:02:47 Enter the folder where you want to clone it.
00:02:50 I will clone it into my F drive.
00:02:52 Type cmd.
00:02:53 Start your cmd inside that folder.
00:02:56 Type git clone.
00:02:57 Paste the URL, and it will get cloned onto your drive like this.
00:03:01 Then enter the cloned folder.
00:03:03 Then double-click the setup.bat file.
00:03:07 It is here.
00:03:08 If you are not able to see the file extensions, go to the view menu and click file name extensions
00:03:14 here.
00:03:15 Double-click the setup.bat file.
00:03:17 It will start the installation process within a new cmd window.
00:03:22 It will ask you whether to use Torch version 2 or Torch version 1.12.
00:03:27 I have tested both of them on my RTX 3090 graphics card, and I found that Torch version
00:03:34 1.12 works better.
00:03:37 So I will go with option one.
00:03:39 Then choose v1 like this.
00:03:41 The installation process may take a while.
00:03:44 Just be patient.
00:03:45 It depends on your hard drive speed and also your internet connection speed.
00:03:50 After a while it will ask you which compute environment you are running in.
00:03:55 Select the option this machine.
00:03:56 Then it will ask you these options.
00:03:59 Select no distributed training.
00:04:01 Then it will ask you this one:
00:04:03 do you want to run your training on CPU only?
00:04:06 If you have a GPU card with over 8 GB of VRAM, then select no.
00:04:12 This is important.
00:04:14 It will also ask whether you wish to optimize your script with Torch Dynamo.
00:04:18 Select no.
00:04:19 Then it will ask whether you want to use DeepSpeed.
00:04:21 Select no.
00:04:22 Then it will ask you which GPUs, by ID, should be used for training.
00:04:27 Type all.
00:04:28 Hit enter.
00:04:29 Okay, this is important.
00:04:31 Not all cards support BF16.
00:04:34 If you have lower VRAM, then you should pick FP16 or BF16, depending on your card.
00:04:41 So you can pick FP16 to be sure.
00:04:43 I have over 8 GB of VRAM; therefore, I am selecting no.
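If you are unsure whether your own card supports BF16, you can check from Python before answering. A minimal sketch, assuming a CUDA build of PyTorch such as the one the setup just installed:

```python
import torch

# Ampere (RTX 30xx) and newer GPUs generally report True;
# on older cards, stick with FP16 or full precision.
print(torch.cuda.is_bf16_supported())
```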
00:04:48 And the installation is completed.
00:04:50 After you hit that button, it will close the cmd window.
00:04:53 By the way, this PDF file is posted on our Patreon page.
00:04:58 So if you also want to access that, you can become a Patreon supporter.
00:05:01 I would appreciate that very much.
00:05:03 Okay, as the next step, we are going to update our cuDNN files.
00:05:08 Why?
00:05:09 Because on the Kohya SS GitHub repository, they mention that the cuDNN 8.6 dll files will
00:05:16 significantly increase your training speed on newer cards.
00:05:21 And how are you going to do it?
00:05:22 There is this link to download them.
00:05:25 Click it and download the files.
00:05:26 I also put this link in the Gist file.
00:05:30 If you are using Torch version 2, you don't need to install this, because it is only
00:05:34 necessary for Torch version 1.12.
00:05:37 Once the file is downloaded, cut it.
00:05:40 Paste it into your main installation folder.
00:05:42 Right click, extract here.
00:05:44 It will be extracted as cudnn windows, like this.
00:05:47 Then enter the venv folder.
00:05:50 Enter the scripts folder.
00:05:52 Open cmd.
00:05:53 You see, currently I am inside F:\kohya_ss\venv\Scripts.
00:05:57 Type activate.
00:05:59 Don't forget this.
00:06:00 Once you have typed activate,
00:06:01 the virtual environment will be activated like this.
00:06:05 Then copy this command from the Gist file.
00:06:08 Right click, copy.
00:06:10 Paste it here like this while the virtual environment is activated.
00:06:13 Okay, it says it can't find the py file because our path is different.
00:06:20 This py file is inside tools, and in here you will see cudnn_install.py.
00:06:26 While pressing the left shift key, right click and click copy as path.
00:06:30 Then type python.
00:06:33 Right click, paste the file path, and hit enter, and it will install the necessary cuDNN
00:06:39 files.
00:06:40 So you see, these are the file versions right now.
00:06:42 Let's say the script got updated and you want to upgrade it.
00:06:46 All you need to do is go to your installation folder, and in here you will see the upgrade.bat
00:06:52 file.
00:06:53 Double-click it and it will upgrade to the latest version like this.
00:06:56 Now we are ready to use Kohya SS on Windows.
00:07:00 All we need to do is use this command prompt.
00:07:03 Now I will explain to you what these are doing.
00:07:06 The first one is the IP that it will be started on.
00:07:10 The second one is the port that it will be started on, and inbrowser will start it within the
00:07:16 browser.
00:07:17 There is also --share, but you don't need this if you don't want to share your started Kohya
00:07:22 SS GUI.
00:07:23 So now we will just copy this command.
00:07:26 Open a new cmd window inside our installation folder.
00:07:30 Paste it and hit enter, and it will display the values and then automatically
00:07:34 open the Kohya SS GUI like this.
00:07:37 Now we can start doing our LoRA training to obtain awesome, fantastic studio-quality images
00:07:45 of ourselves, or in other styles if we wish.
00:07:48 It is totally up to you.
00:07:50 There are DreamBooth training, DreamBooth LoRA training, and DreamBooth textual inversion
00:07:54 training.
00:07:55 In this tutorial I will only focus on DreamBooth LoRA training.
00:07:58 So this is the main interface of LoRA training.
00:08:01 It allows you to quickly select the models that you want to use.
00:08:05 It will download the model if you choose it like this.
00:08:08 However, if you have previously downloaded models, then you can pick one by using this icon
00:08:14 here.
00:08:15 Click here to select your model.
00:08:16 The model is inside my 0 models folder.
00:08:19 I will just pick it from here.
00:08:22 Realistic Vision version 2.0, the version 2.0 safetensors file, and you see now the path is like this
00:08:28 and the model quick pick is custom.
00:08:30 Save trained model as safetensors.
00:08:33 You can download this model from here.
00:08:35 The direct link to download the model is also available in the Gist file.
00:08:40 Just right click it, and it will automatically start downloading the safetensors file for
00:08:46 you like this.
00:08:47 Okay, after we have selected our model, we will set up our training folders.
00:08:52 By the way: if you use a version 2 model like the SD 2.1 version, you need to pick this, and
00:08:58 if the model is 768 pixels, then you need to check this mark as well.
00:09:05 Don't forget that.
00:09:06 So let's say you have selected Stability AI Stable Diffusion 2.1.
00:09:11 Then you need to check version 2 and also v parameterization.
00:09:15 If you use Stable Diffusion 2.1 base only, then you don't need to check v parameterization.
00:09:21 You can also see that it is automatically checking or unchecking this.
00:09:25 So this is how you can do training on the 2.1 version.
00:09:30 However, I will be doing the training on this model, and I need to uncheck these two because
00:09:35 it is an SD 1.5 version model.
00:09:39 Then go to the tools tab, and here we will set up our training folders.
00:09:43 So we will begin with the instance prompt.
00:09:46 You may wonder what an instance prompt is.
00:09:48 This is our activation prompt for generating images.
00:09:53 This will be a rare token.
00:09:55 Therefore, I will use ohwx.
00:09:58 If you want to learn more about rare tokens and all of this stuff,
00:10:02 I have a zero-to-hero Stable Diffusion DreamBooth tutorial.
00:10:05 You can watch it and learn much more about rare tokens, how to use them, what
00:10:11 they do, and what they are for.
00:10:13 And then the class prompt.
00:10:14 Since I am going to train myself, the class prompt will be man.
00:10:18 Okay, training images.
00:10:19 The training images are the images of yourself that you want to train on.
00:10:24 Click the folder icon.
00:10:25 Select the folder where your own images are located.
00:10:30 So here, this is my training folder.
00:10:32 Now I will show you its contents.
00:10:34 So this is my Kohya training folder.
00:10:37 I have 14 images.
00:10:39 The important thing is that you shouldn't have repeating backgrounds or clothing.
00:10:44 I have some repeating backgrounds and clothing in these images, but it is fine.
00:10:48 The model is still able to learn very well.
00:10:51 All other images have, you see, different backgrounds and different clothing.
00:10:55 This is really important.
00:10:57 The only repeating pattern should be the thing that you want to teach, and in this case it
00:11:02 is my face.
00:11:03 You can also include some distance shots.
00:11:05 That way, the model will be able to learn that distance and the pose.
00:11:10 However, if it is too distant a shot, like this one, then the face will not come out very well.
00:11:16 In that case, we can use inpainting to fix our faces after the initial image generation.
00:11:22 The model will learn whatever poses you have in your training dataset.
00:11:27 So include the poses that you want to teach the model.
00:11:31 Also, having too many images in the training dataset is not good.
00:11:35 You should have high-quality images, and there don't have to be many of them.
00:11:40 So this training dataset works very well.
00:11:43 It is not the best dataset, because of the quality of the images.
00:11:47 Okay, now the number of repeats.
00:11:49 This is really hard to understand at the beginning.
00:11:52 Normally, one epoch means that each one of the images is trained one time.
00:11:58 However, Kohya SS changed how one epoch works.
00:12:02 So when you set 40 repeats, it will train each training image 40 times and count
00:12:09 that as one epoch.
00:12:11 So why is this useful?
00:12:13 Let's say you are going to train multiple instances with different instance prompts
00:12:18 and the data you have is unbalanced.
00:12:20 For example, person A has 50 images and person B has 25 images.
00:12:26 Then with the repeat count, you can balance the image training.
00:12:30 Someone has already asked this question in the Kohya SS development GitHub repository,
00:12:38 and the answer of the author, Kohya SS, is like this. It is not very clear either.
00:12:43 However, you can read it and try to understand what he means.
00:12:47 Moreover, according to my experimentation, this is also equal to the number of classification
00:12:53 images per instance image in DreamBooth training.
00:12:55 So when you set this to 40, it will use 40 different classification / regularization images per
00:13:01 training image.
00:13:02 Then we need regularization images.
00:13:05 You need to generate your regularization images yourself.
00:13:08 To generate our regularization images, we are going to use the Automatic1111 web UI.
00:13:13 I already have two excellent tutorials on how to install and use the Automatic1111 web UI
00:13:19 on PC.
00:13:20 You can watch both of these awesome tutorials to learn how to install and use the Automatic1111
00:13:26 web UI on your computer.
00:13:28 My Automatic1111 web UI instance has started.
00:13:32 For generating classification images, I suggest you change your VAE file: go to
00:13:37 the settings, go to the Stable Diffusion tab, and select your SD VAE from here.
00:13:42 Default VAE.
00:13:43 You can download this VAE file from this link, which is available in the Gist file.
00:13:48 After downloading the VAE file, put it inside the models/VAE folder, restart your web UI,
00:13:53 and you will be able to select it from here.
00:13:56 Okay, how are we going to generate our regularization images?
00:14:00 You will just use the class prompt to generate them.
00:14:03 Since I am teaching myself as a man, I am using man as the class prompt.
00:14:07 Let's say you want to train your daughter; then you should use girl as the class prompt.
00:14:11 If you are going to train your wife, then you can use woman as the class prompt.
00:14:17 If you are going to train your son, then you can use boy as the class prompt.
00:14:22 If you are going to train other things, for example a drawing style, then
00:14:26 you can use the word aesthetic.
00:14:28 If you are going to train a tank image, then you can use tank.
00:14:32 In the Stable Diffusion checkpoint, select the model that you are going to train on.
00:14:37 Type your class prompt.
00:14:38 You don't need to change anything else.
00:14:41 I am also not using any negative prompt.
00:14:43 Set the batch count and batch size like this and hit generate.
00:14:48 The number of classification / regularization images that you need depends entirely on the number
00:14:52 of training images you have.
00:14:55 It should be at least the number of repeats multiplied by the number of training images
00:14:59 you have.
00:15:00 In my case, it is 14 multiplied by 40, which is 560.
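As a worked example of that minimum, with the numbers used in this video:

```python
num_train_images = 14  # images of myself in the training folder
num_repeats = 40       # the repeats value set in the Kohya Tools tab

# Minimum number of regularization / classification images to generate:
min_reg_images = num_train_images * num_repeats
print(min_reg_images)  # 560
```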
00:15:07 After you have generated the images, click this icon and it will show you the folder where
00:15:12 the images are generated.
00:15:14 Enter the latest folder.
00:15:16 In here, just select the generated images and put them into a folder.
00:15:20 So currently my regularization images are located on my F drive, here in the man realistic
00:15:27 vision version 2 folder, which has 2071 images.
00:15:30 Like this.
00:15:31 The model was pretty biased.
00:15:33 Therefore, I have cleaned the generated images and posted them on our Patreon.
00:15:39 So if you are a Patreon supporter, you can directly download the generated classification images
00:15:43 and use them.
00:15:44 So this is the number of repeats per classification image that you need.
00:15:49 I am leaving it as 1, and here you select the destination training directory.
00:15:54 This is where your trained model files will be located.
00:15:57 Also, it will copy the training images and the regularization images there with the correct
00:16:03 naming.
00:16:04 To select it, click this folder icon.
00:16:06 I will make a new folder on my F drive named kohya training video, like this.
00:16:12 I will select the folder, and you see it is selected like this.
00:16:16 Then click prepare training data.
00:16:18 It will generate 4 folders, like this.
00:16:21 In the image folder you will see your training images, named like this: 40_ohwx man.
00:16:28 So how does this naming happen?
00:16:31 The first part is the number of repeats.
00:16:33 The script will read this and decide how many times it will repeat based on it; then it
00:16:40 will write the merged instance prompt and class prompt, like this.
00:16:44 So it will use ohwx man as the training prompt.
00:16:48 If you change the folder name, it will use the new folder name that you have set.
00:16:53 It depends entirely on the folder name.
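To make the convention concrete, here is how a folder name such as 40_ohwx man decomposes; the parsing below is only an illustration of the rule, not Kohya's actual code:

```python
folder_name = "40_ohwx man"

# The part before the first underscore is the repeat count;
# the rest becomes the training prompt (instance prompt + class prompt).
repeats_str, prompt = folder_name.split("_", 1)
print(int(repeats_str))  # 40
print(prompt)            # ohwx man
```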
00:16:56 Moreover, from the utilities tab, you can also add captions to your images.
00:17:00 However, for training a person, I don't suggest using captions.
00:17:05 It supports a lot of captioning methods, like basic captioning, BLIP captioning, GIT captioning,
00:17:10 and others.
00:17:12 Captioning is necessary when you do fine-tuning, which means you have a lot of images and
00:17:17 different concepts, and you caption each image to train all of them at once into
00:17:23 your LoRA model.
00:17:25 I may make another video on how to do fine-tuning, but in this video you don't need captioning
00:17:30 to train yourself or to train a style.
00:17:33 So when you train a single thing, you usually don't need captioning.
00:17:37 In the log folder, the logs of the training will appear.
00:17:41 In the model folder, the saved checkpoints of the model will appear.
00:17:45 In the reg folder you will see the regularization images.
00:17:49 So it is 1.
00:17:50 It means one repeat, and inside man the regularization images are there.
00:17:55 The naming of the images is not important.
00:17:58 If you use captions, then the caption txt files have to have the same names as the images.
00:18:04 Then click the copy info to folders tab.
00:18:08 It will copy the folders here, like this.
00:18:10 You see the image folder and output folder.
00:18:13 You can also manually select each folder by clicking here and selecting folders.
00:18:19 For example, let's select the image folder from here.
00:18:22 Click select folder, and the folder is selected.
00:18:25 So all folders are set.
00:18:26 Here we select the model output name.
00:18:30 I will name this test one, so it will be the name of our generated LoRA files.
00:18:37 You can also add a training comment as metadata to your model files, like: ohwx is the instance
00:18:44 prompt and man is the class prompt.
00:18:48 So use ohwx man to be able to use this model.
00:18:54 Then, finally, the training parameters.
00:18:56 I have done 18 tests, and I will show you the best parameters I have obtained.
00:19:02 So everything is at its default right now.
00:19:05 I have tested with fp16.
00:19:07 However, if you have a newer card, you can also use bf16, but fp16 works on all
00:19:14 cards.
00:19:15 You can set the number of CPU threads.
00:19:17 I have used cache latents.
00:19:19 It will increase your VRAM usage, so be careful with that.
00:19:22 You can set a seed like this.
00:19:24 This is only necessary if you do multiple trainings and you want to compare them.
00:19:28 The learning rate and learning rate scheduler I didn't change, and I didn't change the learning rate warm-up
00:19:33 steps.
00:19:34 I didn't change the optimizer.
00:19:35 You can also use other optimizers, but some of them will increase your VRAM usage, so you
00:19:40 may get an out-of-memory error if you have a graphics card with 8 gigabytes of VRAM.
00:19:45 I didn't use any extra arguments, and I didn't change the text encoder learning rate or UNET learning
00:19:50 rate.
00:19:51 The best network rank that I have found is 128.
00:19:55 Now, this will significantly increase the space it takes.
00:19:59 Let me show you.
00:20:00 So when you make the rank 8, it will take only 9 megabytes of hard drive space.
00:20:06 When it is 16, it will take 18 megabytes.
00:20:09 When it is 32, it will take 36 megabytes, and with the best settings it takes 148 megabytes,
00:20:17 as you are seeing.
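Going by the sizes quoted above, the safetensors file size grows roughly linearly with the network rank, at around 1.15 megabytes per rank unit. This is a rough rule of thumb fitted to those numbers, not an exact formula:

```python
# Approximation fitted to the sizes quoted above:
# rank 8 -> ~9 MB, 16 -> ~18 MB, 32 -> ~36 MB, 128 -> ~148 MB.
def approx_lora_size_mb(network_rank: int, mb_per_rank: float = 1.15) -> float:
    return network_rank * mb_per_rank

for rank in (8, 16, 32, 64, 128):
    print(rank, round(approx_lora_size_mb(rank)))
```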
00:20:18 So network rank dimension 128 works best.
00:20:21 Network alpha 1 works best.
00:20:24 I will also show you the comparison of different settings.
00:20:27 Enable buckets.
00:20:28 Max resolution: this is important.
00:20:31 If you want to train your model at 512 by 512, don't change this.
00:20:36 However, you can also use higher-resolution training, like 768 by 768.
00:20:42 This will increase your VRAM usage.
00:20:46 Also, if you use higher-resolution training images, they will get downscaled to the max resolution.
00:20:53 So if my training images were 1024 by 1024, they would be automatically downscaled to 512 by
00:21:01 512.
00:21:02 Stop text encoder training: 0.
00:21:03 I am doing text encoder training up to the end.
00:21:06 You see, there are just so many parameters that you can test.
00:21:10 Therefore, in the beginning just stick to these parameters, and later you may experiment with them
00:21:16 to see what kind of effect they have.
00:21:18 Then there is the advanced configuration.
00:21:20 I didn't play with these.
00:21:22 You see, there are just so many parameters, and they are for experienced users.
00:21:27 However, if you get an out-of-memory error, you can check gradient checkpointing and memory
00:21:32 efficient attention.
00:21:33 These two will reduce your VRAM usage.
00:21:36 Also, you can uncheck use xFormers to get better quality.
00:21:40 However, I made these experiments for graphics cards with minimal VRAM, so I used xFormers.
00:21:47 And let's say you want to continue your training later.
00:21:51 To be able to do that, you need to check save training state.
00:21:54 When you save the training state, it will generate training state files at every saving epoch.
00:22:00 Each saved state will take 4 gigabytes of hard drive space on your computer, so be careful with
00:22:05 that. Then, by selecting it from here like this, select folder, you can continue training from
00:22:12 the last saved training state.
00:22:14 However, you usually don't need that, and you can also set max train epoch when you continue.
00:22:20 In the sample images config, you can also generate sample images.
00:22:24 They will be saved every n epochs or n steps, and you can type the sample prompt
00:22:29 here.
00:22:30 I didn't use any sampling, because I used an x/y/z plot comparison after the training.
00:22:36 I will show you how to do that, and I prefer it.
00:22:39 This also increases VRAM usage during training.
00:22:43 Once you are set with all the parameters, go to the configuration file here.
00:22:47 Click save as and select the folder where you want to save.
00:22:51 I will save it inside kohya training video.
00:22:54 Give it a file name like test 1, click save, and it will generate a test 1 json file like this.
00:23:01 At a later time you can load your settings.
00:23:03 Let me demonstrate.
00:23:04 I will refresh this page.
00:23:06 Go to DreamBooth LoRA, go to configuration, click open, select the test 1 json, and it
00:23:11 will automatically load every one of the settings we have made, as you are seeing right now.
00:23:17 So these were the settings we set.
00:23:19 The only thing I have changed is the network rank dimension, 128, and nothing else, and we
00:23:26 also need to set the number of epochs and save every n epochs.
00:23:31 This is also extremely important and a little bit hard to do.
00:23:36 So we set the number of repeats to 40, and I have 14 images.
00:23:41 The Kohya script works a little bit differently when calculating one epoch.
00:23:46 So with 14 images and 40 repeats, one epoch will be equal to 560 steps.
00:23:51 Now, this is a really high number when you are teaching your face.
00:23:56 Let's say I had 20 training images; then it would become 800 steps for one epoch.
00:24:02 Therefore, I suggest you reduce the number of repeats if you have too many training images,
00:24:07 but you also shouldn't have too many training images to teach yourself.
00:24:11 Since one epoch will equal 560 steps, I will save every one epoch.
00:24:17 Moreover, since we are using classification images, it will halve the number of epochs
00:24:22 we write here.
00:24:23 Therefore, when we set the epoch count to 2 and save every n epochs to 1, it will only make a single
00:24:31 checkpoint, after 14 multiplied by 40 multiplied by 2 steps, which is equal to 1120
00:24:40 steps.
00:24:41 So this will be equal to 40 epochs of DreamBooth training in the SD web UI DreamBooth extension.
00:24:48 Therefore, I will train up to 12 epochs, which is actually up to 6 epochs because it is halved
00:24:55 since we are using classification images, and I will save every 1 epoch.
00:24:59 Just click save; it will be saved inside this json file.
00:25:03 This is how you decide the number of epochs based on the number of training images.
00:25:08 You can reduce this number of repeats and increase the total number of epochs or the save
00:25:14 every n epochs value.
00:25:16 For example, if I set the number of repeats to 10, then I need to change this to 48 epochs
00:25:23 and change this to save every 4 epochs.
00:25:26 Because now one epoch will be 10 training steps for every training image.
00:25:31 I know this is a little bit confusing, so you can just use the same numbers I have.
00:25:37 Repeats 40, save every n epochs 1, and total number of epochs 12.
00:25:43 Click save.
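To make the arithmetic in this section easier to follow, here is a minimal sketch of the step math described above; it is my own summary of the behavior, not Kohya's code:

```python
num_images, repeats, epochs = 14, 40, 12

steps_per_epoch = num_images * repeats       # 560, the per-epoch count Kohya prints
max_train_steps = steps_per_epoch * epochs   # 6720, the max train steps printed at start

# With regularization images enabled, every training step is paired with a
# regularization step, so one full epoch actually consumes twice the steps:
steps_per_real_epoch = steps_per_epoch * 2               # 1120
real_epochs = max_train_steps // steps_per_real_epoch    # 6 (the halving)
print(steps_per_epoch, max_train_steps, steps_per_real_epoch, real_epochs)
```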
00:25:44 Before starting the training, you can click print training command here at the bottom.
00:25:49 It will print the command that it is going to use for training, like this.
00:25:54 Then you can click train model, and it will use the same command to start training.
00:26:00 So now it is caching the latents.
00:26:02 I will explain to you in a moment all of the things written here.
00:26:06 Okay, training has started.
00:26:08 Now I will copy everything with ctrl c, and I will explain every one of the parameters.
00:26:14 Okay, now we are seeing the copy-pasted command line parameters here, so let me explain
00:26:21 them to you.
00:26:22 So, once we clicked print training command,
00:26:23 it scanned the folders, found 14 images for training, and displayed 560 steps.
00:26:31 This is the number of steps it takes for one epoch, and since we are going to train
00:26:37 up to 12 epochs, you see the max train steps are 6720, and since we didn't change the warm-up
00:26:46 steps, 10% of the steps are written here as warm-up steps.
00:26:51 Also, once the training started, it displayed found directory like this: 14 images.
00:26:56 Found directory of regularization images: 2071 images.
00:27:01 560 train images with repeats.
00:27:04 2071 regularization images, because the repeat count was 1 for the regularization images.
00:27:10 Some of the regularization images are not used; however, it does not display how many
00:27:14 are not used.
00:27:15 So this is the target resolution.
00:27:17 Batch size 1.
00:27:18 You can also increase the batch size depending on how much VRAM you have.
00:27:22 Min bucket resolution, maximum bucket resolution.
00:27:25 These are the maximum resolutions that it allows per bucket.
00:27:29 So let's say I had images like this in the training folder; they would be used as they are, because
00:27:36 our max resolution is 512 by 512.
00:27:40 Also bucket no upscale, which means that it won't upscale the images.
00:27:44 Okay, now here we are seeing subset zero of dataset zero.
00:27:48 The Kohya script allows you to train multiple concepts, as I said.
00:27:52 So it is scanning the image folder and using all of the folders there.
00:27:57 We only have ourselves.
00:27:59 Therefore, it has found only one directory.
00:28:02 Image count.
00:28:03 Number of repeats.
00:28:04 This is read from the directory name.
00:28:06 The other settings are all defaults, and you see it shows that the class tokens are ohwx man,
00:28:12 because it read them from here.
00:28:14 Actually, ohwx is our rare token and man is our class token.
00:28:19 The caption extension: it is using .caption.
00:28:23 You can change this caption extension to .txt from here.
00:28:27 Then it will look for caption files with the txt extension, named the same as the image names.
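As a small illustration of that naming rule, using a hypothetical file name:

```python
from pathlib import Path

image = Path("img/40_ohwx man/photo-001.jpg")
# The caption file must share the image's base name; only the extension differs.
caption = image.with_suffix(".txt")
print(caption)  # img/40_ohwx man/photo-001.txt
```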
00:28:35 Here we are seeing the values for the regularization / classification images, and then it starts
00:28:40 loading images.
00:28:41 You see it has loaded 574.
00:28:44 Why?
00:28:46 Because it is doing 40 repeats plus the original images, which is 560 plus 14.
00:28:55 This is how this script works.
00:28:57 However, here we are seeing 1120.
00:29:00 Why?
00:29:02 Because we are using classification images; therefore 560 come from training and 560
00:29:09 come from classification images.
00:29:11 This means 1120 steps make up one epoch.
00:29:16 So this is what it takes for 1 epoch to be completed.
00:29:19 It also shows the LoRA network rank, alpha value, text encoder modules, UNET modules,
00:29:25 and other parameters.
00:29:27 So currently we are seeing the number of steps here.
00:29:29 After 1120 steps, the first epoch will be completed and we will get a checkpoint.
00:29:38 The checkpoint will be saved in this folder.
00:29:40 The training has been completed.
00:29:42 We have five checkpoints.
00:29:44 This is the first checkpoint, then the second, third, and fourth, and for the final one, you see, it doesn't
00:29:50 display the checkpoint counter; it is just displayed as test1.
00:29:54 The test1 naming comes from the model output name in the settings.
00:29:59 So how are we going to use these and do testing?
00:30:03 First I will rename this one to match the other names, so that we can do x/y/z plot
00:30:09 naming like this.
00:30:11 Then cut them.
00:30:12 Go to your Automatic1111 web UI installation.
00:30:13 Go to models, go to the Lora folder, and paste them inside the Lora folder.
00:30:20 So this is the folder.
00:30:21 Run your Automatic1111 web UI.
00:30:23 You see, my web UI is using Python version 3.10.9 and it is working very well.
00:30:29 These are the other values that I am using with my Automatic1111 web UI.
00:30:33 So the web UI instance has started, Realistic Vision is selected, and I have written my prompt
00:30:40 like this: photo of ohwx man wearing a suit.
00:30:43 However, I then need to activate the LoRA.
00:30:46 How am I going to do that?
00:30:48 Here you will see show/hide extra networks.
00:30:51 When you click it, it will open this window here.
00:30:55 Go to Lora, and here you can click refresh, and it will refresh all of the model files
00:31:00 like this.
00:31:01 When you click one of them, it will get inserted into your prompt like this.
00:31:05 So this will activate this LoRA.
00:31:08 This is the test1 LoRA with the third checkpoint.
00:31:13 Then, when I click generate, it will generate the image.
00:31:16 You shouldn't see any error here.
00:31:19 Then let's close the show/hide extra networks panel, and the image is generated.
00:31:23 Not very good.
00:31:24 We need to add some negative prompts.
00:31:26 So, here are some negative prompts.
00:31:28 Let's try another one, and yes, this is a pretty decent one.
00:31:32 This is also only the third checkpoint.
00:31:34 I need to do an x/y/z plot comparison and see which checkpoint works best.
00:31:41 The checkpoints correspond to the number of epochs that we trained.
00:31:44 So how am I going to do the x/y/z checkpoint comparison? Go to the bottom, click script, and select x/y/z
00:31:50 plot.
00:31:51 Here, select Prompt S/R, which means search and replace.
00:31:56 We are going to replace this final number, which indicates the checkpoint number.
00:32:02 Write here something like checkpoint, or whatever you want, and copy it.
00:32:06 The first value will be checkpoint.
00:32:09 Then we can type the numbers, starting from 1: 1, 2, 3, 4, 5.
00:32:14 We have 5 checkpoints because we trained up to the equivalent of 200 epochs.
00:32:18 How did we do that?
00:32:20 Our repeat count was 40.
00:32:22 We saved every 1 epoch.
00:32:25 We trained up to 12 epochs.
00:32:29 Since we used classification images, the number of epochs was halved.
00:32:35 Therefore, we actually trained up to 6 epochs.
00:32:40 Okay, I noticed a mistake I made.
00:32:42 Actually, I trained up to 10 epochs.
00:32:44 It wasn't 12.
00:32:45 You didn't see that part.
00:32:46 So we have 5 epochs, not 6.
00:32:49 Sorry about that mistake. Now we can do the comparison: click generate.
00:32:54 It will generate all of these checkpoints with the same seed.
00:32:58 Okay, the results are generated.
00:33:01 Let's look at the checkpoints.
00:33:03 So here, the first one is the original model, Realistic Vision version 2.
00:33:08 Then the first checkpoint, second checkpoint, third checkpoint, fourth checkpoint,
00:33:15 and the fifth checkpoint.
00:33:17 The face can certainly be improved.
00:33:19 Let's do it.
00:33:20 I also put the positive prompt and negative prompt into the Gist file.
00:33:24 By clicking raw you can see the Gist file like this.
00:33:29 Okay, for fixing the face, I will open a fresh page like this.
00:33:33 I will copy the prompt.
00:33:35 I will use the last checkpoint, as it looks best.
00:33:38 Copy the negative prompt as well.
00:33:40 Copy the seed, paste the seed, then click hires fix.
00:33:44 Click restore faces.
00:33:47 Set the hires steps as you wish, like 50.
00:33:50 Change the denoising strength as you wish.
00:33:52 If you want to get a slightly different improved image, then you can set this high,
00:33:57 or you can set it low.
00:33:59 Then hit generate, and it will generate the improved image.
00:34:03 The face is much improved.
00:34:06 I will show you now, and here we got the improved image.
00:34:09 So you see, from this 512 by 512 resolution image to this improved HD resolution image.
00:34:18 If you think that the face doesn't look much like you, then all you need to do is use
00:34:24 a random seed, generate a bunch of images, find a good seed, and use hires fix on one that looks
00:34:32 similar to you, that matches your style.
00:34:35 You see, this way I can generate a lot of high-quality images, and the ones that
00:34:41 look like me I can improve with this methodology.
00:34:44 Moreover, you can play with the prompt and change it until you get your desired output.
00:34:51 The quality of the generated images is really good, even though they are 512 by 512.
00:34:57 With hires fix I can get much better quality.
00:35:00 The similarity is also very decent.
00:35:03 I could also do more training and try to get better images if I wanted.
00:35:07 So you see, it is generating me in a suit.
00:35:11 A fancy suit.
00:35:13 Let's also try some other stylized images.
00:35:15 Even though this model is not trained for digital drawings or anime-like drawings, it
00:35:22 is still able to generate digital drawing images.
00:35:26 So this is my new prompt.
00:35:27 I also modified the negative prompt.
00:35:29 I have applied hires fix as well, and this is the result you are seeing right now.
00:35:34 It is pretty decent.
00:35:35 Of course, with more prompting and more trials I could get much better images, and you should
00:35:41 choose the base model depending on what you are aiming for.
00:35:45 If you are aiming for stylized images, you can use the base 1.5 pruned ckpt, anime-like models,
00:35:53 or other models.
00:35:54 If you're aiming for studio-quality photo shoots, then you should use Realistic Vision.
00:35:59 So now I will show you all of the experiments I have done.
00:36:04 I made 18 experiments.
00:36:06 I have trained up to 200 epochs in terms of the SD web UI DreamBooth extension.
00:36:12 So how can you compare all of the test results?
00:36:17 You see, these are the testing results.
00:36:20 I have 18 json files.
00:36:22 When you open one of the json files, you can see the settings.
00:36:26 All of the user settings are located here, and in the model folder we see all
00:36:32 of the generated safetensors files.
00:36:35 You see, when you reduce the number of repeats, the file name changes significantly, like this.
00:36:42 Therefore, I have renamed them properly for testing with the x/y/z plot.
00:36:47 So after I renamed them, we are going to use 2 parameters in Prompt S/R.
00:36:54 The first parameter is the checkpoint number, 1, 2, 3, 4, 5, and the second parameter is
00:37:00 the version.
00:37:01 The test version goes from 1 to 18, and this is the prompt that I tested.
00:37:08 You see, it will replace the version keyword here.
00:37:11 So the LoRA file will become test1, test2, test3, test4, up to test18, and the second keyword
00:37:19 value it is going to change is checkpoint.
00:37:21 So it will become the first checkpoint, the second checkpoint, and the third checkpoint.
00:37:26 So the result is generated.
00:37:28 Let me open it at full resolution.
00:37:30 So here we are seeing the full comparison of all of the tests.
00:37:35 The y columns are the checkpoints and the x columns are the test names.
00:37:40 These are the test one results; let me show you the native resolution.
00:37:45 So test one uses the default settings that come with the Kohya SS installation.
00:37:51 So this is the result we got with test one, and this is the test two result.
00:37:56 In test two, I changed the network alpha value to four, and all other settings are default.
00:38:02 So this is the test three result.
00:38:04 In test three, the network rank is 16 and all other settings are default.
00:38:10 Network alpha returned to 1.
00:38:12 So this is the test four result.
00:38:14 In test four, the network rank is 32.
00:38:18 This is the suggested network rank in the Kohya SS Colab.
00:38:22 However, I didn't find this to be the best one.
00:38:25 So this is the result of test five.
00:38:27 In test five, the network rank is 64.
00:38:31 So this is the result of test six.
00:38:32 These are the best settings that I have found.
00:38:36 It is very good quality, and it is network rank 128.
00:38:41 All other settings are default.
00:38:43 Then I tested network alpha while using network rank 128.
00:38:50 So this is when network alpha becomes two.
00:38:52 You see, the quality starts degrading.
00:38:55 As we increase network alpha, it becomes very bad.
00:38:59 You see, it is very, very bad.
00:39:01 So this is the result of network rank 128.
00:39:04 This is the number of vectors that it holds, and this is the network alpha.
00:39:10 Network alpha is a value that is used during training.
00:39:13 You may look up the LoRA paper to learn more.
00:39:17 Then I tested the repeat counts that we set in the folder name.
00:39:23 This is repeat count 20.
00:39:25 When I reduce the repeat count, I have to increase the number of epochs and the save-checkpoint
00:39:30 epoch count.
00:39:32 So for repeat count 20, I make the epoch count 20 and save every 2 epochs.
00:39:39 So when I reduce the repeat count to 10, the epoch count increases to 40,
00:39:44 and save every n epochs becomes 4.
00:39:46 This way it always trains with the same number of steps, and each time an
00:39:53 image is trained, it is one step.
00:39:55 So it always trained the same number of steps with these repeat count, epoch count, and save every n
00:40:01 epochs settings.
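In other words, the invariant across these runs is that repeats multiplied by epochs stays constant, so the total number of training steps does too. A quick sanity check with this video's numbers:

```python
# 14 images; repeats * epochs is 400 in every run, so total steps match:
for repeats, epochs in [(40, 10), (20, 20), (10, 40)]:
    print(repeats, epochs, 14 * repeats * epochs)  # 5600 steps each
```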
00:40:02 So this is repeat count 20.
00:40:04 This is repeat count 10.
00:40:07 This is repeat count 5, and this is repeat count 1.
00:40:12 In my DreamBooth experimentation, I had found that 50 classification / regularization images
00:40:18 per instance image performed best.
00:40:21 This is also similar, if you ask me.
00:40:24 If you wonder which video that is,
00:40:25 this is the video where I tested the number of classification images for every training
00:40:31 image.
00:40:32 You can watch it.
00:40:33 It is pretty useful.
00:40:34 In Kohya SS, you will also see other LoRA types, like Kohya DyLoRA, LyCORIS/LoCon, LyCORIS/LoHa,
00:40:39 and Kohya LoCon.
00:40:43 If you use these, then the number of parameters that you need to test increases.
00:40:49 For example, when I select LyCORIS/LoCon, we now also need to test the convolution rank dimension
00:40:56 and the convolution alpha value.
00:40:58 Moreover, the files generated with these can't be used directly with the Automatic1111 web UI.
00:41:05 So how can you use the files generated with them?
00:41:08 Go to the extensions tab, go to available, click load from, look for Kohya SS additional networks,
00:41:15 and install it.
00:41:16 After installing, apply and restart the UI.
00:41:19 Now, in the text to image tab, you will see the additional networks option here.
00:41:24 From here, you can select the model names.
00:41:27 To see the model names, click refresh models.
00:41:30 By the way, these model files have to be put inside another folder.
00:41:35 You may ask where.
00:41:37 They have to be put inside extensions, inside the SD web UI additional networks extension, inside
00:41:43 models, inside lora.
00:41:46 You have to put the generated model files here.
00:41:50 The generated LoRA files go here.
00:41:52 After you put them there, click refresh models here, and they should appear as you are seeing right
00:41:57 now.
00:41:58 Then you need to click enable and select the model.
00:42:00 Once you do this, you don't need to type it in the prompt anymore.
00:42:04 You can type just ohwx man, and we should get results.
00:42:08 And here we are getting a result.
00:42:10 By the way, this is not the best test model; therefore the result is not very good.
00:42:14 If you wish, you can also upgrade your Torch version.
00:42:18 So how can you do it?
00:42:19 Go to your Kohya installation, go to the virtual environment, then scripts, and run the activate script.
00:42:25 Then uninstall torch, then copy this command posted in our Gist file.
00:42:31 Copy and paste it.
00:42:33 Execute the script.
00:42:34 It will upgrade your Torch version to 1.13.
00:42:38 After this installation, don't forget to replace the cuDNN dll files again.
00:42:44 Also, you can upgrade your xFormers.
00:42:46 I have xFormers 0.0.18, which is compatible with Torch version 1.13.
00:42:53 Just copy, paste, and run it, and it will also upgrade your xFormers to a much newer
00:42:59 version.
00:43:00 This may significantly increase your training speed.
00:43:03 After installation, just restart your Kohya SS.
00:43:06 I couldn't find a way to make Kohya SS use my second graphics card directly.
00:43:12 I have an RTX 3090 and an RTX 3060.
00:43:16 My default graphics card is the RTX 3090 TI.
00:43:20 So if I want to start Kohya SS on my second card, I need to set cuda visible devices system
00:43:28 wide.
00:43:29 Open a new cmd window, type this command, and execute it.
00:43:33 This will hide your first graphics card from every application.
00:43:37 Therefore, after doing this, start your Kohya SS, then change it back like this.
00:43:44 Then you will be able to use both of the cards again.
00:43:47 So this is the way to use Kohya SS on the second graphics card if you wish.
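If you prefer not to touch a system-wide variable, an alternative is to set CUDA_VISIBLE_DEVICES only for the launched process. A minimal sketch, assuming the launch command shown earlier in the video and that the second card has device ID 1:

```python
import os
import subprocess

# Hide every GPU except device 1 (the second card) for this process only,
# instead of setting CUDA_VISIBLE_DEVICES system-wide.
env = dict(os.environ, CUDA_VISIBLE_DEVICES="1")
subprocess.run(
    ["python", "kohya_gui.py", "--listen", "127.0.0.1",
     "--server_port", "7860", "--inbrowser"],
    env=env,
)
```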
00:43:52 I hope you have enjoyed it.
00:43:53 Please like, subscribe, and leave a comment.
00:43:56 Ask me anything that you wish.
00:43:58 I have excellent tutorials on my channel that will teach you Stable Diffusion, DreamBooth
00:44:03 training, and LoRA training.
00:44:05 For example, in this video I didn't explain how LoRA training actually works.
00:44:10 I have an excellent tutorial video for that.
00:44:13 I have explained the details there.
00:44:15 In that excellent tutorial, I have explained how Stable Diffusion and rare tokens work.
00:44:20 Actually, after you watch the textual inversion tutorial that I have on my channel, you will
00:44:25 have much more information about how Stable Diffusion works, how tokens work, what
00:44:30 ohwx is, and why we are using rare tokens.
00:44:34 Also, I have ControlNet videos and other amazing videos.
00:44:37 In the video description and also in the comment section of the video, let me show you,
00:44:42 you will see our discord link like this, and in the pinned comment of this video you
00:44:48 will also see our discord link and Patreon link like this.
00:44:52 You can just click them to join our discord server and our Patreon.
00:44:57 In our discord server we have over 2,200 members.
00:45:02 So the best way to contact me is to join our discord server.
00:45:06 If you also support me on Patreon, I would appreciate that very much.
00:45:10 I am posting some useful stuff on Patreon as well.
00:45:13 Moreover, you can also support me with the join button on YouTube.
00:45:17 If you support me on Patreon and join our discord channel, just message me there.
00:45:22 Then I will give you the Patreon user rank in the discord channel.
00:45:23 Hopefully see you in another video.
