Automatic1111 Stable Diffusion DreamBooth Guide: Optimal Classification Images Count Comparison Test
Full tutorial link > https://www.youtube.com/watch?v=Tb4IYIYm4os
Sign up for RunPod: https://bit.ly/RunPodIO
Our Discord: https://discord.gg/HbqgGaZVmr
New best training settings for DreamBooth training in the Automatic1111 Web UI. If I have been of assistance to you and you would like to support my work, please consider becoming a patron on 🥰 https://www.patreon.com/SECourses
Playlist of #StableDiffusion Tutorials, #Automatic1111 and Google Colab Guides, #DreamBooth, Textual Inversion / Embedding, LoRA, AI Upscaling, Pix2Pix, Img2Img:
https://www.youtube.com/playlist?list=PL_pbwdIyffsmclLl0O144nQRnezKlNdx3
Easiest Way to Install & Run Stable Diffusion Web UI on PC by Using Open Source Automatic Installer:
How to use Stable Diffusion V2.1 and Different Models in the Web UI - SD 1.5 vs 2.1 vs Anything V3:
Zero To Hero Stable Diffusion DreamBooth Tutorial By Using Automatic1111 Web UI - Ultra Detailed:
Sketches into Epic Art with 1 Click: A Guide to Stable Diffusion ControlNet in Automatic1111 Web UI:
Ultimate RunPod Tutorial For Stable Diffusion - Automatic1111 - Data Transfers, Extensions, CivitAI:
8 GB LoRA Training - Fix CUDA & xformers For DreamBooth and Textual Inversion in Automatic1111 SD UI:
2400 Photo Of Man classification images:
https://drive.google.com/file/d/1qBf8VyUbmPNalKqm076yOsQjE8BrcG7R/view
00:00:00 Introduction to Best Settings of DreamBooth training experiment
00:00:56 How to close initially started Web UI instance on RunPod Stable Diffusion template
00:02:20 Which RunPod machine you should pick for DreamBooth training and why
00:02:48 Versions used in this experiment: Automatic1111, xformers, DreamBooth
00:04:20 Best DreamBooth settings for 0 classification images
00:04:45 How to continue DreamBooth training from a certain checkpoint
00:05:12 Used command line arguments for best DreamBooth training
00:05:20 Used extensions list for best DreamBooth training
00:05:45 Starting to set parameters for 0 classification images - equal to fine tuning
00:06:45 The training dataset used and what features your dataset needs
00:07:45 Setting concepts tab of DreamBooth training
00:08:00 When to use FileWords, why they are useful for fine tuning, and how to do fine tuning
00:10:15 Best training setup parameters for DreamBooth training when using classification images
00:11:28 How to calculate number of steps for each epoch
00:13:17 All trainings are completed
00:13:49 Comparison of sample and sanity sample images generated during training
00:13:55 Analysis of 0x classification samples
00:14:41 Analysis of 1x classification samples
00:15:14 Analysis of 2x classification samples
00:15:36 Analysis of 5x classification samples
00:16:12 Analysis of 10x classification samples
00:16:34 Analysis of 25x classification samples
00:16:45 Analysis of 50x classification samples
00:17:28 Analysis of 100x classification samples
00:17:49 Analysis of 200x classification samples
00:18:09 Comparing each checkpoint in all of the trained models
00:18:46 How to use x/y/z plot to check different training checkpoints
00:19:51 All grids are generated and how I downloaded them
00:20:40 Analysis of 0x classification x/y/z grid images
00:21:58 Analysis of 1x classification x/y/z grid images
00:23:10 Analysis of 2x classification x/y/z grid images
00:24:03 Analysis of 5x classification x/y/z grid images
00:25:00 Analysis of 10x classification x/y/z grid images
00:25:36 Analysis of 25x classification x/y/z grid images
00:26:15 Analysis of 50x classification x/y/z grid images
00:27:27 Analysis of 100x classification x/y/z grid images
00:28:02 Analysis of 200x classification x/y/z grid images
00:29:00 Summary of the experiment
00:29:40 Very important speech part
Text-Guided View Synthesis
Our technique can synthesize images with specified viewpoints for a subject cat (left to right: top, bottom, side, and back views). Note that the generated poses differ from the input poses, and the background changes in a realistic manner given a pose change. We also highlight the preservation of the complex fur patterns on the subject cat's forehead.
Property Modification
We show color modifications in the first row (using prompts "a [color] [V] car"), and crosses between a specific dog and different animals in the second row (using prompts "a cross of a [V] dog and a [target species]"). We highlight the fact that our method preserves unique visual features that give the subject its identity or essence, while performing the required property modification.
Accessorization
Outfitting a dog with accessories. The identity of the subject is preserved and many different outfits or accessories can be applied to the dog given a prompt of the type "a [V] dog wearing a police/chef/witch outfit". We observe a realistic interaction between the subject dog and the outfits or accessories, as well as a large variety of possible options.
00:00:00 Greetings everyone.
00:00:01 In this video I am going to conduct a massive experiment on the effect of the number of classification images
00:00:06 when doing Stable Diffusion DreamBooth training.
00:00:09 In the community there are widely varying recommendations for how many classification images
00:00:13 to use per training instance image.
00:00:15 In the official paper, 200 classification (regularization) images are used per training image.
00:00:20 So in this video I am going to conduct the experiments written here.
00:00:23 I will use 9 RunPod instances to run 9 different DreamBooth trainings.
00:00:28 All of my instances are running right now.
00:00:30 I prepared one instance, then cloned it to all of the others.
00:00:34 To do this I used the runpodctl command.
00:00:37 I zipped the Stable Diffusion web UI folder, the venv folder, and the classification
00:00:41 images folder, then sent them between the different RunPods.
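As a rough illustration of that cloning step, here is a hedged Python sketch; the folder names and zip layout are illustrative, not taken from the video, and the one-time transfer code is printed by runpodctl itself.

```python
# Hedged sketch of the cloning step; folder names are illustrative assumptions.
import subprocess

# On the prepared pod: zip the web UI (including its venv) and the class images.
subprocess.run(
    ["zip", "-r", "webui-clone.zip", "stable-diffusion-webui", "class_images"],
    check=True,
)

# runpodctl prints a one-time code; run `runpodctl receive <code>` on each
# target pod to pull the archive over.
subprocess.run(["runpodctl", "send", "webui-clone.zip"], check=True)
```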
00:00:46 Moreover, when you start your RunPod with the template, it starts a hidden web UI instance,
00:00:53 but you are not able to close it from the Jupyter interface.
00:00:57 So to close it, I first changed relauncher.py like this.
00:01:03 I added a break to the while loop here.
00:01:06 Once the UI is closed, it won't be relaunched again and again.
00:01:10 And to close the initial hidden web UI, I used this kill command.
00:01:15 With this command, you kill the web UI instance running on port 3000.
00:01:21 Then we will manually launch these web UIs.
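A minimal sketch of both parts of that fix, assuming the template's relauncher.py wraps the launch in a while loop; the real script and the exact kill command shown in the video may differ.

```python
# Minimal sketch, assuming the template's relauncher.py wraps the launch in a
# while loop; this is not the template's actual script.
import subprocess

def launch_webui() -> None:
    # Placeholder for however the template really starts the web UI on port 3000.
    subprocess.run(["python", "webui.py", "--port", "3000"], check=False)

while True:
    launch_webui()
    break  # the added line: exit the relaunch loop instead of restarting forever

# Then, from a Jupyter terminal, kill whatever is still listening on port 3000,
# e.g.:  kill $(lsof -t -i :3000)
```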
00:01:26 So in my My Pods section you can currently see that CPU utilization and memory usage are zero.
00:01:32 That means no instance of the Automatic1111 web UI is currently running on the pod.
00:01:37 If you don't know what Stable Diffusion or the Automatic1111 web UI is, I have excellent
00:01:42 tutorials for them.
00:01:43 In this tutorial you can learn how to install and run the Automatic1111 web UI on your computer.
00:01:49 In this tutorial, you will learn how to do DreamBooth training from zero to hero.
00:01:54 In this tutorial I explain what the new, awesome ControlNet is and how to use
00:01:59 it in the Automatic1111 web UI.
00:02:02 And this is the ultimate RunPod tutorial.
00:02:05 So if you are interested, you can watch this playlist.
00:02:09 Now I will begin by starting my web UI instances in all of the RunPods.
00:02:15 All Automatic1111 instances are started.
00:02:18 Let me show you the versions and the pods that I am using.
00:02:21 It is really important to pick the correct pod for DreamBooth training.
00:02:26 I have chosen RTX A4500 pods.
00:02:30 Why?
00:02:31 Because, as you can see, these pods have 62GB RAM and 20GB VRAM.
00:02:38 Having more RAM is really important when doing DreamBooth training.
00:02:42 If your pods do not have a sufficient amount of RAM, you may get a "gradio killed"
00:02:49 error, which is extremely annoying.
00:02:51 The Python version I am using for this experiment is 3.10.9.
00:02:57 The DreamBooth revision is this one and the SD web UI revision is this one.
00:03:02 I am using xformers 0.0.17.dev464.
00:03:08 Why?
00:03:10 Because you have to use either the 0.0.14 or the 0.0.17 version of xformers; otherwise, DreamBooth training
00:03:18 will not work.
00:03:20 This is a very common question that I get asked.
00:03:23 On Windows you should downgrade your xformers to the 0.0.14 revision, and I explain that
00:03:30 in this video.
00:03:32 On Linux you should upgrade your xformers to the 0.0.17 development revision, and I explain that
00:03:38 in this video.
00:03:40 I have pre-prepared 2400 classification images.
00:03:45 To generate these images, a simple prompt was used, which is our classification/regularization
00:03:51 prompt.
00:03:52 The prompt used is "photo of man", with sampling steps set to 40, and nothing else
00:03:57 changed from the defaults.
00:03:58 I used the version 1.5 pruned ckpt file, and in the Stable Diffusion
00:04:05 settings the default VAE is set to the newest and best VAE available, as you can see
00:04:11 right now here.
00:04:13 I will share a link to this classification dataset in the description as a zip file.
00:04:18 You can download and use them if you want.
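For reference, here is a hedged diffusers sketch of how one could batch-generate a class-image set with those settings outside the UI; the video itself generated them in Automatic1111's txt2img, so this script is only an equivalent, with the model ID and output folder as assumptions.

```python
# Hedged sketch: batch-generating the "photo of man" class images with diffusers.
# The video used Automatic1111's txt2img instead; the model ID and output folder
# here are assumptions, not taken from the video.
import os

import torch
from diffusers import StableDiffusionPipeline

os.makedirs("class_images", exist_ok=True)

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

for i in range(2400):
    image = pipe("photo of man", num_inference_steps=40).images[0]
    image.save(f"class_images/{i:04d}.png")
```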
00:04:21 Now I will show you two of the setups.
00:04:24 The first one is zero classification and the second one will be 1x classification.
00:04:29 The rest will be the same.
00:04:30 First we will begin by generating our training model.
00:04:35 This one will be the 0x model.
00:04:38 I will pick the 1.5 pruned ckpt file from here as the source checkpoint.
00:04:45 Sometimes I get asked how you can continue your training from a certain checkpoint.
00:04:51 Just make a new model and pick your source checkpoint from here.
00:04:57 Like this, and it will generate a new training model from the checkpoint that you
00:05:02 want to continue from.
00:05:03 It is the same process, and I am not touching the other settings.
00:05:06 They are the default and best settings.
00:05:08 Another thing I want to mention is that I have only used these command line arguments;
00:05:14 as you can see, nothing else is special or different, and these are the only extensions
00:05:19 I am using right now, all at their latest versions.
00:05:22 Okay, all model files are generated in all of the RunPod instances.
00:05:28 For example, this one is 0x classification images.
00:05:30 This one is 1x classification images, then 2x, 5x, 10x, 25x, 50x, 100x, and 200x.
00:05:41 So I will show two of the setups first.
00:05:44 Let's begin with no classification images.
00:05:47 This will basically be a fine tuning.
00:05:49 First I will click the performance wizard.
00:05:51 Then I'm not going to use any classification images.
00:05:53 I am going to train for 200 epochs.
00:05:56 I will save the model every 25 epochs.
00:05:58 I will save a preview image every five epochs.
00:06:00 I'm not going to change batch size or gradient accumulation steps.
00:06:03 These will affect your training success as well,
00:06:08 because this is mini-batches versus larger batches, up to full batches.
00:06:12 This is a debated topic in machine learning.
00:06:15 I will use gradient checkpointing.
00:06:17 Why?
00:06:18 Because in this video, I am going to choose settings that you can use on your own computer
00:06:23 with a graphics card that has only 12 gigabytes of VRAM.
00:06:27 Therefore, I will use gradient checkpointing.
00:06:30 For the learning rate: I see that the learning rate is increased when we click the performance
00:06:35 wizard, or by default, so I am going to use a lower learning rate like this.
00:06:39 I'm not going to use center crop or apply horizontal flip.
00:06:43 My images are already prepared by me.
00:06:46 This is my training dataset.
00:06:47 You see, every background is different.
00:06:50 The clothing is different.
00:06:51 Only the face is common across the images, so you should keep common only the things that you want
00:06:57 to teach to the model.
00:06:58 The sanity sample prompt will be "photo of ohwx man by tomer hanuka".
00:07:05 So ohwx is our rare token.
00:07:08 Man is our class token.
00:07:09 However, this is the zero classification model, so we are not going to have any class token
00:07:15 in this particular one.
00:07:17 So for zero classification images, it will be only "photo of ohwx by tomer hanuka" to see
00:07:23 how it performs during training.
00:07:25 I'm not going to use EMA because, as I said, this is a 12-gigabyte-VRAM experiment.
00:07:32 I am going to use bf16.
00:07:34 This is supposed to be more numerically stable, but if your graphics card does not support
00:07:38 it, you should pick fp16.
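As an aside (not from the video), PyTorch can report whether your GPU supports bf16 before you commit to a precision setting:

```python
# Small check, assuming a CUDA GPU and a recent PyTorch: pick bf16 when supported.
import torch

precision = "bf16" if torch.cuda.is_bf16_supported() else "fp16"
print(f"use mixed precision: {precision}")
```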
00:07:40 We are using xformers, and these are the versions, which I am showing once again.
00:07:45 The other values will be like this.
00:07:47 In the Concepts tab I will set the dataset directory like this.
00:07:52 This is the dataset directory that contains these training images.
00:07:56 We are not setting any classification directory because this is the 0x class run.
00:08:00 I'm not using [FileWords], because [FileWords] is something that you want to use when you
00:08:06 want to do fine tuning,
00:08:09 when you want to improve the quality of lots of tokens. Say you are teaching castles, rivers,
00:08:15 mountains, and other things, and you have beautiful images; then you should caption those images
00:08:20 with the keywords that you want to improve and associate with them.
00:08:24 To show you an example, let's say you want to improve castle images and you
00:08:30 have this image for fine tuning; then you should caption this file as "awesome fantastic
00:08:36 castle in a beautiful forest with an awesome river".
00:08:41 When doing DreamBooth training, all of these tokens will get associated with this
00:08:47 image, and they will get improved along with the U-Net and the text encoder.
00:08:51 This is when it is useful to use [FileWords].
00:08:54 So when you use [FileWords] like this, it will read the caption of that particular training
00:09:00 image and substitute that caption here.
00:09:04 So your instance prompt for that image will become the caption that you have used.
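Conceptually, [FileWords] pairs each image with a same-named .txt caption; a minimal sketch of that lookup, with illustrative file names:

```python
# Minimal sketch of the [FileWords] idea: the instance prompt for an image is read
# from its matching .txt caption file. File names here are illustrative only.
from pathlib import Path

image = Path("train/castle01.png")
caption_file = image.with_suffix(".txt")
instance_prompt = caption_file.read_text().strip()
# e.g. "awesome fantastic castle in a beautiful forest with an awesome river"
print(instance_prompt)
```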
00:09:09 But for teaching faces I am only using a rare token; I am not using any captions
00:09:14 or anything else, and we are not going to use a class prompt for this training.
00:09:19 For the sample prompt, I will use "photo of ohwx"; I am not adding the class token, and you see the class
00:09:25 images per instance and other settings.
00:09:28 I think we need to set this one to zero for it to work correctly, and in the saving tab I will
00:09:34 save the checkpoints into a subdirectory.
00:09:36 I will also generate a ckpt file during saving.
00:09:39 So these are my saving settings.
00:09:42 All settings are ready.
00:09:43 I will click save settings and hit train.
00:09:46 Now in the terminal window we can see the training has started.
00:09:50 You see the number of batches in each epoch is 12, because we are not using any classification
00:09:56 (regularization) images.
00:09:57 Number of epochs is 200, text encoder epochs is 150, because we set the ratio of text encoder
00:10:05 training to 75 percent.
00:10:07 The other settings are displayed here.
00:10:09 It is pretty fast on this graphics card, and as you can see, the loss is looking good.
00:10:14 The VRAM used is 9.6 gigabytes.
00:10:17 Now I will set up the next one, which is 1x classification images.
00:10:23 This one is essentially the same.
00:10:25 What is different in this one?
00:10:27 The only differences are that I am adding the class prompt to the sample
00:10:33 prompt and also adding the classification dataset directory like this, as you can see,
00:10:39 and in the instance prompt I am going to use "ohwx man".
00:10:42 Ohwx is our rare token.
00:10:44 Man is our class token, and the prompt that I used to generate the classification (regularization)
00:10:50 images is "photo of man".
00:10:53 You should write the same thing, because we want to keep the
00:10:58 underlying context of the model as close to the original as possible.
00:11:02 We want to have prior-preservation loss, so the sample image prompt will be "photo of ohwx
00:11:10 man", like this, and I will use one class image per instance; I am just setting it like this,
00:11:18 and since it already has classification images, it won't generate any new ones.
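To summarize the 1x concept setup in one place, here is an illustrative sketch; the key names paraphrase the DreamBooth extension's UI fields rather than quoting its exact schema, and the paths are assumptions.

```python
# Illustrative summary of the 1x concept settings; field names paraphrase the
# DreamBooth extension's UI, not its exact JSON schema, and paths are assumed.
concept = {
    "instance_prompt": "photo of ohwx man",  # ohwx = rare token, man = class token
    "class_prompt": "photo of man",          # must match the prompt used for class images
    "sample_prompt": "photo of ohwx man",
    "class_images_per_instance": 1,          # the 1x run; 0 disables prior preservation
    "instance_data_dir": "/workspace/train_images",
    "class_data_dir": "/workspace/class_images",
}
```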
00:11:23 Save settings.
00:11:24 Hit train!
00:11:25 Okay, this is the terminal of the 1x classification run.
00:11:28 Now the number of batches in each epoch is 24, because it is now using the bucketing system.
00:11:34 Therefore, each epoch includes the classification images as well as the training images.
00:11:40 So it is double the number of training images that you have.
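The step math the video uses throughout, as a worked example (assuming batch size 1, which matches the unchanged defaults here):

```python
# Worked example of the step/epoch math quoted in the video, assuming batch size 1
# (the defaults were left unchanged) and the 12 training images used here.
train_images = 12

def steps_per_epoch(class_per_instance: int) -> int:
    # With prior preservation, the class images are bucketed into each epoch too.
    # The video confirms this for 0x (12 steps) and 1x (24 steps); higher
    # multipliers following the same pattern is an extrapolation.
    return train_images + train_images * class_per_instance

print(steps_per_epoch(0))  # 12 -> 200 epochs = 2400 steps for the 0x run
print(steps_per_epoch(1))  # 24 -> 200 epochs = 4800 steps for the 1x run

# Inverse direction: the 0x sample saved at step 1260 corresponds to epoch 105.
print(1260 // steps_per_epoch(0))  # 105
```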
00:11:43 The other settings are like this, the same as the previous one, and it has started training.
00:11:48 We can see it here.
00:11:50 The rest of the trainings will be the same.
00:11:53 Only the number of classification images per instance will be different.
00:11:57 Okay, all trainings are started.
00:11:59 You see 24 class images used.
00:12:02 This is 2x.
00:12:03 You see 60 class images, this is 5x. 120 class images, this is 10x. 300 class images, this
00:12:10 is 25x. 600, this is 50x.
00:12:14 This is 100x with 1200 class images, and with 2400 class images this is the 200x.
00:12:21 So each one is going on.
00:12:23 I see that the stylizing capability of the 0x run is already gone, and the stylizing
00:12:31 in the 1x class run is also gone at epoch 74; in the 2x class run I see that it is still stylizing at epoch 39.
00:12:40 In 5x, stylizing is still there, but the learning has not progressed very far yet at 38 epochs.
00:12:47 This one, the 10x class run, has just started.
00:12:50 Uh, it had an error, so I had to restart it.
00:12:53 Okay, the 25x class run at 36 epochs looks like the best.
00:12:57 And with the 50x class run I see that the stylizing is the best so far.
00:13:02 It is really good. With 100x class, stylizing is good; with 200x class, stylizing is not very
00:13:09 good, but this is also only at epoch 24 so far.
00:13:12 So when you use more classification images, can we say that it takes more time?
00:13:17 I'm not sure yet.
00:13:19 Okay, now we need to wait for all of them to finish, and then we will compare.
00:13:24 Okay, all trainings are completed.
00:13:26 Let me show you quickly: you see the model epoch is 200, 200, 200, and each one is different.
00:13:34 You see 5x classification is at 200.
00:13:36 So all of the trainings completed without any errors. Now I am going to download the samples
00:13:44 generated during training in all of the RunPods, and we will start comparing them.
00:13:50 I will analyze and comment on them.
00:13:53 Okay, the samples are downloaded here.
00:13:55 So I will begin with the 0x classification images.
00:13:59 When we do not use any classification images, the generated samples are like this: okay,
00:14:05 we see good styling at 960 steps here, and the face is looking decent as well.
00:14:13 The styling capability of the model is lost after 1260 steps.
00:14:19 So how are we going to calculate the epoch count for this?
00:14:23 When we divide by 12, we find the epoch count.
00:14:28 So this is epoch 105.
00:14:31 So up until about 100 epochs,
00:14:33 it was able to stylize when we do not use any classification images.
00:14:37 So we can say that with 80 epochs we could get good results when we do not use any classification
00:14:42 images; we will test that.
00:14:45 Let's look at the results.
00:14:46 When we use only a single classification image per instance, we see that it started learning pretty fast.
00:14:53 This is really good styling.
00:14:54 However, the styling capability of the model is lost pretty quickly after 60 epochs, and
00:15:01 the quality of the generated samples also decreases a lot.
00:15:04 It also looks like it memorized the images instead of learning my face.
00:15:09 So 1x classification doesn't look very good.
00:15:13 We will see the results.
00:15:14 Okay, this is 2x classification.
00:15:16 2x means that we used 2 multiplied by 12, i.e. 24 classification images, just as a reminder.
00:15:23 It is looking decent at 40 epochs, and after 60 epochs it actually becomes very bad.
00:15:30 It also looks like it memorized the images rather than learning the face, so we will see the results.
00:15:36 Okay, this is 5x classification images; after 50 or 60 epochs it looks
00:15:44 decent, and it also loses its styling pretty early, after 100 epochs.
00:15:50 Okay, these are the sample images.
00:15:52 By the way, these images are not styled or beautified.
00:15:56 They are just raw prompts.
00:15:58 Let me show you.
00:15:59 So you see, this is the raw prompt "photo of ohwx man", and this is the raw sanity prompt
00:16:04 "photo of ohwx man by tomer hanuka".
00:16:06 We could add more beautifying tokens to these prompts, and also negative prompts, to improve
00:16:13 them.
00:16:14 Okay, let's see 10x classification.
00:16:16 The 10x classification run keeps its styling ability much longer; as you can see, even at
00:16:23 190 epochs it is still able to stylize my face in the tomer hanuka style.
00:16:28 So you see, as we increase the number of classification images, we are certainly preventing over-training.
00:16:33 Okay, this is 25x classification images, and somehow the styling capability is once again
00:16:41 lost quickly after 60 epochs.
00:16:44 Okay, 50x classification images.
00:16:46 When we use 50 classification images per instance, I see that it is the best one at keeping styling.
00:16:56 It also looks like it is learning the face.
00:16:58 So in this experiment I find that 50x is the sweet spot for my training dataset.
00:17:06 I can't say it will be for you, but 50x looks like a good choice for now.
00:17:10 We will test and see.
00:17:12 It started to learn the face more slowly than the others.
00:17:15 I think after 60 epochs it becomes somewhat decent, and the best one looks like 180 epochs,
00:17:23 like this.
00:17:24 This one is also looking decent.
00:17:26 Let's look at the 100x classification images.
00:17:28 Okay, with 100 classification images, the results are not fascinating.
00:17:33 Actually, the sanity prompt results become very, very irritating.
00:17:37 The sample prompts as well.
00:17:39 I don't know why it is like this, but maybe there is a bug in the code that causes some
00:17:45 errors.
00:17:46 Problems?
00:17:47 It doesn't look good at all.
00:17:49 200 classification images is the same.
00:17:52 After 150 epochs it becomes very bad.
00:17:55 Actually very, very bad.
00:17:56 I don't know.
00:17:57 This is very weird.
00:17:59 The prompts look correct, so this is weird.
00:18:02 I also checked the settings and verified they are correct, and yes, they are correct.
00:18:07 It certainly tried to learn the face, but the results are not very good.
00:18:11 So what we are going to do now: I have prepared a prompt and found a seed that shows my face
00:18:19 in seven of the eight generated images.
00:18:23 They are all my face.
00:18:24 They are stylized, as you can see.
00:18:27 So the aim here is to compare each checkpoint and see how each model performs.
00:18:34 And how am I going to do that?
00:18:35 I will copy the prompt into each model.
00:18:39 Then I will copy the seed of this prompt.
00:18:43 Like this.
00:18:44 I will set the batch size to eight.
00:18:46 This is important.
00:18:48 Are we done?
00:18:49 No.
00:18:50 We are also going to use the x/y/z plot, and in the x/y/z plot we are going to test the different
00:18:56 checkpoints.
00:18:57 You see, I have selected checkpoint name here.
00:18:59 I click this.
00:19:00 It will fill in the checkpoints like this.
00:19:02 I will delete the first one, and I am going to test all of the checkpoints.
00:19:07 So how many images will this generate?
00:19:09 This will generate eight multiplied by eight, 64 images,
00:19:13 because we have eight checkpoints: this is 25 epochs,
00:19:21 50 epochs, 75 epochs, 100 epochs, 125 epochs, 150 epochs, 175 epochs, and this is 200 epochs.
00:19:32 I am not going to test other values, because I want to see what it will generate with the
00:19:38 same settings in the differently trained models.
00:19:42 By the way, this is the first, zero classification, example.
00:19:45 So we don't need "man" in this one, but in the others we will use "man", the class prompt, and we are
00:19:51 ready to test.
00:19:52 Okay, all grids are generated.
00:19:54 However, they are unfortunately not displayed in the Gradio interface.
00:19:58 So what did I do?
00:19:59 I went to the text-to-image grids folder and downloaded all of the images one by one, and
00:20:05 now they are ready, and it is time to compare them.
00:20:09 But before doing that, I will close my pods, since we are done with them, and it is time
00:20:14 to evaluate the results.
00:20:16 You see, currently I am spending 3.37 dollars per hour; time to close them.
00:20:22 After closing all the pods I am spending 0.25 dollars per hour, because I am currently using
00:20:29 900 gigabytes of volume, an exited volume; you see, exited.
00:20:35 Therefore, it is spending 0.25 dollars per hour from my credits.
00:20:39 Okay, let's begin with the 0x classification results.
00:20:43 With 0x classification, there is one thing you need to be careful about.
00:20:48 It starts at 300 steps, because we didn't use any classification images.
00:20:53 Therefore, each epoch equals 12 steps,
00:20:58 the number of training images I have.
00:21:00 When we use classification images, they are also included in the bucket.
00:21:04 Therefore, it will be double this size.
00:21:07 So this is epoch 25.
00:21:09 These are the results at 25 epochs; not much like my face.
00:21:13 These are the results at 50 epochs; very low similarity.
00:21:17 These are the results at 75 epochs, and yes, this one is becoming more similar.
00:21:24 Okay, this one is 100 epochs.
00:21:26 Yes, I see similarity in this one especially.
00:21:29 These are the results at 125 epochs.
00:21:33 As you can see, the similarity increases.
00:21:35 However, I wouldn't call them very good results, and after 150 epochs it starts to lose the stylizing
00:21:42 and also my face.
00:21:44 And these are the results at 175 epochs, and these are the results at 200 epochs.
00:21:51 So the best spot for 0x classification is 100 epochs,
00:21:56 i.e. 1200 steps.
00:21:58 Let's look at the 1x classification results.
00:22:01 You see, as I said, the 1x classification starts from 600 steps, because the classification images
00:22:07 are also included in the bucket.
00:22:09 Therefore, one epoch is 24 steps.
00:22:12 So let me show you each one of them.
00:22:15 This is 50 epochs.
00:22:16 I see the similarity here, but it is not very similar either.
00:22:21 These are the 75 epoch results, and the styling ability is diminishing, as you can see.
00:22:27 Okay, this is 100 epochs.
00:22:29 As you can see, the results are not very good.
00:22:33 These are the 125 epoch results.
00:22:35 Even though I said close shot,
00:22:37 you see they are all distant shots.
00:22:39 These are 150 epochs.
00:22:41 These are 175 epochs.
00:22:43 And this is 200 epochs.
00:22:45 You see it is very much over-trained, and the quality is very bad, as you can see.
00:22:50 So what is the best epoch for 1x classification?
00:22:54 We can say 25 epochs.
00:22:56 After 25 epochs the results do not get better.
00:23:01 So when you use 1x classification, 25 epochs is the sweet spot for my training dataset.
00:23:07 It may be different for you.
00:23:09 Okay, now it is time to analyze the 2x classification images.
00:23:13 At 25 epochs these are the results, as you can see.
00:23:16 This is not my face at all, nor is the other one.
00:23:20 This is 50 epochs, and now it starts to resemble my face much better.
00:23:24 It is stylizing, but these are not the best results.
00:23:27 This is 75 epochs, and I can see my face here, but the styling is lost pretty quickly.
00:23:35 This is 100 epochs, as you can see.
00:23:38 It has lost almost all of its stylizing capability, and after 100 epochs you see it is simply
00:23:45 printing my face without following our prompt.
00:23:49 It also starts over-training.
00:23:50 Yes, as you can see, it is very much over-trained at this point.
00:23:55 So the sweet spot for 2x classification looks like 50 epochs, as you can see.
00:24:00 By generating a lot of images you may get what you want, but this is still not very good.
00:24:04 Okay, now we are at the 5x classification.
00:24:07 In the first image, none of the images are like me.
00:24:10 This is 25 epochs.
00:24:12 This is 50 epochs, and some resemblance starts.
00:24:16 And this is 75 epochs.
00:24:18 You see, as we increase the number of classification images, it takes more time to learn our face.
00:24:24 However, it is also able to stylize better.
00:24:28 This is the 75 epoch quality.
00:24:30 This is the 100 epoch quality.
00:24:33 This is 125 epochs.
00:24:34 Okay, this is 150 epochs, as you can see.
00:24:38 Only this one is actually in armor.
00:24:41 So it is already over-trained a lot.
00:24:44 So for 5x classification, we can say the sweet spot is 75 epochs.
00:24:50 It is almost fully stylized.
00:24:52 All of the images are stylized, and all of the images are similar to me.
00:24:57 Therefore this is the sweet spot.
00:25:00 Okay, now it is time to see the 10x classification.
00:25:02 In the first image the resemblance is good.
00:25:05 This is only 25 epochs.
00:25:08 This is 50 epochs.
00:25:09 It is stylized, but not following our prompt very closely.
00:25:12 This is 75 epochs.
00:25:15 Still stylized, but not quite what we are targeting.
00:25:21 This is 100 epochs.
00:25:23 This is 125 epochs.
00:25:25 I think it is starting to over-train.
00:25:27 This is 150 epochs; as you can see, the quality has decreased.
00:25:31 There are some problems, some errors.
00:25:34 This is 175 epochs, and this is 200 epochs.
00:25:37 Very bad quality.
00:25:38 Okay, this is 25x classification images.
00:25:42 The first one is not at all like me.
00:25:44 This is 25 epochs.
00:25:46 This is 50 epochs.
00:25:47 It is stylized, but not very similar to me.
00:25:49 This one looks like me,
00:25:51 but not very good.
00:25:52 This is 75 epochs.
00:25:53 I can say this is better than 50 epochs, and this is 100 epochs.
00:25:59 It already looks over-trained.
00:26:02 There are some major problems in the images, and this is 125 epochs.
00:26:07 Already very much over-trained, and so is the rest.
00:26:09 You see, it has memorized it, even the elevator and the backgrounds.
00:26:14 Okay, this is 50x classification images.
00:26:17 In the first image I can see the resemblance.
00:26:19 Some very good styling as well.
00:26:21 This is 50 epochs, and the results are really good, actually, if you ask me; with 50 epochs
00:26:28 and 50 classification images per instance I can probably get whatever I want in a stylized manner.
00:26:34 The distant shot is also decent.
00:26:36 When you want a distant shot, you need to upscale the image, maybe with highres
00:26:43 fix, and you can then inpaint your face.
00:26:46 Then you can obtain very good images with that approach.
00:26:50 So this is, uh, this is 75 epochs, still very much stylized.
00:26:55 I can see very decent quality.
00:26:58 You see, this one looks pretty good.
00:27:00 This is 100 epochs.
00:27:02 It is starting to lose its styling capability.
00:27:05 This is 125 epochs, and I can see it has memorized and is producing bad quality images.
00:27:13 After 125 it starts over-training, so you could alternatively slow training down by
00:27:18 halving the learning rate, and that may help you obtain better results.
00:27:24 And this is very bad.
00:27:25 You see, totally over-trained.
00:27:26 Okay, now it is time to see 100x classification images.
00:27:30 In the first one, there is almost no resemblance.
00:27:33 This is 25 epochs.
00:27:34 This is 50 epochs, and I can see resemblance.
00:27:37 Actually, these results are also pretty decent for 50 epochs, but this one is not like me.
00:27:42 At 75 epochs, these are the generated images we see.
00:27:46 Yes, it is stylizing, but not very well; and this is 100 epochs, and the quality is
00:27:53 very bad, and these are the rest.
00:27:56 It starts over-training after 100 epochs, even with 100 classification images.
00:28:01 Okay, now we are at the 200x classification images.
00:28:05 In the 25 epoch version, there is some resemblance, but not very good.
00:28:10 At 50 epochs, there is some more resemblance, but not very good either.
00:28:15 By the way, even though we are using the same seed, that doesn't mean the results are equal between
00:28:21 different trainings.
00:28:22 The same seed should produce similar results within the same training,
00:28:26 but across different trainings we can't say it will be the same.
00:28:30 The same seed basically means that generation will start from the same noise and then produce the
00:28:35 image through denoising.
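As an aside illustrating that point, a small PyTorch sketch (not from the video): the seed pins down only the initial latent noise, not the final image.

```python
# Small aside: the seed fixes only the starting latent noise, not the final image.
import torch

def initial_latent(seed: int) -> torch.Tensor:
    g = torch.Generator().manual_seed(seed)
    # SD 1.5 latent shape for a 512x512 image: 4 channels at 64x64.
    return torch.randn((1, 4, 64, 64), generator=g)

print(torch.equal(initial_latent(1234), initial_latent(1234)))  # True
# Two differently trained checkpoints denoise this identical latent differently,
# so their final images can still differ a lot.
```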
00:28:36 Okay, this is 75 epochs.
00:28:39 The results are decent
00:28:41 for 75.
00:28:42 Actually, I see stylizing and some decent results.
00:28:46 For 200 class images per instance, this is 100 epochs.
00:28:49 As you can see, uh, the styling ability is starting to fade once again, and this is 125
00:28:55 epochs.
00:28:56 Okay, this is 150 epochs, and it is already over-trained.
00:29:00 So what is the summary of this experiment?
00:29:05 Increasing the number of classification images does not mean you will get better
00:29:10 results.
00:29:11 From this experiment I can say that 50 classification images per instance yielded the best results for me.
00:29:17 That is my sweet spot for 12 training images.
00:29:21 I can't say it will be the same for you, but 50 images looks like a sweet spot.
00:29:26 Also, you should save more checkpoints and compare them as I did, and find your best checkpoint.
00:29:32 This is really important, because the differences between checkpoints are huge.
00:29:38 You should find the checkpoint that works best for you.
00:29:42 Okay, this is all for today.
00:29:43 I hope you have enjoyed it.
00:29:44 Please like, subscribe, and leave a comment.
00:29:47 I will put the prompt I used in the comment section.
00:29:50 Please also support us on Patreon.
00:29:52 This is really important.
00:29:54 The Patreon link will be in the description and also in the comment section.
00:29:58 You see, so far we have 25 patrons.
00:30:00 I am hoping that you will also become our patron.
00:30:03 I would also be grateful for your sharing, liking, making a comment, and becoming our
00:30:09 patron.
00:30:10 Also, you can make a comment and tell me what you want to see next.
00:30:14 Hopefully see you in better, more awesome videos.
