Stable Diffusion Getting Started Tutorial

SD basics

How it works & Getting started

After you type a prompt, the text encoder (CLIP) converts it into feature vectors. Generation then happens in the latent space: in txt2img the latent starts as random noise, while in img2img the VAE encoder first maps the input image into the latent space. Guided by the feature vectors, the latent is denoised step by step, and finally the VAE decoder decodes the denoised latent into a pixel image.
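To make the "latent space" idea concrete, here is a minimal sketch of how a pixel size maps to a latent size. It assumes an SD 1.x style VAE (8x downsampling per side, 4 latent channels); the function name is only illustrative.

```python
def latent_shape(width: int, height: int, channels: int = 4, factor: int = 8):
    """Return (channels, latent_height, latent_width) for a given pixel size.

    Assumption: SD 1.x style VAE, which shrinks each side by a factor of 8
    and uses 4 latent channels.
    """
    if width % factor or height % factor:
        raise ValueError("width and height should be multiples of the VAE factor")
    return (channels, height // factor, width // factor)


print(latent_shape(512, 512))  # (4, 64, 64)
print(latent_shape(768, 512))  # (4, 64, 96)
```

This is why denoising in the latent space is cheap: a 512x512 image is handled as a 64x64 grid until the VAE decoder turns it back into pixels.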

The default VAE model is usually left selected.

The VAE is the model that does the encoding and decoding.

The larger the Clip skip value, the earlier the text encoder stops, so fewer of its layers process the prompt and the result follows the prompt less closely. Conversely, the smaller the value, the more fully the prompt is interpreted.
Clip skip is typically set to 2.
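A small sketch of what the setting means in terms of layers. It assumes the SD 1.x text encoder with 12 transformer layers and the usual convention that a value of 1 means "use the last layer"; the helper name is hypothetical.

```python
def clip_layer_used(clip_skip: int, total_layers: int = 12) -> int:
    """Return which text-encoder layer's output feeds the image model.

    Assumption: 12 layers (SD 1.x text encoder); clip_skip=1 uses the last
    layer, clip_skip=2 stops one layer earlier, and so on.
    """
    if not 1 <= clip_skip <= total_layers:
        raise ValueError("clip_skip must be between 1 and total_layers")
    return total_layers - (clip_skip - 1)


print(clip_layer_used(1))  # 12
print(clip_layer_used(2))  # 11
```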

Other Functions

1. Hires. fix (high-resolution fix)
2. When rendering reaches 80%, switch to another model to finish the image (the Refiner setting).
3. CFG scale: typically 5-10. The lower the value, the more freedom the AI has; the higher the value, the closer the result follows the prompt.
4. Random seed: the same prompt with the same seed (and the same model and settings) reproduces the same picture, even on a different computer.
5. Sampling steps: typically 20-30. More steps usually give a higher-quality image, but too many can distort it, take longer, and sometimes make no visible difference.
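The CFG scale and seed items above can be sketched with plain lists. This is a toy illustration, not the WebUI's code: `apply_cfg` uses the standard classifier-free guidance formula, and `initial_noise` is a stand-in for the seeded starting noise (both names are made up here).

```python
import random


def apply_cfg(uncond, cond, cfg_scale):
    """Classifier-free guidance: push the prediction away from the
    unconditional result and toward the prompt-conditioned one."""
    return [u + cfg_scale * (c - u) for u, c in zip(uncond, cond)]


def initial_noise(seed, n=4):
    """Toy stand-in for the starting latent: same seed, same noise."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(n)]


uncond = [0.0, 1.0]  # prediction without the prompt
cond = [1.0, 3.0]    # prediction with the prompt
print(apply_cfg(uncond, cond, 1.0))  # [1.0, 3.0]
print(apply_cfg(uncond, cond, 7.0))  # [7.0, 15.0]
print(initial_noise(1234) == initial_noise(1234))  # True
```

A higher scale exaggerates the difference the prompt makes, which is why very high values follow the prompt closely but can look overcooked.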

ADetailer: repairs a character's face
Sampling method: DPM++ 2M

Recommended websites and models

Websites: liblib, NovelAI, Hugging Face, promlib, civitai, GitHub
Models: majicmix, dreamshaper, primemix, architecturerealmix

Prompt syntax

Use English (half-width) punctuation.
Weights
The higher the number, the higher the weight, and the more the picture focuses on what is described.

[cat] = (cat:0.91), i.e. the weight is divided by 1.1 (roughly 0.9)
(cat) = (cat:1.1)
{cat} = (cat:1.05) (NovelAI syntax)
[[cat]] = (cat:0.91x0.91) = (cat:0.83)
((cat)) = (cat:1.1x1.1) = (cat:1.21)
Ctrl + Up/Down arrow is a quick way to adjust the weight of the selected words. Lowering a weight is fairly safe, but raising it too far distorts the image.
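The bracket rules above can be checked with a small calculator. This is a sketch under the assumption that each ( ) multiplies the weight by 1.1, each [ ] divides it by 1.1, and an explicit (word:number) sets the number directly; it only handles emphasis brackets around a single term, and the function name is invented for illustration.

```python
def emphasis_weight(token: str) -> float:
    """Return the weight implied by the brackets wrapped around one term."""
    weight = 1.0
    token = token.strip()
    while len(token) >= 2:
        if token[0] == "(" and token[-1] == ")":
            inner = token[1:-1]
            # Explicit form such as (cat:1.3): use the number as written.
            if ":" in inner and "(" not in inner and "[" not in inner:
                return weight * float(inner.rsplit(":", 1)[1])
            weight *= 1.1
        elif token[0] == "[" and token[-1] == "]":
            inner = token[1:-1]
            weight /= 1.1
        else:
            break
        token = inner
    return weight


print(round(emphasis_weight("((cat))"), 3))  # 1.21
print(round(emphasis_weight("[[cat]]"), 3))  # 0.826
print(emphasis_weight("(cat:0.9)"))          # 0.9
```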

Words versus sentences

Should a prompt be written as separate words or as whole sentences?
Separate words and short phrases are more accurate and make weights easier to adjust.
Keep each prompt, positive and negative alike, to no more than 75 tokens.

Opening tags

Positive: tags such as 4k and masterpiece make the image look better.
Negative: the opposite, tags such as text and blur.
Then add modifiers.

Prompt order: the earlier a term appears, the higher its weight

1. Image quality / style words
2. Subject, e.g. one girl
3. Environment / scene / composition
4. LoRA
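The ordering above can be kept consistent with a small helper that always joins the groups in the same sequence. This is only a sketch: the function is hypothetical, `example_style` is a made-up LoRA name, and the `<lora:name:weight>` form is the WebUI's LoRA tag as I understand it.

```python
def build_prompt(quality, subject, scene, loras=()):
    """Join prompt groups in the order: quality/style, subject, scene, LoRA."""
    parts = list(quality) + list(subject) + list(scene)
    # Each LoRA is given as (name, weight) and written as a <lora:...> tag.
    parts += [f"<lora:{name}:{weight}>" for name, weight in loras]
    return ",".join(parts)


prompt = build_prompt(
    quality=["masterpiece", "best quality"],
    subject=["1girl", "police"],
    scene=["in summer", "street"],
    loras=[("example_style", 0.8)],  # hypothetical LoRA name
)
print(prompt)
# masterpiece,best quality,1girl,police,in summer,street,<lora:example_style:0.8>
```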

Prompt pollution
1girl,blue dress,pink hair,green umbrella,
1girl,blue dress,red hair,purple umbrella,

Keep prompt terms, especially colors, from bleeding into each other by separating them with BREAK.

Prompt fusion

1girl,cat    a cat on the girl
1girl AND cat    blends the girl and the cat (1girl_cat has the same effect)
[cat|dog] also has a fusion effect.
[forest:1girl:0.3]    draw forest for the first 30% of the steps, then switch to 1girl.
[forest:1girl:0.7]    draw forest for the first 70% of the steps, then switch to 1girl.
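How the switch point in the last two lines turns into a step number can be sketched as follows. The assumption here is that a value below 1 is read as a fraction of the total steps and truncated, while a value of 1 or more is read as a step number directly; both function names are illustrative, not the WebUI's own.

```python
def switch_step(when: float, steps: int) -> int:
    """Step index at which the prompt switches from the first term to the second."""
    if when < 1:
        step = int(when * steps)  # fraction of the total steps, truncated
    else:
        step = int(when)          # already an absolute step number
    return min(steps, step)


def active_prompt(step: int, steps: int, before: str, after: str, when: float) -> str:
    """Which term is being drawn at a given (0-based) step."""
    return before if step < switch_step(when, steps) else after


for when in (0.3, 0.7):
    print(when, switch_step(when, 20))

print(active_prompt(0, 20, "forest", "1girl", 0.5))   # forest
print(active_prompt(10, 20, "forest", "1girl", 0.5))  # 1girl
```

A small switch value lets the first term shape only the rough composition, so the second term dominates the final picture.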

img2img & Hires. fix

Generate a result from an image plus a prompt

Start from a preset opening and add prompt words

Quickly pick the prompt words you want with the plug-in

masterpiece,best quality,1girl,police,glamor,in summer,street,

Drag the reference image into img2img and add a prompt word
masterpiece,best quality,1girl,police,glamor,in summer,street,coat,
Adding the prompt word coat regenerates the image from the original image

img2img denoising strength (redraw range): neither too high nor too low.
0.3-0.5    the result stays close to the reference image.
0.5-0.7    gives the AI more room for imagination.
Below 0.3 or above 0.7 the image tends to come out distorted.
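The ranges above can be written down as a small lookup, together with a rough idea of why the value matters. The zone labels come from these notes; `img2img_steps` assumes the common implementation choice of running only about steps x strength denoising steps on the noised reference image (an assumption about typical img2img pipelines, not a statement about any specific tool).

```python
def denoise_zone(strength: float) -> str:
    """Classify a denoising strength using the ranges from these notes."""
    if not 0.0 <= strength <= 1.0:
        raise ValueError("strength must be between 0 and 1")
    if strength < 0.3:
        return "too low"
    if strength <= 0.5:
        return "close to reference"
    if strength <= 0.7:
        return "more freedom"
    return "too high"


def img2img_steps(steps: int, strength: float) -> int:
    """Approximate number of denoising steps actually run (assumed rule)."""
    return min(int(steps * strength), steps)


print(denoise_zone(0.4), img2img_steps(20, 0.5))  # close to reference 10
print(denoise_zone(0.6))                          # more freedom
```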

Inpainting: add the prompt word sunglasses

Upscaling anime images

R-ESRGAN 4x+ Anime6B for upscaling anime images (GAN is short for Generative Adversarial Network)
Denoising strength: 0.3-0.5 is the safe zone, 0.5-0.7 gives the AI free rein
Upscale by 2x: 512 to 1024

txt2img interface

Upscaling on the txt2img tab requires locking the seed

Click the small icon to send the image to img2img and upscale again.

No need to lock the seed

Upscale again with an upscaler model.

Upscaled from 512 to 1024

controlnet

Style conversion with softedge

At first, canny was the only preprocessor that could reproduce a reference image through line extraction.
canny (hard edges): outlines the reference image with thin lines, computed from the difference between neighboring pixels. Suited to architecture. It is rigid, and the image can end up with unexplained elements.

softedge: an all-purpose model that outlines the subject

Enable the ControlNet plug-in

pidinet and hed
hed retains more of the picture's detail and keeps it intact; good for architecture and scenery.
pidinet preserves the subject better and ignores details; good for characters.

A realistic-style model can render an anime picture well

Lineart coloring

lineart is mostly used for coloring line art and for processing anime images
Generate the line art with mj (Midjourney)

Clicking the explosion button triggers a download of the files the plug-in needs. If the console shows a git or network failure, download them manually and extract them into the appropriate directory.

It is best to keep the same aspect ratio (width to height) as the original image.
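Matching the aspect ratio by hand is error-prone, so here is a minimal sketch that picks an output size from the reference image's size. It assumes sizes should be multiples of 8 (the latent factor of SD 1.x style models); the function name and the default long side of 1024 are illustrative choices.

```python
def match_aspect(ref_w: int, ref_h: int, long_side: int = 1024, multiple: int = 8):
    """Scale the reference size so its longer side is about long_side,
    rounding both sides to a multiple of 8."""
    scale = long_side / max(ref_w, ref_h)
    width = max(multiple, round(ref_w * scale / multiple) * multiple)
    height = max(multiple, round(ref_h * scale / multiple) * multiple)
    return width, height


print(match_aspect(1920, 1080))                # (1024, 576)
print(match_aspect(512, 768, long_side=768))   # (512, 768)
```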

controlnet-2

openpose: pose control

Pose controlled successfully

dw_openpose_full works best.
Keep the width and height consistent with the reference picture.

depth [spatial relationship]

Scenes, like a church.
softedge + depth
Control line distribution + depth

depth

Preprocess with hedsafe
softedge
Because the subject here is a building, the finger-related terms are removed from the negative prompt

The church's colors can be changed through the prompt

Characters
softedge     line distribution
openpose     skeleton binding
depth        spatial relationships
ipadapter    facial feature transfer / style consistency

tile

The reference image is blurred, then resampled tile by tile, which yields a more detailed image.

You can drag the image into ControlNet's single-image slot and run tile again. You can then see the texture of the leaves.

controlnet Ipadapter

The ipadapter models can be downloaded from Hugging Face and placed in the directory that matches their file suffix.
Face swap
Material transfer
Style transfer

Face swap

Generate an image for reference

Drag the reference image into the ipadapter

Realistic face swap: I uploaded a picture of myself

Material transfer

ipadapter

softedge

depth

MiDaS is good too.

The material matters a lot: if you use the material reference below, use crystal material in the prompt.

Ipadapter-1

Style transfer

Choose a general-purpose checkpoint: dreamshaper

Generate the reference image

Style transfer

Click Generate.

Upscale with Hires. fix after successful generation

Direct style change
You can even change the drawing style of anime characters directly based on the

Combined use of ipadapter and openpose

controlnet unit 1: ipadapter

Generate the skeleton image

controlnet unit 2: openpose

Write the prompt and add a LoRA

Keep the same aspect ratio as the skeleton image.

With the LoRA, the result is closer to the picture of Hermione.

Comprehensive case study

Interior design
Checkpoint: architecturerealmix

unit 1: mlsd, which keeps only straight lines after preprocessing

unit 2

Generate an unfurnished room
Unfinished, nothing, no furniture, rough, house interior

Furnished room
ModernRoomDesign,Interior design,modern simplicity,green,(masterpiece),(high quality),best quality,real,(realistic),super detailed,(full detail),(4k),modern,fashion,grand,vista,(high floor:1.2),

Send to img2img

Change the keywords, then generate again with img2img

If the finished image still needs changes, carry on with img2img
If you want the wallpaper to follow the style of your own picture, check "Upload independent control image"
Use ipadapter, then render
If the result lacks spatial depth, add depth as well

Stable Diffusion basics [liblib]

Positive prompt

Character traits
1girl,solo,
suspender dress,headdress,delicate eyes,beautiful face,shallow smile,snow-white skin,elegant standing,
Scene features
outdoor,blue sky,white clouds,flowers,grass,
Scene setting [weather & light & day/night]
day,night,sunset,rain,strong rim light,
movie light,light tracking,
Scene adjectives
beautiful,happy,
Image quality standards [quality & style]
8k,highest quality,high resolution,
comic,watercolor,realistic,abstract

1girl,solo,
suspender dress,headdress,delicate eyes,beautiful face,shallow smile,snow-white skin,elegant standing,
outdoor,blue sky,white clouds,flowers,grass,
movie light,light tracking,
beautiful,happy,
8k,highest quality,high resolution,
realistic,extreme detail

Negative prompt

low quality,blurry,bad proportions,cropped,watermark,
ugly,bad body,missing fingers,extra feet,
NSFW,text,logo

parameters

Sampler: Euler a. Choose the sampler the model recommends; the results are actually much the same.
15-25 steps; more than that will not help.
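For reference, the prompts and parameters from this section can be collected into one request body. This is a sketch that assumes the AUTOMATIC1111 WebUI running locally with its API enabled; the field names are the ones I understand its txt2img endpoint to accept, and the address in the comment is an assumed local default. Nothing is sent here, the code only builds the JSON.

```python
import json


def build_txt2img_payload(positive, negative, steps=20, cfg_scale=7.0, seed=-1):
    """Collect the settings from these notes into one request body."""
    return {
        "prompt": positive,
        "negative_prompt": negative,
        "sampler_name": "Euler a",
        "steps": steps,          # the notes suggest 15-25
        "cfg_scale": cfg_scale,  # the notes suggest 5-10
        "seed": seed,            # -1 asks for a random seed
        "width": 512,
        "height": 512,
    }


payload = build_txt2img_payload(
    positive="1girl,solo,outdoor,blue sky,8k,highest quality,realistic",
    negative="low quality,blurry,bad proportions,cropped,watermark",
)
body = json.dumps(payload)
# Assumed usage: POST this body to http://127.0.0.1:7860/sdapi/v1/txt2img
print(body)
```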

Trying ComfyUI

SD with a LoRA loaded + Chinese-language workflow
Images carry metadata
A workflow can be recovered by dragging the image into ComfyUI
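The metadata mentioned above lives in text chunks inside the PNG file. Below is a minimal standard-library sketch that lists them. It assumes the data is stored in plain tEXt chunks and that the keys are names like workflow or prompt (ComfyUI) or parameters (WebUI), which is my understanding rather than something these notes confirm. The demo builds a tiny fake file in memory instead of reading a real image.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def read_png_text_chunks(data: bytes) -> dict:
    """Return {keyword: text} for every tEXt chunk in a PNG byte string."""
    if not data.startswith(PNG_SIGNATURE):
        raise ValueError("not a PNG file")
    texts = {}
    pos = len(PNG_SIGNATURE)
    while pos + 8 <= len(data):
        # Each chunk: 4-byte length, 4-byte type, body, 4-byte CRC.
        length, ctype = struct.unpack(">I4s", data[pos:pos + 8])
        body = data[pos + 8:pos + 8 + length]
        if ctype == b"tEXt":
            key, _, value = body.partition(b"\x00")
            texts[key.decode("latin-1")] = value.decode("latin-1")
        pos += 8 + length + 4
        if ctype == b"IEND":
            break
    return texts


def make_chunk(ctype: bytes, body: bytes) -> bytes:
    """Build one PNG chunk (used only for the in-memory demo below)."""
    crc = zlib.crc32(ctype + body) & 0xFFFFFFFF
    return struct.pack(">I", len(body)) + ctype + body + struct.pack(">I", crc)


# Demo: a fake file holding only a text chunk and the end marker.
demo = PNG_SIGNATURE + make_chunk(b"tEXt", b"workflow\x00{}") + make_chunk(b"IEND", b"")
print(read_png_text_chunks(demo))  # {'workflow': '{}'}
```

For a real image, read the file with open(path, "rb").read() and pass the bytes to read_png_text_chunks.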

Most ComfyUI problems can be solved by installing offline; a Google search can fix the rest. Whenever an error is reported,
go through the logs. If the workflow shows red nodes, install whatever is missing: models if models are missing, plug-ins if plug-ins are missing.