4-4 Advanced Config
Fine-tune your AI art in SeaArt! Master negative prompts, VAEs, sampling, CFG scale, seed, and clip skip for precise control.
Creation Process: Select Model - Enter Prompts - Adjust Parameters - Generate
Generally, it's challenging for the model to understand negations in prompts, such as "no," "not," "except," or "without." Instead, list unwanted effects in the Negative Prompts. Besides elements you don't want in the image, you can also include quality terms such as "low quality," "low detail," "ugly," or "deformed," which helps improve the final image. By default, SeaArt automatically adds a set of negative prompts when generating images.
(worst quality, low quality, normal quality, lowres, low details, oversaturated, undersaturated, overexposed, underexposed, grayscale, bw, bad photo, bad photography, bad art:1.4), (watermark, signature, text font, username, error, logo, words, letters, digits, autograph, trademark, name:1.2), (blur, blurry, grainy), morbid, ugly, asymmetrical, mutated, malformed, mutilated, poorly lit, bad shadow, draft, cropped, out of frame, cut off, censored, jpeg artifacts, out of focus, glitch, duplicate, (bad hands, bad anatomy, bad body, bad face, bad teeth, bad arms, bad legs, deformities:1.3)
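The parenthesized `(text:1.4)` notation above is the attention-weight syntax common to Stable Diffusion front ends: the number multiplies how strongly the bracketed words influence generation. As an illustration only (not SeaArt's actual parser), a minimal sketch of how such a prompt can be split into (text, weight) pairs:

```python
import re

def parse_weighted_tokens(prompt: str):
    """Split the common (text:weight) prompt syntax into (text, weight) pairs.
    Plain text outside parentheses gets the default weight 1.0."""
    pairs = []
    # Match "(text:1.4)" groups, or runs of plain text between them.
    pattern = re.compile(r"\(([^():]+):([\d.]+)\)|([^()]+)")
    for m in pattern.finditer(prompt):
        if m.group(1) is not None:
            pairs.append((m.group(1).strip(), float(m.group(2))))
        else:
            text = m.group(3).strip().strip(",").strip()
            if text:
                pairs.append((text, 1.0))
    return pairs
```

Running it on a fragment such as `"(worst quality:1.4), ugly"` yields `[("worst quality", 1.4), ("ugly", 1.0)]`, which is why weighted terms like `deformities:1.3` suppress those artifacts more aggressively than unweighted ones.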
VAE can be regarded as a kind of "filter" that improves the quality of image generation and enhances visual effects through optimization algorithms. It can also make slight adjustments to the shapes of images. If you notice color issues in the images, you can try switching to a different VAE.
Commonly used VAEs:
Automatic: Automatically selects the most suitable VAE configuration for the current task.
None: Does not use any VAE.
vae-ft-mse-840000-ema-pruned: Realistic color style; 840000 is the number of training steps. It improves the quality of generated images while reducing complexity and increasing efficiency.
vae-ft-ema-560000-ema-pruned: Realistic color style, trained for 560000 iterations, suitable for faster or lower resource-consuming image generation.
kl-f8-anime2: Optimized for generating images in anime style.
*Some checkpoints come with a built-in VAE, so there is no need to select a VAE separately.
A standard AI painting process typically involves forward addition of noise and backward denoising, restoration, and target generation. During the forward process, noise is continually added to the input data, while the sampler is responsible for denoising during the backward process.
Forward process (from right to left): Gradually adding noise to the original image, mainly during the training process to train the U-Net network's ability to predict noise.
Backward process (from left to right): Gradually denoising the estimated noise by the trained U-Net network, ultimately reproducing the image.
In these two processes, the AI effectively scrambles a known image into noise while learning how to undo each step, then runs that learned process in reverse to create a new image. In other words, once the forward process has been trained, the backward process can generate a completely new image from pure noise.
Before generating a clear image, the model needs to generate a random image in the latent space. The noise predictor starts working by subtracting the predicted noise from the image. With repeated steps, we eventually obtain a clear image. The entire denoising process can be referred to as "sampling," and the method used in sampling is called a sampler or sampling method.
The sampling method determines how the denoising is performed, and different sampling methods yield different image results.
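The two processes can be sketched in a few lines of toy Python. This is a deliberately simplified stand-in (plain lists instead of latent tensors, and a `predict_noise` callable you supply instead of a trained U-Net), not SeaArt's actual implementation:

```python
import random

def forward_noise(x, steps, beta=0.1, rng=None):
    """Forward process: repeatedly mix Gaussian noise into a 1-D toy 'image'.
    Each step keeps sqrt(1 - beta) of the signal and adds sqrt(beta) noise."""
    rng = rng or random.Random(0)
    for _ in range(steps):
        x = [(1 - beta) ** 0.5 * v + beta ** 0.5 * rng.gauss(0.0, 1.0) for v in x]
    return x

def denoise(x, steps, predict_noise, rate=0.1):
    """Backward process: a noise predictor estimates the noise, and we subtract
    a fraction of it each step -- the skeleton that every sampler refines."""
    for _ in range(steps):
        eps = predict_noise(x)
        x = [v - rate * e for v, e in zip(x, eps)]
    return x
```

Different samplers correspond to different ways of scheduling and applying that subtraction in `denoise`, which is why they yield different images from the same starting noise.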
*The numerical value of the seed determines the initial noise of the first generated image.
Old-School ODE Solvers
Euler: Euler method, the simplest solver.
Heun: More accurate but slower version of the Euler method.
LMS: Linear Multistep Method, same speed as Euler but more accurate.
Convergence: As the number of sampling steps increases, the sampled results eventually tend toward a fixed image, and the image gradually stabilizes.
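The speed/accuracy trade-off between Euler and Heun is easy to see on a toy ODE. The sketch below shows the generic numerical methods these samplers are named after (the real samplers apply more elaborate variants to the denoising trajectory):

```python
def euler_step(f, x, t, dt):
    """Euler: the simplest step -- one derivative evaluation."""
    return x + dt * f(x, t)

def heun_step(f, x, t, dt):
    """Heun: predict with Euler, then average the slopes at both ends.
    Two evaluations per step, so roughly twice as slow, but more accurate."""
    pred = x + dt * f(x, t)
    return x + dt * 0.5 * (f(x, t) + f(pred, t + dt))
```

Integrating dx/dt = -x from x(0) = 1 to t = 1 with ten steps, Heun lands much closer to the exact value e^-1 than Euler does, at the cost of twice the function evaluations per step.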
Ancestral Samplers (names include an "a")
Euler a
DPM2 a
DPM++ 2S a
DPM++ 2S a Karras
These samplers add noise at each sampling step, thus exhibiting a degree of randomness and not converging.
Non-convergence: The images are random and may add some details. To obtain stable and reproducible results, one should avoid using ancestral samplers.
*Some samplers, even without an "a" in their name, are also random samplers.
DDIM, PLMS (no longer widely used)
DDIM: Denoising Diffusion Implicit Models, the first sampler designed for diffusion models.
PLMS: Pseudo Linear Multistep Method, a faster alternative to DDIM.
DPM and DPM++ Series
These samplers make efficient use of the prompt and benefit from moderately higher sampling step counts, though they are slower overall. DPM++ is an improvement over DPM, yielding more accurate results at a slower pace.
Karras: Produces clear images with fewer sampling steps, optimizing the algorithm.
Restart: Uses fewer sampling steps to generate good images in less time.
LCM: Generates images quickly.
*Recommended to use:
Euler/Euler a: Fast speed, high quality, suitable for most scenarios, recommended steps are 15-30.
DPM++2M Karras: Convergent, fast speed, good quality (15-25 steps).
DPM++SDE Karras: Not convergent, slow speed, good quality, suitable for realistic images, recommended 10-15 steps.
DPM++2M SDE Karras: Intermediate algorithm between 2M and SDE, not convergent, slightly faster speed.
DPM++ 2M SDE Heun Exponential: Not convergent, soft and clean image, with fewer details.
DPM++ 3M SDE Karras
DPM++ 3M SDE Exponential: Same speed as 2M, requires more sampling steps, when sampling steps > 30, lower the text intensity (CFG) for better results.
Restart: Very fast speed, only suitable for quickly producing drafts or concept verification, ideal results can be achieved with very few steps.
LCM: "Real-time rendering" can be achieved in only 4 steps, although the image quality is average, it is suitable for generating inspirational sketches or preliminary concept designs.
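The recommendations above boil down to a small lookup table. A sketch (the step ranges are the illustrative values from this guide, not an official SeaArt API):

```python
# Recommended step ranges from the notes above (illustrative values).
SAMPLER_STEPS = {
    "Euler": (15, 30),
    "Euler a": (15, 30),
    "DPM++2M Karras": (15, 25),
    "DPM++SDE Karras": (10, 15),
    "LCM": (4, 8),          # near-real-time; quality is only average
}

def suggested_steps(sampler: str) -> int:
    """Midpoint of the recommended range; 25 is a reasonable default
    for samplers not listed here."""
    lo, hi = SAMPLER_STEPS.get(sampler, (20, 30))
    return (lo + hi) // 2
```

For example, `suggested_steps("DPM++2M Karras")` returns 20, comfortably inside its 15-25 range.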
*Note
I. Prioritize the algorithms recommended by the model author to ensure the best compatibility and effectiveness.
II. Prefer the "++" algorithms, as these optimized versions tend to be more stable than their plain counterparts.
III. When encountering noise issues in the generated images, consider trying a different sampler.
Generally, the higher the number of sampling steps, the better the image quality. However, around 25 steps is usually enough to achieve high quality. Increasing the count beyond that point may produce different images, but it doesn't guarantee better ones, and each extra step adds time. In most cases there's no need for excessively high step counts; they only increase the wait.
As the number of sampling steps increases, the main form of the "girl" remains relatively consistent, while certain small details such as hair quality, color, background, etc., improve with the increase in steps. Therefore, the number of sampling steps should be adjusted according to one's own needs.
The relevance to the prompts: The higher the text strength, the closer the image is to the prompts. It's generally set around 7-10. If it's set too high, it may cause image breakdown. If the generated image does not follow the prompts, you can increase the text strength appropriately.
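Under the hood, this "text strength" corresponds to the standard classifier-free guidance formula: the model makes one noise prediction without the prompt and one with it, then pushes the result toward the prompted direction by the CFG scale. A toy sketch with plain lists standing in for noise tensors:

```python
def apply_cfg(eps_uncond, eps_cond, cfg_scale):
    """Classifier-free guidance: start from the unconditional noise prediction
    and push toward the prompt-conditioned one by cfg_scale.
    Higher values follow the prompt more closely; too high breaks the image."""
    return [u + cfg_scale * (c - u) for u, c in zip(eps_uncond, eps_cond)]
```

At `cfg_scale=1` the output is just the conditioned prediction; at the typical 7-10 the difference between the two predictions is amplified, which is why raising the value makes the image track the prompt more tightly.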
Prompts: full length shot, super hero pose, biomechanical suit, inflatable shapes, wearing epic bionic cyborg implants, masterpiece, intricate, biopunk futuristic wardrobe, highly detailed, artstation, concept art, cyberpunk, octane render
The drawing process in AI art involves significant randomness: each generation uses a set of random computations tied to a seed value. By fixing the seed, we can control the randomness of the results.
For example, if we are satisfied with a particular image generated, we can fill in its seed value here to reproduce the same content. Clicking 'Random' resets the seed to the default -1, while 'Customization' allows you to freely enter the seed value.
Using the same parameters, prompts, and seed will produce identical images. Therefore, we can utilize the same seed to modify certain parameters, resulting in the generation of new images with the original features.
*Only modifying the emotional words to change the facial expression while keeping other features such as hair, clothes, and background unchanged.
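The seed's role is easy to demonstrate with a toy generator: the same seed always produces the same starting noise, and -1 stands for "pick one at random," matching the behavior described above. (A sketch with Python's stdlib `random`; real pipelines seed a tensor RNG the same way.)

```python
import random

def generate_latent(seed: int, size: int = 4):
    """The seed fixes the initial noise: same seed -> identical starting latent."""
    if seed == -1:                      # -1 means "choose a random seed", as in the UI
        seed = random.randrange(2**32)
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(size)]
```

`generate_latent(42)` returns the same list every time, while `generate_latent(-1)` differs run to run; this is exactly why reusing a seed with modified prompts keeps the original composition.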
The text encoder processes the prompt layer by layer: the prompt is converted into numbers, and each successive layer builds a progressively more detailed understanding of it.
If the prompt is: "A young girl, wearing a black dress, with a black hat, holding a wand, a witch," when Clip Skip is set to 2, the AI may omit the concepts of the black dress or the wand. As the Clip Skip value increases, the AI will omit more of the prompt.
Therefore, when Clip Skip is set to 1, the output is taken from the text encoder's last layer, so the result includes the most complete description of the prompt. The earlier the encoding is cut off, the less of the prompt's description is captured, and the less accurately the final image follows it. It's generally set to 2.
What is Clip Skip for?
Clip Skip helps to address overfitting situations by terminating the reading of the prompts in a timely manner. When the image is overfitted, Clip Skip can be increased.
By setting Clip Skip, you can adjust the details and style of AI Art, making the final result more flexible and controllable, thus meeting different image generation requirements.
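Mechanically, Clip Skip just selects which of the text encoder's layer outputs the image model reads. A toy stand-in (in a real pipeline, `hidden_states` would be the CLIP text encoder's per-layer embeddings, not strings):

```python
def select_clip_layer(hidden_states, clip_skip: int = 1):
    """Clip Skip = 1 takes the text encoder's last layer; 2 takes the
    second-to-last, and so on. Earlier layers carry a coarser reading
    of the prompt."""
    if not 1 <= clip_skip <= len(hidden_states):
        raise ValueError("clip_skip out of range")
    return hidden_states[-clip_skip]
```

With three layers, `clip_skip=1` returns the last layer's output and `clip_skip=2` the second-to-last, which is the "earlier termination" described above.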
Prompts: best quality,masterpiece,illustration,beautiful detailed glow,textile shading,absurdres,highres,dynamic lighting,intricate detailed,beautiful eyes,[backlighting],face lighting,(pov:1.3), (1 girl, solo:1.5),asymmetric bang,black hair,(smile),(jeans pants and shirts)