Core Nodes
Learn about ComfyUI's core nodes for image manipulation, conditioning, and more. Build powerful AI art workflows.
Pad Image for Outpainting
Fill and extend the image for outpainting (image expansion). The node first enlarges the canvas, then marks the added area as a mask to be generated. It is recommended to use VAE Encode (for Inpainting) with it so that the original image remains unchanged.
Parameters:
left, top, right, bottom: Padding amounts for the left, top, right, and bottom edges
feathering: Edge Feathering Degree
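As a rough illustration, the node's behavior can be sketched with Pillow; `pad_for_outpaint` below is a hypothetical helper, not ComfyUI's actual implementation, and it approximates feathering by softening the mask seam with a blur:

```python
from PIL import Image, ImageFilter

# Hypothetical helper approximating Pad Image for Outpainting with Pillow.
def pad_for_outpaint(img, left, top, right, bottom, feathering=40):
    w, h = img.size
    canvas = Image.new("RGB", (w + left + right, h + top + bottom))
    canvas.paste(img, (left, top))
    # White (255) marks the padded area the sampler should fill in.
    mask = Image.new("L", canvas.size, 255)
    mask.paste(Image.new("L", (w, h), 0), (left, top))
    if feathering > 0:
        # Approximate edge feathering by blurring the mask seam.
        mask = mask.filter(ImageFilter.GaussianBlur(feathering))
    return canvas, mask
```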
Save Image
Load Image
Image Blur
Add a blur effect to the image
Parameters:
sigma: The smaller the value, the more concentrated the blur is around the center pixel.
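For intuition, a comparable Gaussian blur in Pillow (not the node's own kernel code; the file name is a placeholder) looks like this:

```python
from PIL import Image, ImageFilter

img = Image.open("input.png")
# A larger radius spreads the blur further; smaller values keep it
# concentrated around each pixel, mirroring the sigma parameter.
blurred = img.filter(ImageFilter.GaussianBlur(radius=4))
blurred.save("blurred.png")
```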
Image Blend
You can blend two images together using transparency.
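The operation amounts to an alpha blend; a minimal Pillow sketch (file names are placeholders):

```python
from PIL import Image

a = Image.open("a.png").convert("RGB")
b = Image.open("b.png").convert("RGB").resize(a.size)
# blend_factor 0.0 keeps only the first image, 1.0 only the second.
out = Image.blend(a, b, alpha=0.35)
out.save("blended.png")
```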
Image Quantize
Reduce the number of colors in the image
Parameters:
colors: The number of colors to quantize the image down to. When set to 1, the image will contain only one color.
dither: Whether to use dithering to make the quantized image appear smoother
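Pillow's quantize exposes the same two knobs, which makes the effect easy to preview outside ComfyUI (the file name is a placeholder):

```python
from PIL import Image

img = Image.open("input.png").convert("RGB")
flat = img.quantize(colors=8, dither=Image.Dither.NONE)
# Floyd-Steinberg dithering trades color banding for fine noise,
# so gradients in the quantized image look smoother.
smooth = img.quantize(colors=8, dither=Image.Dither.FLOYDSTEINBERG)
```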
Image Sharpen
Parameters:
sigma: The smaller the value, the more concentrated the sharpening is around the center pixel.
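The effect is comparable to an unsharp mask; a Pillow sketch with illustrative parameter values:

```python
from PIL import Image, ImageFilter

img = Image.open("input.png")
# radius plays the role of sigma: small values sharpen only right around
# each pixel; percent controls the overall strength.
sharp = img.filter(ImageFilter.UnsharpMask(radius=2, percent=150, threshold=3))
```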
Invert Image
Invert the colors of the image
Upscaling
Upscale Image (Using Model)
Upscale Image
The Upscale Image node can be used to resize pixel images.
Parameters:
upscale_method: The resampling method used to fill in new pixels.
width: The adjusted width of the image
height: The adjusted height of the image
crop: Whether to crop the image
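The upscale methods correspond roughly to standard resampling filters, and crop="center" behaves like a cover-resize followed by a center crop; a Pillow approximation (file name and sizes are illustrative):

```python
from PIL import Image, ImageOps

img = Image.open("input.png")
# upscale_method maps roughly to a Pillow resampling filter
# (NEAREST, BILINEAR, BICUBIC, LANCZOS, ...).
out = img.resize((1024, 1024), Image.Resampling.LANCZOS)
# crop="center": scale to cover the target size, then center-crop.
fitted = ImageOps.fit(img, (1024, 768), method=Image.Resampling.LANCZOS)
```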
Preview Image
Load CLIP Vision
Loads a CLIP vision model, which encodes images into embeddings that can be turned into conditioning for the sampler, so that new images resembling the input can be generated. Multiple nodes can be used together. Well suited to transferring concepts and abstract subjects; used in combination with CLIP Vision Encode.
Load CLIP
The Load CLIP node can be used to load a specific CLIP model. CLIP models are used to encode text prompts that guide the diffusion process.
*Conditional diffusion models are trained with a specific CLIP model; using a different model than the one it was trained with is unlikely to result in good images. The Load Checkpoint node automatically loads the correct CLIP model.
unCLIP Checkpoint Loader
The unCLIP Checkpoint Loader node can be used to load a diffusion model specifically made to work with unCLIP. unCLIP diffusion models are used to denoise latents conditioned not only on the provided text prompt but also on provided images. This node also provides the appropriate VAE, CLIP, and CLIP vision models.
*Even though this node can be used to load all diffusion models, not all diffusion models are compatible with unCLIP.
Load ControlNet Model
The Load ControlNet Model node can be used to load a ControlNet model; it is used in conjunction with Apply ControlNet.
Load LoRA
Load VAE
Load Upscale Model
Load Checkpoint
Load Style Model
The Load Style Model node can be used to load a Style model. Style models can be used to provide a diffusion model a visual hint as to what kind of style the denoised latent should be in.
*Only T2I-Adapter style models are currently supported
Hypernetwork Loader
The Hypernetwork Loader node can be used to load a hypernetwork. Similar to LoRAs, hypernetworks are used to modify the diffusion model, altering the way in which latents are denoised. Typical use cases include adding to the model the ability to generate in certain styles, or to better generate certain subjects or actions. Multiple hypernetworks can even be chained together to further modify the model.
Apply ControlNet
Applies a loaded ControlNet model to the conditioning; multiple Apply ControlNet nodes can be chained together.
Parameters:
strength: The higher the value, the stronger the constraint on the image.
*The image given to a ControlNet must be the matching preprocessed image; for example, a Canny ControlNet expects a Canny edge map. Add the corresponding preprocessor node between the original image and the ControlNet to produce it.
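For example, the edge map a Canny ControlNet expects can be produced with OpenCV (thresholds are illustrative); inside ComfyUI the equivalent preprocessor node does this step in the graph:

```python
import cv2

img = cv2.imread("input.png")                        # hypothetical file name
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
edges = cv2.Canny(gray, threshold1=100, threshold2=200)  # single-channel edge map
cv2.imwrite("canny_map.png", edges)  # the image Apply ControlNet should receive
```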
CLIP Text Encode (Prompt)
Input text prompts, both positive and negative, which are encoded into conditioning for the sampler.
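Conceptually the node works like any CLIP text encoder. A sketch using Hugging Face transformers (ComfyUI ships its own CLIP implementation; the model name here is simply the standard SD 1.x text encoder):

```python
from transformers import CLIPTokenizer, CLIPTextModel

tokenizer = CLIPTokenizer.from_pretrained("openai/clip-vit-large-patch14")
text_model = CLIPTextModel.from_pretrained("openai/clip-vit-large-patch14")

tokens = tokenizer(["a watercolor fox, high detail"], padding=True, return_tensors="pt")
cond = text_model(**tokens).last_hidden_state  # shape [1, seq_len, 768]
# Tensors like this are what the sampler receives as positive/negative conditioning.
```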
CLIP Vision Encode
Encodes an image with a CLIP vision model into embeddings that can be turned into conditioning for the sampler, so that new, similar images can be generated. Multiple nodes can be used together. Well suited to transferring concepts and abstract subjects; used in conjunction with Load CLIP Vision.
CLIP Set Last Layer
Equivalent to CLIP Skip; it is generally set to -2.
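In terms of the transformers sketch under CLIP Text Encode (Prompt) above, CLIP Skip means taking an earlier hidden state instead of the final layer's output:

```python
# Reuses text_model and tokens from the CLIP Text Encode (Prompt) sketch.
outputs = text_model(**tokens, output_hidden_states=True)
cond = outputs.hidden_states[-2]  # -2 skips the last transformer layer
# Real pipelines typically re-apply the final layer norm after skipping.
```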
GLIGEN Textbox Apply
Guide the prompts to generate in the specified portion of the image.
*The origin of the coordinate system in ComfyUI is located at the top left corner.
unCLIP Conditioning
The images encoded through the CLIP vision model provide additional visual guidance for the unCLIP model. This node can be chained to provide multiple images as guidance.
Conditioning Average
Blend two conditionings according to their strengths. When conditioning_to_strength is set to 1, diffusion is influenced only by conditioning_to; when it is set to 0, diffusion is influenced only by conditioning_from.
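The blend itself is a linear interpolation of the conditioning tensors; a minimal PyTorch sketch (shapes are illustrative, and the real node also averages pooled outputs):

```python
import torch

def conditioning_average(cond_to, cond_from, conditioning_to_strength):
    # 1.0 -> only cond_to influences diffusion, 0.0 -> only cond_from.
    return cond_to * conditioning_to_strength + cond_from * (1.0 - conditioning_to_strength)

a, b = torch.randn(1, 77, 768), torch.randn(1, 77, 768)
mix = conditioning_average(a, b, 0.5)
```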
Apply Style Model
Can be used to provide additional visual guidance for the diffusion model, especially regarding the style of the generated images.
Conditioning (Combine)
Combine two conditionings so that both influence diffusion.
Conditioning (Set Area)
Conditioning (Set Area) can be used to confine the conditioning's effect to a specified area of the image. Used together with Conditioning (Combine), it allows for better control over the composition of the final image.
Parameters:
width: The width of the control region
height: The height of the control region
x: The x-coordinate of the origin of the control region
y: The y-coordinate of the origin of the control region
strength: The strength of the conditional information
*The origin of the coordinate system in ComfyUI is located at the top left corner.
For example: set the left region to "cat" and the right region to "dog".
Conditioning (Set Mask)
Conditioning (Set Mask) can be used to confine an adjustment within a specified mask. Used together with the Conditioning (Combine) node, it allows for better control over the composition of the final image.
VAE Encode (for Inpainting)
Used for inpainting (partial repainting): right-click the loaded image and choose Open in MaskEditor to paint the area to repaint.
Set Latent Noise Mask
The second method of partial repainting: first encode the image through a VAE encoder into latent space, then regenerate the masked part in latent space.
Compared to the VAE Encode (for Inpainting) method, this approach references the image under the mask and so better understands the content to be regenerated, resulting in a lower probability of generating incorrect images.
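A conceptual PyTorch sketch of the idea, ignoring the sampler's scheduling details: the unmasked region is kept from the source latent while the masked region is free to change:

```python
import torch

latent = torch.randn(1, 4, 64, 64)   # encoded source image (illustrative)
mask = torch.zeros(1, 1, 64, 64)
mask[..., :, 32:] = 1.0              # repaint the right half
noise = torch.randn_like(latent)
# Noise (and hence regeneration) is confined to the masked area.
noised = latent * (1 - mask) + (latent + noise) * mask
```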
Rotate Latent
Rotate the latent image clockwise.
Flip Latent
Flip the latent image horizontally or vertically.
Crop Latent
Used to crop the latent image into a new shape.
VAE Encode
VAE Decode
Latent From Batch
Extract latent images from a batch. The Latent From Batch node can be used to select a latent image or a range of latent images from a batch, which is very useful in workflows that need to isolate specific images.
Parameters:
batch_index: The index of the first latent image to be selected.
length: The number of latent images to retrieve.
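In tensor terms this is a slice along the batch dimension:

```python
import torch

batch = torch.randn(8, 4, 64, 64)                  # a batch of 8 latent images
batch_index, length = 2, 3
subset = batch[batch_index:batch_index + length]   # shape [3, 4, 64, 64]
```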
Repeat Latent Batch
Repeat a batch of latent images; useful for creating multiple variations of an image in an img2img workflow.
Parameters:
amount: The number of repetitions.
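Likewise, repetition is a tiling of the batch dimension:

```python
import torch

latent = torch.randn(1, 4, 64, 64)
repeated = latent.repeat(4, 1, 1, 1)  # amount=4 -> shape [4, 4, 64, 64]
```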
Rebatch Latents
Can be used to split or merge batches of latent space images.
Upscale Latent
Adjust the resolution of latent space images by resampling.
Parameters:
upscale_method: The resampling method.
width: The width of the adjusted latent space image.
height: The height of the adjusted latent space image.
crop: Indicates whether the image is to be cropped.
*An image upscaled in latent space may suffer from degradation when decoded through the VAE. A second KSampler pass can be used to repair the image.
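A PyTorch sketch of the resampling step (one latent pixel covers 8 image pixels, so a 64x64 latent corresponds to roughly a 512x512 image):

```python
import torch
import torch.nn.functional as F

latent = torch.randn(1, 4, 64, 64)                          # ~512x512 image
up = F.interpolate(latent, size=(96, 96), mode="bilinear")  # ~768x768 after decoding
# Decoding this directly may look soft; a KSampler pass can repair it.
```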
Latent Composite
Overlay one latent image onto another.
Parameters:
x: The x-coordinate of the overlay position of the upper layer.
y: The y-coordinate of the overlay position of the upper layer.
feather: Indicates the degree of feathering at the edges.
*The images need to be encoded (VAE Encode) into latent space first.
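A simplified PyTorch sketch of the overlay with a linear feather ramp. `latent_composite` is a hypothetical helper that assumes the overlay fits inside the base; note the node's x/y are given in pixels and map to latent coordinates at 1/8 scale:

```python
import torch

def latent_composite(base, overlay, x, y, feather=0):
    out = base.clone()
    _, _, h, w = overlay.shape
    region = out[:, :, y:y + h, x:x + w]
    if feather == 0:
        region.copy_(overlay)
        return out
    alpha = torch.ones(1, 1, h, w)
    for d in range(feather):  # linear ramp toward each edge of the overlay
        v = (d + 1) / feather
        alpha[:, :, d, :] *= v
        alpha[:, :, h - 1 - d, :] *= v
        alpha[:, :, :, d] *= v
        alpha[:, :, :, w - 1 - d] *= v
    region.copy_(overlay * alpha + region * (1 - alpha))
    return out
```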
Latent Composite Masked
Overlay an image with a mask onto another, only overlaying the masked part.
input:
destination: The underlying latent space image.
source: The overlaying latent space image.
Parameters:
x: The x-coordinate of the overlay region.
y: The y-coordinate of the overlay region.
resize_source: Whether to resize the source to match the destination.
Empty Latent Image
The Empty Latent Image node can be used to create a set of new, empty latent images. These can then be used in workflows such as text2img by adding noise to them and denoising them with sampling nodes.
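An "empty" latent is simply a zero tensor; for Stable Diffusion models it has 4 channels at 1/8 of the pixel resolution:

```python
import torch

width, height, batch_size = 512, 512, 1
latent = torch.zeros(batch_size, 4, height // 8, width // 8)  # shape [1, 4, 64, 64]
```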
Load Image As Mask
Invert Mask
Solid Mask
Generates a solid mask with a uniform value; it acts as a canvas and can be combined with Mask Composite.
Convert Mask To Image
Convert the mask to a grayscale image.
Convert Image To Mask
Convert a channel of the image into a mask.
Feather Mask
Apply feathering to the mask.
Crop Mask
Crop the mask to a new shape.
Mask Composite
Paste one mask into another; typically connected to Solid Mask nodes. A value of 0 represents black and will not be drawn, while a value of 1 represents white and will be drawn. The two connected Solid Masks must have different values, otherwise the composite will have no visible effect.
input:
destination (value 1): The mask to be pasted into, matching the final image dimensions.
source (value 0): The mask to be pasted.
Parameters:
x, y: Adjust the position of the source.
operation: Use multiply when the source value is 0; use add when it is 1.
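A NumPy sketch of the two operations on mask values (sizes and positions are illustrative):

```python
import numpy as np

destination = np.ones((64, 64), dtype=np.float32)  # Solid Mask with value 1 (white)
source = np.zeros((32, 32), dtype=np.float32)      # Solid Mask with value 0 (black)
x, y = 16, 16
region = destination[y:y + 32, x:x + 32]
out_mul = region * source                  # multiply: 0-valued source cuts a hole
out_add = np.clip(region + source, 0, 1)   # add: 1-valued source draws onto it
```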
KSampler
input:
latent_image: The latent image to be denoised.
output:
LATENT: The latent image after denoising.
KSampler Advanced
Gives manual control over the noise, such as whether noise is added and over which steps sampling runs.
Load Checkpoint With Config
Load the diffusion model based on the provided configuration file.
AIO Aux Preprocessor
Select different preprocessors to generate corresponding images.