Image Conversion
Master ComfyUI's image conversion! This guide enhances Image-to-Image with precise control, facial referencing, and high-definition upscaling.
Thinking Process:
ComfyUI's image conversion is similar to WebUI's Img2Img: an original image is uploaded and its style is modified through the model. However, to improve the precision of the conversion, we can add a few new steps:
I. Add an image upscale node to control the size of the original image.
II. Increase similarity to the original image:
a. Use IPAdapter FaceID to reference facial features.
b. Reverse-engineer prompts from the original image.
c. Add ControlNet (OpenPose, Canny, Depth).
III. Upscale the final image.
Step 1: Build the Model Group
We can start building from the Img2Img template. First, optionally add an "Upscale Image By" node after the original image to control its size. A LoRA can be added according to individual needs, or left out entirely. Likewise, a "CLIP Set Last Layer" node is optional; it lets you skip the last CLIP layers (clip skip). Finally, connect the corresponding nodes (a sketch of this group follows the node list below).
Add nodes:
Upscale Image By
CLIP Set Last Layer
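If you drive ComfyUI through its HTTP API instead of the graph editor, the same model group can be written in the API (prompt) format, shown here as a Python dict. This is a minimal sketch: the checkpoint and image file names are placeholders, and every `["id", slot]` pair is a link to another node's output.

```python
import json

# Step 1 model group in ComfyUI's API format.
# File names are placeholders; links are ["source_node_id", output_slot].
model_group = {
    "1": {"class_type": "CheckpointLoaderSimple",          # Load Checkpoint
          "inputs": {"ckpt_name": "majicmixRealistic.safetensors"}},  # hypothetical file
    "2": {"class_type": "LoadImage",                       # the original image
          "inputs": {"image": "input.png"}},
    "3": {"class_type": "ImageScaleBy",                    # "Upscale Image By"
          "inputs": {"image": ["2", 0],
                     "upscale_method": "nearest-exact",
                     "scale_by": 1.5}},
    "4": {"class_type": "CLIPSetLastLayer",                # optional clip-skip node
          "inputs": {"clip": ["1", 1],
                     "stop_at_clip_layer": -2}},
    "5": {"class_type": "VAEEncode",                       # image -> latent for Img2Img
          "inputs": {"pixels": ["3", 0], "vae": ["1", 2]}},
}

# The finished graph is submitted to ComfyUI as a {"prompt": ...} payload
# via its /prompt endpoint; here we just print the fragment.
print(json.dumps(model_group, indent=2))
```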
Step 2: Reference the Original Image
(Reverse-engineer prompts + IPAdapter + ControlNet)
Reverse-engineer prompts: WD14 Tagger node
Double-click to search and add the WD14 Tagger node.
Connect the image node.
Right-click the positive prompt node and select "Convert text to input", then connect the WD14 Tagger's output to it.
However, this only captures the prompts derived from the image. If you want to add other prompts, you need to create a new Text Concatenate node, which can join multiple segments of prompts together.
Then create a new Primitive node. A Primitive node can be connected to any input and takes on that input's type.
Enter additional prompts in the Primitive node, such as LoRA trigger words, quality tags, etc.
At this point, the prompts include not only the ones reverse-engineered from the image but also the ones we typed in.
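For reference, the same prompt branch might look like this in API format. `WD14Tagger|pysssss` comes from the ComfyUI-WD14-Tagger pack and `Text Concatenate` from the WAS Node Suite; exact class and input names vary between pack versions, so treat them as assumptions. A Primitive node only exists in the editor; in the API format its text is written directly into the target input.

```python
# Reverse-engineered-prompt branch (API format); node IDs continue the sketch above.
prompt_branch = {
    "10": {"class_type": "WD14Tagger|pysssss",       # tags the original image
           "inputs": {"image": ["2", 0],
                      "model": "wd-v1-4-moat-tagger-v2",
                      "threshold": 0.35,
                      "character_threshold": 0.85}},
    "11": {"class_type": "Text Concatenate",         # joins tagger output with our own text
           "inputs": {"delimiter": ", ",
                      "clean_whitespace": "true",
                      "text_a": ["10", 0],
                      # In the editor this string comes from a Primitive node;
                      # in the API format it is simply a literal value.
                      "text_b": "masterpiece, best quality"}},
    "12": {"class_type": "CLIPTextEncode",           # positive prompt, text converted to input
           "inputs": {"clip": ["4", 0], "text": ["11", 0]}},
}
```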
Next, set up IPAdapter FaceID to reference facial features:
Double-click to search for IPAdapter FaceID and match the input nodes accordingly.
After dragging out the node, create new nodes:
ipadapter → IPAdapter Model Loader
clip_vision → Load CLIP Vision
insightface → IPAdapter InsightFace Loader
Connect the model output to the sampler.
Add nodes:
IPAdapter FaceID
IPAdapter Model Loader
Load CLIP Vision
IPAdapter InsightFace Loader
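A sketch of this group in API format, assuming the class names exposed by the ComfyUI_IPAdapter_plus pack (they may differ between versions); the model file names and the weight are illustrative.

```python
# IPAdapter FaceID group (API format); node IDs continue the sketches above.
faceid_group = {
    "20": {"class_type": "IPAdapterModelLoader",
           "inputs": {"ipadapter_file": "ip-adapter-faceid_sd15.bin"}},   # hypothetical file
    "21": {"class_type": "CLIPVisionLoader",          # "Load CLIP Vision"
           "inputs": {"clip_name": "CLIP-ViT-H-14.safetensors"}},         # hypothetical file
    "22": {"class_type": "IPAdapterInsightFaceLoader",
           "inputs": {"provider": "CPU"}},
    "23": {"class_type": "IPAdapterFaceID",
           "inputs": {"model": ["1", 0],              # from Load Checkpoint
                      "ipadapter": ["20", 0],
                      "image": ["2", 0],              # face reference = original image
                      "clip_vision": ["21", 0],
                      "insightface": ["22", 0],
                      "weight": 0.8,                  # illustrative
                      "start_at": 0.0,
                      "end_at": 1.0}},
}
# The patched model output ["23", 0] is what feeds the sampler.
```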
Set up the ControlNet
It is recommended to use the CR Multi-ControlNet Stack node, which allows stacking multiple ControlNets. Then add the corresponding preprocessors; OpenPose, Canny, and Depth are the recommended ControlNets, and you can add or remove them based on the final visual needs. Afterwards, add a ControlNet application node at the output: CR Apply Multi-ControlNet. It is recommended to set the resolution on the preprocessors to 1024 (a sketch of this group follows the node list below).
*Finally, remember to turn on the switch for each ControlNet you intend to use.
Input:
Connected to the positive and negative prompt nodes.
Output:
Connected to the sampler.
Add nodes:
CR Multi-ControlNet Stack
CR Apply Multi-ControlNet
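A sketch of this group in API format, abridged to one of the stack's three slots. The `CR ...` class names come from the Comfyroll pack and `OpenposePreprocessor` from comfyui_controlnet_aux; input names and switch values are assumptions that may vary by version.

```python
# ControlNet group (API format); node IDs continue the sketches above.
controlnet_group = {
    "30": {"class_type": "OpenposePreprocessor",      # preprocessor resolution set to 1024
           "inputs": {"image": ["3", 0],
                      "resolution": 1024,
                      "detect_hand": "enable",
                      "detect_body": "enable",
                      "detect_face": "enable"}},
    "31": {"class_type": "CR Multi-ControlNet Stack",
           "inputs": {"switch_1": "On",               # remember to turn the switch on
                      "controlnet_1": "control_v11p_sd15_openpose.pth",  # hypothetical file
                      "controlnet_strength_1": 1.0,
                      "start_percent_1": 0.0,
                      "end_percent_1": 1.0,
                      "image_1": ["30", 0],
                      "switch_2": "Off",              # Canny / Depth slots omitted here
                      "switch_3": "Off"}},
    "32": {"class_type": "CR Apply Multi-ControlNet",
           "inputs": {"base_positive": ["12", 0],     # positive prompt node
                      "base_negative": ["13", 0],     # negative prompt node (not shown)
                      "switch": "On",
                      "controlnet_stack": ["31", 0]}},
}
# Outputs ["32", 0] and ["32", 1] replace the raw prompts at the sampler.
```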
Step 3: High-Definition Upscaling
After the model group and the original-image references have been set up, we can add a high-definition upscaling step at the final image output:
Add Nodes: Upscale Image (using Model)
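In API format this step also needs a "Load Upscale Model" node to supply the upscale model. A sketch, with a placeholder upscaler file name; the decoded image is assumed to come from a VAE Decode node not shown here.

```python
# High-definition upscaling step (API format); node IDs continue the sketches above.
hires_group = {
    "40": {"class_type": "UpscaleModelLoader",        # "Load Upscale Model"
           "inputs": {"model_name": "4x-UltraSharp.pth"}},   # hypothetical file
    "41": {"class_type": "ImageUpscaleWithModel",     # "Upscale Image (using Model)"
           "inputs": {"upscale_model": ["40", 0],
                      "image": ["8", 0]}},            # output of VAE Decode (not shown)
    "42": {"class_type": "SaveImage",
           "inputs": {"images": ["41", 0],
                      "filename_prefix": "img2img_hires"}},
}
```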
After assembling the nodes, you can organize them into a group for easier viewing.
Finally, adjust the relevant parameters based on the output image, such as the checkpoint (ckpt), LoRA weights, prompts, sampler, and denoise strength.
Key parameters for this conversion include:
CLIP Set Last Layer (stop_at_clip_layer): -2
Upscale Image By: 1.5
steps: 40
sampler_name: dpmpp_2m
scheduler: karras
denoise: 0.7
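Most of these values live on the KSampler node. A sketch showing them in API format, wired to the groups above; the seed and cfg values are illustrative.

```python
# Sampler with the key parameters above (API format).
sampler = {
    "7": {"class_type": "KSampler",
          "inputs": {"model": ["23", 0],        # model patched by IPAdapter FaceID
                     "positive": ["32", 0],     # prompts after CR Apply Multi-ControlNet
                     "negative": ["32", 1],
                     "latent_image": ["5", 0],  # VAE-encoded original image
                     "seed": 0,                 # illustrative
                     "cfg": 7.0,                # illustrative
                     "steps": 40,
                     "sampler_name": "dpmpp_2m",
                     "scheduler": "karras",
                     "denoise": 0.7}},
}
```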
Note: If you choose an SDXL checkpoint, you will also need a corresponding SDXL LoRA and must switch the ControlNet models to SDXL versions; otherwise, the image output will fail.
The above is a complete workflow for image conversion. On top of this, you can also add a Load VAE or FreeU_V2 node to adjust the final image:
FreeU_V2: Mainly adjusts color and brings out some content for optimization.
Load VAE: Fine-tunes the color and details of the image.
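Both are drop-in additions to the graph. A sketch in API format: the FreeU_V2 values shown are common SD1.5 settings and the VAE file name is a placeholder, so treat both as assumptions.

```python
# Optional tweaks (API format); node IDs continue the sketches above.
optional_tweaks = {
    "50": {"class_type": "FreeU_V2",                  # patches the model's feature scaling
           "inputs": {"model": ["1", 0],
                      "b1": 1.3, "b2": 1.4,           # common SD1.5 values (assumption)
                      "s1": 0.9, "s2": 0.2}},
    "51": {"class_type": "VAELoader",                 # "Load VAE"
           "inputs": {"vae_name": "vae-ft-mse-840000-ema-pruned.safetensors"}},  # hypothetical file
}
# Feed ["50", 0] to the sampler instead of the raw model, and ["51", 0]
# to the VAE Encode/Decode nodes instead of the checkpoint's built-in VAE.
```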
Through such a workflow, you can achieve conversions to different styles.