Preface:
Learning ComfyUI is a long journey, and ComfyUI Impact is a huge node library with many very useful and powerful built-in nodes, such as detectors, detailers, Preview Bridge, wildcards, hooks, image senders and receivers, and so on. By combining these nodes we can accomplish a great deal: automatic face detection and repair, regional enhancement, local repainting, crowd control, restyling hair, changing a model's clothes, and more. ComfyUI Impact is a river that no one can bypass on the road to ComfyUI mastery, so this post will guide you through understanding and using these nodes. I wish you all good luck in your studies and hope you become a ComfyUI master soon!
Contents
I. Installation
II. SAMLoader nodes
III. UltralyticsDetectorProvider node
IV. BBOX Detector(SEGS) / BBOX Detector(combined) node
V. SAMDetector(combined) / SAMDetector(segmented) nodes
VI. Simple Detector(SEGS) / Simple Detector(SEGS/pipe) / Simple Detector for AnimateDiff(SEGS) nodes
VII. DetailerDebug (SEGS) nodes
VIII. Sample workflows
I. Installation
Method 1: Installation via ComfyUI Manager (recommended)
Open the Manager interface, find ComfyUI-Impact-Pack under "Install Custom Nodes", and install it.
Method 2: Installation with git clone command
In the ComfyUI/custom_nodes directory, type cmd in the File Explorer address bar and press Enter to open a terminal there.
Enter the following command in the terminal to start the download:
git clone https://github.com/ltdrdata/ComfyUI-Impact-Pack
II. SAMLoader nodes
Loads SAM (Segment Anything Model) checkpoint files. The main function of this node is to load a pre-trained SAM model into memory so that subsequent nodes can use it for image segmentation and related operations.
Input:
model_name → path to the SAM model file **any model file placed in the configured models folder can be selected here**
device_mode → choose whether to run on the GPU or the CPU; AUTO is recommended.
Output:
SAM_MODEL → the loaded SAM model; the nodes that call it use it to segment targets and generate their contours
Note: This model cannot be used on its own; another model must first specify the region to segment, after which SAM performs the fine segmentation. A comparison is shown in the figure below:
Usage Scenarios:
- Load Model: the SAMLoader node is used to load pre-trained SAM model files from a specified path. Typically, these model files are stored on a local disk or other storage media.
- Configuration Parameters: The node will provide some parameter configuration options that allow the user to specify the path to the model file and other loading parameters.
- Output model: once the model is loaded, the SAMLoader node outputs an object representing the model, which can be passed to other nodes to perform specific image segmentation tasks.
The SAMLoader node makes it easy to load and use pre-trained SAM models for a variety of image segmentation tasks, enabling a more efficient image processing workflow.
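To make the loading step concrete, here is a minimal sketch of what a SAMLoader node looks like in a ComfyUI API-format workflow export, written as a Python dict. The model filename and device_mode value are illustrative assumptions; check your own workflow's API export for the exact values.

```python
# Minimal sketch: a SAMLoader node as it appears in a ComfyUI
# API-format workflow export, expressed as a Python dict.
sam_loader = {
    "class_type": "SAMLoader",
    "inputs": {
        "model_name": "sam_vit_b_01ec64.pth",  # example checkpoint in ComfyUI/models/sams
        "device_mode": "AUTO",                 # AUTO / Prefer GPU / CPU
    },
}
print(sam_loader["inputs"]["model_name"])
```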
III. UltralyticsDetectorProvider node
Provides an Ultralytics-based object detector to replace the existing mmdet-based detector. The node loads a single model, exposed as either a BBOX model or a SEGM model.
Input:
model_name → path to the object detection model **mainstream models cover face, hand, and body detection; any model file placed in the configured models folder can be selected here**
Output:
BBOX_DETECTOR → detects targets as rectangular bounding boxes
SEGM_DETECTOR → detects the target's outline
Note: This node loads either a BBOX or a SEGM model, and the output you connect must match the model type selected in model_name. For example, bbox/face_yolov8m.pt is a BBOX-type face detection model, so only the BBOX_DETECTOR output can be connected; that is why SEGM_DETECTOR shows a red cross.
The following two images illustrate the difference between the BBOX model and the SEGM model:
Node Notes:
- Ensure that the path to the model file is selected correctly, or if using a predefined model, ensure that the network connection is stable to download the model file.
- Adjusting the confidence threshold controls the precision and recall of the detection results. Higher thresholds will result in fewer false positives but may miss detections, and lower thresholds will increase recall but may result in more false positives.
With the UltralyticsDetectorProvider node, it is easy to utilize the efficient target detection model provided by Ultralytics for various target detection tasks, such as object recognition, tracking, etc., to achieve intelligent image analysis.
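For intuition, here is a rough sketch of the kind of detection this provider wraps, using the ultralytics package directly. The file names are examples, and this is not the node's actual source code; it simply shows the confidence-threshold behavior described in the notes above.

```python
# Sketch of the detection UltralyticsDetectorProvider wraps, using the
# ultralytics package directly. "face_yolov8m.pt" matches the model named
# in the text; adjust the paths to wherever your copies live.
from ultralytics import YOLO

model = YOLO("face_yolov8m.pt")      # BBOX-type face detection model
results = model("portrait.png")      # run detection on an example image

threshold = 0.5                      # confidence threshold (cf. the note above)
for box, conf in zip(results[0].boxes.xyxy, results[0].boxes.conf):
    if conf >= threshold:            # keep only confident detections
        x0, y0, x1, y1 = box.tolist()
        print(f"face at ({x0:.0f}, {y0:.0f})-({x1:.0f}, {y1:.0f}), conf={conf:.2f}")
```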
IV. BBOX Detector(SEGS) / BBOX Detector(combined) node
Both nodes perform object detection with a BBOX model and output bounding-box (BBOX) information for the detected objects. The main difference between them is the form of the output.
The BBOX Detector (SEGS) node outputs a SEGS collection: one entry per detected object, each carrying the cropped image, mask, crop position, and confidence. This form is meant to be passed to downstream nodes (such as the Detailer family) that process each object individually.
The BBOX Detector (combined) node runs the same detection but merges all detected regions into a single mask, which is convenient when you only need one overall mask of everything that was found. Both nodes work with pre-trained detection models such as YOLO.
Input:
bbox_detector → detection model of type BBOX
image → input loaded image
detailer_hook → a special Impact Pack parameter used to inject additional behavior at different stages of processing or when specific events occur
Parameters:
threshold → threshold, only objects with a recognized confidence level higher than this value are detected
dilation → expands (or, with negative values, shrinks) the detected mask area in the final output
crop_factor → determines how much surrounding context, as a multiple of the detected mask area, is included for the detail-repair step **if this value is too small, the repair may not work well because the surrounding context is unknown**
drop_size → minimum detection size; detected regions smaller than this are dropped during detail enhancement
Textbox → enter the detection label types that are allowed (by default, all)
Output:
SEGS → contains the cropped image, mask pattern, crop position, and confidence level for each detection
MASK → a single mask combining all detected rectangular boxes
Usage:
①BBOX Detector(SEGS) Node
- Input image: the node receives an image together with a BBOX detection model.
- Detection: the model locates each object in the image and determines its bounding box.
- Output SEGS: the node outputs one segment per detected object, each with its cropped image, mask, crop position, and confidence.
②BBOX Detector(combined) node
- Input image: the node accepts an input image.
- Target detection: objects in the image are detected and their bounding boxes extracted by a pre-trained detection model (e.g., YOLO).
- Output mask: the node merges all detected bounding boxes into a single combined mask.
Reasonable use of BBOX Detector(SEGS) and BBOX Detector(combined) nodes enables complex image analysis tasks, including object detection, localization, and classification, thus enhancing the level of intelligence in various application scenarios.
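The crop_factor parameter above is easiest to understand with a small sketch. The following illustrative function (an assumption about the behavior, not Impact Pack's actual code) grows a detected bbox into a crop region that keeps surrounding context:

```python
# Illustrative sketch of how crop_factor grows a detected bbox into a
# larger crop region so the detailer can see the surrounding context.
def crop_region(bbox, crop_factor, img_w, img_h):
    x0, y0, x1, y1 = bbox
    cx, cy = (x0 + x1) / 2, (y0 + y1) / 2               # bbox center
    w, h = (x1 - x0) * crop_factor, (y1 - y0) * crop_factor
    # clamp the enlarged region to the image bounds
    return (max(0, int(cx - w / 2)), max(0, int(cy - h / 2)),
            min(img_w, int(cx + w / 2)), min(img_h, int(cy + h / 2)))

# a 100x120 face bbox with crop_factor=3.0 inside a 512x512 image
print(crop_region((100, 100, 200, 220), crop_factor=3.0, img_w=512, img_h=512))
```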
V. SAMDetector(combined) / SAMDetector(segmented) nodes
These two nodes are used for object detection and image segmentation tasks, applying the Segment Anything Model (SAM) for advanced image processing.
Input:
sam_model → receives the SAM model loaded by the SAMLoader node
segs → receives the detection results (SEGS) produced by an upstream detector node
image → receives the loaded image
Parameters:
detection_hint → Used to specify which type of detection result to use as a hint to help generate masks during segmentation.
dilation → dilate the boundary of the segmentation result
threshold → threshold, only objects with a recognized confidence level higher than this value are detected
bbox_expansion → Controls the size of the expanded boundary when generating the outline to ensure better inclusion of the target object
mask_hint_threshold → used together with the mask_hint_use_negative parameter to specify a threshold for detection_hint; mask values in the mask region at or above the threshold are interpreted as positive hints
Note: Values greater than 0 but below mask_hint_threshold are not used as negative hints.
mask_hint_use_negative → controls whether negative hints are used to assist segmentation
Output:
combined_mask → a single mask merging the segmentation results for the whole image
batch_masks → a batch of masks, one per detected segment in the batch
MASK → the single-mask output of the combined variant, covering the segmentation result for the whole image
Notes:
- Model Selection: Ensure that the appropriate SAM model version and configuration parameters are selected for optimal detection and segmentation.
- Input data quality: whether it is a direct input image or a pre-segmented result, the quality of the input data will affect the final detection and segmentation results.
Reasonable use of SAMDetector(combined) and SAMDetector(segmented) nodes enables efficient image detection and segmentation tasks to meet the needs of multiple application scenarios.
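To see the BBOX-to-SAM handoff outside ComfyUI, here is a minimal sketch using the original segment_anything package. The checkpoint filename and bbox coordinates are examples; the point is that a detector's box acts as the hint that tells SAM where to segment finely:

```python
# Sketch of the BBOX -> SAM handoff these nodes perform, using the
# original segment_anything package directly.
import numpy as np
from PIL import Image
from segment_anything import sam_model_registry, SamPredictor

sam = sam_model_registry["vit_b"](checkpoint="sam_vit_b_01ec64.pth")
predictor = SamPredictor(sam)

image = np.array(Image.open("portrait.png").convert("RGB"))
predictor.set_image(image)

# a face bbox from an upstream detector acts as the detection hint
face_box = np.array([100, 80, 220, 230])
masks, scores, _ = predictor.predict(box=face_box, multimask_output=False)
print(masks.shape, scores)  # (1, H, W) boolean mask plus its confidence
```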
VI. Simple Detector(SEGS) / Simple Detector(SEGS/pipe) / Simple Detector for AnimateDiff(SEGS) nodes
These nodes work much like the detectors above: they receive a model, run segmentation, and output the segmentation results. The three variants are designed for different input types and workflows to meet specific application needs.
Input:
bbox_detector → receive BBOX model
image_frames → input image sequence for animation frames
image → receives the image to be segmented
sam_model_opt → receive SAM model
segm_detector_opt → receive SEGM model
detailer_pipe → pipe input carrying the detailing parameters used to refine the segmentation result
Parameters:
bbox_threshold → detection threshold for BBOX models
bbox_dilation → expands the bounding boxes detected by the BBOX model
crop_factor → how much surrounding context to include when cropping, as a multiple of the detected area
drop_size → minimum detection size; smaller targets are filtered out
sub_dilation → dilation applied to the secondary (SAM/SEGM) detection
sub_bbox_expansion → bounding-box expansion applied to the secondary detection
sam_mask_hint_threshold → mask hint threshold for the SAM model
masking_mode → how the primary and secondary masks are combined
segs_pivot → (AnimateDiff variant) selects the reference frame used as the pivot for detection
post_dilation → dilation applied as a post-processing step
Output:
SEGS → detailed segmentation results (cropped images, masks, crop positions, and confidences)
Notes:
- Input data quality: whether it is a direct input image or a pre-segmented result, the quality of the input data will affect the final detection and segmentation results.
- Workflow Configuration: Ensure smooth connectivity and data flow between nodes in complex workflows for optimal processing.
Reasonable use of these three nodes can realize a variety of image processing tasks, including target detection, image segmentation and animation processing, to meet the needs of different application scenarios.
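As a concrete illustration of the drop_size parameter described in the list above, the following sketch (illustrative, not the pack's code) discards detections that are smaller than drop_size on either side:

```python
# Illustrative sketch of the drop_size filter: detections whose bbox is
# smaller than drop_size on either side are discarded before detailing.
def filter_by_drop_size(bboxes, drop_size):
    kept = []
    for x0, y0, x1, y1 in bboxes:
        if (x1 - x0) >= drop_size and (y1 - y0) >= drop_size:
            kept.append((x0, y0, x1, y1))
    return kept

boxes = [(10, 10, 18, 18), (50, 50, 180, 200)]
print(filter_by_drop_size(boxes, drop_size=10))  # the tiny 8x8 box is dropped
```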
VII. DetailerDebug (SEGS) nodes
Crops and enlarges each detected region, then redraws it internally using the built-in KSampler.
Input:
image → the original image to be redrawn
segs → segmentation results obtained by inputting the segmentation model
model → the main checkpoint model
clip → the CLIP text-encoding model
vae → the VAE model
positive → positive prompt
negative → negative prompt
detailer_hook → extension interface for finer tuning of the model
Note: Both the positive and the negative prompt here apply to the redrawn region and guide the result after redrawing.
Parameters:
guide_size → reference size **targets smaller than this are scaled up to match it, while larger targets are skipped since they do not need detailing** (see the sketch after this parameter list)
guide_size_for → sets what guide_size is measured against **bbox uses the bounding box detected by the detector as the reference; crop_region uses the cropped region derived from the detected bbox**
Note: When bbox is selected, the image scaled according to crop_factor may end up several times larger than guide_size.
max_size → maximum size **a safety measure that caps the longest edge of the target image at max_size; this solves the problem of the bbox becoming too large, especially when it has a slender shape**
seed → seed for the built-in KSampler
control_after_generate → controls how the seed changes between runs **fixed keeps the seed fixed, increment increases it by 1 each time, decrement decreases it by 1 each time, randomize picks a random seed**
steps → the number of denoising steps (which can also be interpreted as the number of steps to generate the image)
cfg → prompt guidance scale, i.e., how strongly the prompt influences the result **setting it too high can have a negative effect**
sampler_name → select sampler
scheduler → Select scheduler
denoise → denoising amplitude **the larger the value, the greater the effect and change on the image**
feather → feather size
noise_mask → controls whether a noise mask is used during the repair process **while lower denoise values sometimes produce more natural results without a noise mask, it is usually recommended to leave this parameter enabled**
force_inpaint → prevents processing from being skipped based on guide_size **useful when the goal is to repair rather than refine: SEGS smaller than guide_size are not scaled up to match guide_size; instead, they are repaired at their original size**
Textbox → enter a wildcard specification; this option is ignored if left empty
cycle → number of sampling iterations **when used with Detailer_hook, this option allows intermittent noise to be added, and can also be used to gradually reduce the denoise amount, first building the basic structure and then refining it**
inpaint_model → enable this option when using an inpainting model to ensure correct inpainting at denoise values below 1.0
noise_mask_feather → controls whether feathering is applied to the mask used in the inpainting process
Note: This option does not guarantee a more natural image, and it may create artifacts at the edges; set it as needed!
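As referenced in the guide_size entry above, here is an illustrative sketch (an assumption about the logic, not the pack's exact code) of how guide_size and max_size interact when choosing an upscale factor for a cropped target:

```python
# Illustrative sketch of the guide_size / max_size logic: small targets
# are scaled up toward guide_size, and the longest edge is capped at
# max_size as a safety measure for slender bboxes.
def upscale_factor(crop_w, crop_h, guide_size, max_size):
    longest = max(crop_w, crop_h)
    if longest >= guide_size:
        return 1.0                     # big enough: no detailing upscale needed
    factor = guide_size / longest      # scale the crop up to guide_size
    if longest * factor > max_size:    # cap the longest edge at max_size
        factor = max_size / longest
    return factor

print(upscale_factor(128, 160, guide_size=512, max_size=1024))  # -> 3.2
```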
Output:
image → final redrawn image
cropped → the cropped region of the original image, focused on the detected area (before refinement)
cropped_refined → cropped and further processed image
cropped_refined_alpha → cropped and refined alpha channel
cnet_images → ControlNet reference images (showing the mask placement)
Notes:
- Input data quality: Ensure that the input segmentation results are accurate, because the quality of input data will directly affect the analysis and debugging results.
- Node Configuration: Adjust the configuration parameters of the DetailerDebug(SEGS) node according to specific needs to get the best analysis and debugging results.
By using the DetailerDebug(SEGS) node, you can gain insight into and optimize the performance of the image segmentation model to ensure that the segmentation results meet the requirements of real-world applications, as well as to improve the efficiency of model development and debugging.
VIII. Sample workflows
With the nodes above, you can build a simple "AI face swap" workflow.
The workflow first detects and segments the face in the loaded image, then feeds the segmentation result, the original image, and the prompt for redrawing into the detailer, whose built-in sampler re-diffuses the face according to the prompt, and finally outputs the face-swapped image. The original image and the face-swapped result are shown below:
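For reference, here is a hedged sketch of how the first half of this workflow could be wired in ComfyUI's API (prompt) format. The class_type strings are assumptions inferred from the node titles, so confirm them against your own workflow's API export before use:

```python
# Hedged sketch of the face-swap wiring as a ComfyUI API-format prompt.
# class_type strings are assumptions based on the node titles; confirm
# them against your own workflow's API export.
workflow = {
    "1": {"class_type": "LoadImage",
          "inputs": {"image": "portrait.png"}},
    "2": {"class_type": "UltralyticsDetectorProvider",
          "inputs": {"model_name": "bbox/face_yolov8m.pt"}},
    "3": {"class_type": "SAMLoader",
          "inputs": {"model_name": "sam_vit_b_01ec64.pth", "device_mode": "AUTO"}},
    "4": {"class_type": "BboxDetectorSEGS",   # BBOX Detector (SEGS)
          "inputs": {"bbox_detector": ["2", 0], "image": ["1", 0],
                     "threshold": 0.5, "dilation": 10,
                     "crop_factor": 3.0, "drop_size": 10, "labels": "all"}},
    # ... the DetailerDebug (SEGS) node then takes segs=["4", 0], the
    # checkpoint model/clip/vae, and the new-face prompt, and redraws
    # the detected face region.
}
print(list(workflow))
```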
**To go beyond oneself is to strive for excellence. Perseverance is the key to success.**