Multi-GPU Support for Batch Image Processing in ComfyUI #5672
Replies: 1 comment
I've been contemplating this issue for quite some time. DiT models deliver outstanding performance, whether for images or video, but their hardware demands have become very high. Once you add the ecosystem of companion models that a DiT model depends on, GPU memory consumption at runtime becomes unbearable for personal computers, with frequent out-of-memory errors and long waits. Future models will only get larger and more capable, yet many users' machines already cannot run today's models well: most have around 16GB of VRAM, and the share of users with 24GB or 48GB is very small. If ComfyUI implemented multi-GPU parallel computing, this problem would be solved. I hope the official team accelerates its implementation; otherwise, as time goes on and ComfyUI's codebase grows, multi-GPU operation will become increasingly difficult to retrofit.
Multi-GPU Support for Batch Image Processing in ComfyUI
Context:
I am currently using 4 Nvidia T4 GPUs and am exploring ways to utilize them efficiently with ComfyUI for batch image generation and processing. My goal is to enable ComfyUI to leverage all available GPUs to handle API requests in a balanced and scalable manner.
Desired Setup:
Model Loading: Load the required model(s) into all 4 GPUs during initialization.
Request Handling:
For single API requests, automatically assign them to an available GPU.
For multiple concurrent requests, distribute the load equally across the GPUs.
Batch Processing: Allow generation of up to 4 images simultaneously, with each GPU handling one image in parallel (a rough sketch of this follows below).
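To make the batch-processing item concrete, here is a minimal sketch of the behaviour I am after, assuming a hypothetical generate_one() wrapper around a single-GPU pipeline (nothing here is existing ComfyUI API):

```python
# Sketch only: the batch behaviour described above -- up to 4 images at once,
# one per GPU. `generate_one` is a hypothetical placeholder, not ComfyUI API.
from concurrent.futures import ThreadPoolExecutor

import torch

NUM_GPUS = torch.cuda.device_count()  # 4 on my T4 machine


def generate_one(prompt: str, device_index: int):
    # Placeholder: run the full sampling pipeline on this GPU and return
    # the resulting image. The real work would happen inside this context.
    with torch.cuda.device(device_index):
        ...


def generate_batch(prompts: list[str]):
    # One worker per GPU; request i is pinned to GPU i % NUM_GPUS.
    with ThreadPoolExecutor(max_workers=NUM_GPUS) as pool:
        futures = [
            pool.submit(generate_one, p, i % NUM_GPUS)
            for i, p in enumerate(prompts)
        ]
        return [f.result() for f in futures]
```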
Key Requirements:
Automatic GPU Assignment: ComfyUI should recognize GPU availability and assign tasks accordingly.
Load Balancing: Evenly distribute requests to avoid overloading any single GPU while others are idle.
Concurrency: Handle simultaneous API requests efficiently.
Scalability: Support this setup seamlessly for the existing GPUs and potentially more in the future.
Challenges:
Does ComfyUI natively support multi-GPU processing for this use case?
If not, what would be the recommended approach to achieve this functionality?
Modifications in the ComfyUI codebase to support multi-GPU usage.
Integrating external libraries such as torch.nn.DataParallel or torch.distributed for distributed processing (see the note after this list).
Any potential limitations or known issues when using ComfyUI in multi-GPU environments?
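As a side note on the libraries mentioned above (a general PyTorch observation, not anything ComfyUI-specific): torch.nn.DataParallel replicates a module and scatters a single input batch across devices, so it fits "one large batch per request" rather than balancing independent API requests. A toy illustration:

```python
import torch
import torch.nn as nn

# Toy module standing in for a diffusion model, just to show the mechanics.
model = nn.Linear(16, 16).cuda()

# DataParallel replicates the module and splits dim 0 of a single forward
# pass across the listed devices -- one big batch, not separate requests.
replicated = nn.DataParallel(model, device_ids=[0, 1, 2, 3])

batch = torch.randn(8, 16, device="cuda:0")  # inputs start on device_ids[0]
out = replicated(batch)  # rows of `batch` are scattered across the 4 GPUs
```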
Proposed Solutions:
I am considering the following approaches but would like input from the community:
Custom GPU Management Layer:
Implement a GPU management mechanism that intercepts API requests and assigns them to available GPUs (sketched below).
Use PyTorch's distributed utilities for model replication and task allocation.
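As a sketch of what I mean by option 1 (none of this exists in ComfyUI today), a small pool that hands out free device ids to incoming requests and takes them back once the work is done:

```python
# Sketch for option 1: requests check a GPU out of a pool and return it,
# so a request only runs once a device is actually free. Illustrative only.
import asyncio


class GPUPool:
    def __init__(self, num_gpus: int):
        self._free = asyncio.Queue()  # queue of free device indices
        for i in range(num_gpus):
            self._free.put_nowait(i)

    async def run(self, job):
        """Wait for a free GPU, run `job(device_index)`, then release the GPU."""
        device_index = await self._free.get()
        try:
            # `job` is a hypothetical coroutine that executes one request
            # against the model replica already resident on this GPU.
            return await job(device_index)
        finally:
            self._free.put_nowait(device_index)


async def handle_request(pool: GPUPool, prompt: str) -> str:
    async def job(device_index: int) -> str:
        # Placeholder for the actual sampling call on cuda:{device_index}.
        await asyncio.sleep(0)
        return f"generated on cuda:{device_index} for {prompt!r}"

    return await pool.run(job)
```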
Modify ComfyUI:
Investigate the ComfyUI source code to implement multi-GPU handling directly.
Introduce settings for model replication across GPUs and API request distribution.
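Until something along those lines exists natively, an interim setup that avoids touching the codebase at all is one ComfyUI process per GPU, each on its own port (this also feeds directly into option 3 below). A rough launcher, assuming the --cuda-device and --port command-line flags behave the way I understand them to:

```python
# Sketch: launch one ComfyUI process per GPU, each listening on its own port.
# Assumes ComfyUI's --cuda-device and --port flags; the path is a placeholder.
import subprocess
import sys

COMFYUI_MAIN = "ComfyUI/main.py"  # adjust to the actual checkout location
NUM_GPUS = 4
BASE_PORT = 8188

procs = []
for gpu in range(NUM_GPUS):
    procs.append(subprocess.Popen([
        sys.executable, COMFYUI_MAIN,
        "--cuda-device", str(gpu),       # pin this instance to one GPU
        "--port", str(BASE_PORT + gpu),  # 8188, 8189, 8190, 8191
    ]))

for p in procs:
    p.wait()
```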
Third-Party Load Balancers:
Utilize an external load balancer (e.g., Kubernetes with GPU-aware scheduling) to manage GPU tasks.
This would handle request routing, but may require additional configuration on the ComfyUI side (a minimal dispatcher sketch follows below).
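Even without Kubernetes, a very small dispatcher in front of per-GPU instances might cover my case. A sketch, assuming each backend exposes ComfyUI's usual POST /prompt endpoint and that the workflow is already in API-format JSON:

```python
# Sketch: naive round-robin dispatch of prompt payloads across several
# ComfyUI backends (one per GPU). No health checks or retries.
import itertools
import json
import urllib.request

BACKENDS = [f"http://127.0.0.1:{8188 + i}" for i in range(4)]
_next_backend = itertools.cycle(BACKENDS)


def queue_prompt(workflow: dict) -> dict:
    """Send a ComfyUI workflow (API-format JSON) to the next backend in turn."""
    backend = next(_next_backend)
    payload = json.dumps({"prompt": workflow}).encode("utf-8")
    req = urllib.request.Request(
        f"{backend}/prompt",
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())  # includes the queued prompt_id
```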
Request for Feedback:
Has anyone implemented a similar multi-GPU setup with ComfyUI?
What challenges or limitations should I be aware of?
Are there specific tools or best practices that could help streamline this setup?
Any plans to include native multi-GPU support in ComfyUI in the future?
Looking forward to hearing your thoughts and suggestions!