Dev nodes nexfort booster #911

Merged — 25 commits merged into main from dev_nodes_nexfort_booster on Jun 12, 2024
Conversation

@ccssu (Contributor) commented May 25, 2024

Nexfort

cd ComfyUI

# Enable CUDA Graphs
export NEXFORT_FX_CUDAGRAPHS=1

# Best performance via Inductor max-autotune
export TORCHINDUCTOR_MAX_AUTOTUNE=1
# Enable cuDNN benchmark
export NEXFORT_FX_CONV_BENCHMARK=1
# Faster float32 matmul (TF32)
export NEXFORT_FX_MATMUL_ALLOW_TF32=1

# FX graph cache to speed up recompilation
export TORCHINDUCTOR_FX_GRAPH_CACHE=1

# Persistent cache dir
export TORCHINDUCTOR_CACHE_DIR=~/.torchinductor

# Debugging (uncomment as needed)
# export TORCH_LOGS="+dynamo"
# export TORCHDYNAMO_VERBOSE=1
# export NEXFORT_DEBUG=1 NEXFORT_FX_DUMP_GRAPH=1 TORCH_COMPILE_DEBUG=1

python main.py --gpu-only --disable-cuda-malloc --port 8188 --cuda-device 6
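For convenience, the exports and the launch command above can be collected into one script. This is only a sketch assembled from the flags listed in this PR description; the script name and structure are hypothetical, and the `TORCHINDUCTOR_CACHE_DIR` default-fallback is an added convenience, not part of the original instructions.

```shell
#!/usr/bin/env bash
# run_nexfort.sh (hypothetical) — bundles the env flags above and launches ComfyUI.
set -euo pipefail

export NEXFORT_FX_CUDAGRAPHS=1          # CUDA Graphs
export TORCHINDUCTOR_MAX_AUTOTUNE=1     # best performance
export NEXFORT_FX_CONV_BENCHMARK=1      # cuDNN benchmark
export NEXFORT_FX_MATMUL_ALLOW_TF32=1   # faster float32 matmul
export TORCHINDUCTOR_FX_GRAPH_CACHE=1   # graph cache to speed up compilation
# Keep an existing cache dir if already set, otherwise use the default above
export TORCHINDUCTOR_CACHE_DIR="${TORCHINDUCTOR_CACHE_DIR:-$HOME/.torchinductor}"

python main.py --gpu-only --disable-cuda-malloc --port 8188 --cuda-device 6
```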

How to use Nexfort

Case 1

# Compile arbitrary models (torch.nn.Module)
import torch
import onediff.infer_compiler as infer_compiler

class MyModule(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.lin = torch.nn.Linear(100, 10)

    def forward(self, x):
        return torch.nn.functional.relu(self.lin(x))

mod = MyModule().to("cuda").half()
with torch.inference_mode():
    compiled_mod = infer_compiler.compile(mod,
        backend="nexfort",
        options={"mode": "max-autotune:cudagraphs", "dynamic": True, "fullgraph": True},
    )
    print(compiled_mod(torch.randn(10, 100, device="cuda").half()))

Case 2

import torch
import onediff.infer_compiler as infer_compiler

@infer_compiler.compile(
    backend="nexfort",
    options={"mode": "max-autotune:cudagraphs", "dynamic": True, "fullgraph": True},
)
def foo(x):
    return torch.sin(x) + torch.cos(x)

print(foo(torch.randn(10, 10, device="cuda").half()))
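The two cases call the same entry point: once as a plain function call on a module (Case 1) and once as a decorator factory (Case 2). The dual behavior can be sketched in plain Python with a stand-in `compile` that performs no real compilation (all names here are hypothetical, purely to illustrate the calling pattern):

```python
import functools

def compile(model=None, *, backend="nexfort", options=None):
    """Stand-in for infer_compiler.compile: usable as a direct call
    (Case 1) or as a decorator factory (Case 2). No real compilation
    happens; the callable is returned wrapped but unchanged."""
    def wrap(fn):
        @functools.wraps(fn)
        def compiled(*args, **kwargs):
            # A real backend would dispatch to an optimized graph here.
            return fn(*args, **kwargs)
        return compiled

    if model is None:           # used as @compile(backend=..., options=...)
        return wrap
    return wrap(model)          # used as compile(mod, backend=..., options=...)

# Case 1 style: direct call on an existing callable
double = compile(lambda x: 2 * x, backend="nexfort")
assert double(21) == 42

# Case 2 style: decorator
@compile(backend="nexfort", options={"mode": "max-autotune"})
def add_one(x):
    return x + 1

assert add_one(41) == 42
```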

VAE

ComfyUI Workflow

(workflow image: speedup_vae)

Result

{model: sdxl, batch_size: 1, image: 1024x1024, speedup: vae}

| Accelerator | Baseline (non-optimized) | OneDiff (Nexfort) | Percentage improvement |
| --- | --- | --- | --- |
| NVIDIA GeForce RTX 4090 | 3.02 s | 2.95 s | 2.31% |

First compilation time: 321.92 seconds


LoRA

ComfyUI Workflow

(workflow image: speedup_vae_unet)

Result

{model: sdxl, batch_size: 1, image: 1024x1024, speedup: vae + unet}

| Accelerator | Baseline (non-optimized) | OneDiff (Nexfort) | Percentage improvement |
| --- | --- | --- | --- |
| NVIDIA GeForce RTX 4090 | 3.02 s | 1.85 s | 38.07% |

First compilation time: 878.19 seconds

ControlNet

ComfyUI Workflow

(workflow image: cnet_speedup)

Result

{model: sdxl, batch_size: 1, image: 1024x1024, speedup: controlnet}

| Accelerator | Baseline (non-optimized) | OneDiff (Nexfort) | Percentage improvement |
| --- | --- | --- | --- |
| NVIDIA GeForce RTX 4090 | 4.93 s | 4.07 s | 17.44% |

First compilation time: 437.84 seconds
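The percentage-improvement columns in the tables above follow from the per-image timings as (baseline − optimized) / baseline. A quick sanity check against the RTX 4090 numbers (small discrepancies with the tables presumably come from the reported timings being rounded to two decimals):

```python
def improvement(baseline_s, optimized_s):
    """Percentage latency improvement: (baseline - optimized) / baseline * 100."""
    return (baseline_s - optimized_s) / baseline_s * 100

# VAE only: 3.02 s -> 2.95 s
print(round(improvement(3.02, 2.95), 2))   # ~2.32 (table reports 2.31%)

# VAE + UNet: 3.02 s -> 1.85 s
print(round(improvement(3.02, 1.85), 2))   # ~38.74 (table reports 38.07%)

# ControlNet: 4.93 s -> 4.07 s
print(round(improvement(4.93, 4.07), 2))   # 17.44, matching the table
```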

IPAdapter

@ccssu ccssu marked this pull request as draft May 25, 2024 13:00
@strint strint marked this pull request as ready for review June 12, 2024 14:28
@strint strint merged commit 323897c into main Jun 12, 2024
7 checks passed
@strint strint deleted the dev_nodes_nexfort_booster branch June 12, 2024 14:28