
Support ControlNet #153

Merged (10 commits) on Apr 18, 2023

Conversation

ryu38
Contributor

@ryu38 ryu38 commented Apr 8, 2023

I added ControlNet support to model conversion and inference.

New Files

controlnet.py

ControlNet.swift

  • This is used in image generation with Swift.

Main Changes

torch2coreml.py

  • two new options added
    • --convert-controlnet
      • Unlike the other --convert-* options, it requires ControlNet model names after the option.
      • To convert multiple models, provide their names separated by spaces.
      • Example: --convert-controlnet lllyasviel/sd-controlnet-mlsd lllyasviel/sd-controlnet-canny
      • Each ControlNet model is saved under a name such as ControlNet_lllyasviel_sd-controlnet-mlsd.mlpackage
    • --unet-support-controlnet
      • This option enables the UNet to receive ControlNet outputs as additional inputs.
      • The model is saved with a different name: *_control-unet.mlpackage
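The package-naming convention described above can be sketched in a few lines of Python. This is a hypothetical helper for illustration, not the actual torch2coreml code:

```python
def controlnet_package_name(model_version: str) -> str:
    """Map a Hugging Face ControlNet repo id to the saved .mlpackage name.

    Illustrative sketch of the convention described in this PR: the "/" in
    the repo id is replaced so the result is a valid single filename.
    """
    return f"ControlNet_{model_version.replace('/', '_')}.mlpackage"

print(controlnet_package_name("lllyasviel/sd-controlnet-mlsd"))
# ControlNet_lllyasviel_sd-controlnet-mlsd.mlpackage
```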

unet.py and UNet.swift

  • Both now support ControlNet inputs

pipeline.py

  • two new options added
    • --controlnet
      • Models provided with this option are used in image generation.
      • Enter the option in the same way as the --convert-controlnet option in torch2coreml.py
    • --controlnet-inputs
      • Image inputs corresponding to each ControlNet
      • Enter paths to the images in the same order as --controlnet
  • If ControlNet is enabled, the pipeline uses the "*_control-unet.mlpackage" model instead of "*_unet.mlpackage"
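The positional pairing between --controlnet and --controlnet-inputs can be sketched like this. The model ids come from the example above; the image paths are hypothetical:

```python
# Each ControlNet is matched with its conditioning image by position,
# mirroring how the two CLI options described above line up.
controlnets = ["lllyasviel/sd-controlnet-mlsd", "lllyasviel/sd-controlnet-canny"]
controlnet_inputs = ["mlsd_hint.png", "canny_hint.png"]  # hypothetical paths

if len(controlnets) != len(controlnet_inputs):
    raise ValueError("Each ControlNet needs exactly one conditioning image")

pairs = list(zip(controlnets, controlnet_inputs))
```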

StableDiffusionCLI

  • two new options added. These are almost the same as the ones in pipeline.py.
    • --controlnet (enter model file names in Resources/controlnet without extension)
    • --controlnet-inputs
  • If ControlNet is enabled, the pipeline uses "ControlledUnet.mlmodelc" instead of "UNet.mlmodelc"

Do not erase the below when submitting your pull request:
#########

  • I agree to the terms outlined in CONTRIBUTING.md

@pj4533

pj4533 commented Apr 8, 2023

🎉 nice!

@@ -194,6 +208,9 @@ def bundle_resources_for_swift_cli(args):
("unet", "Unet"),
("unet_chunk1", "UnetChunk1"),
("unet_chunk2", "UnetChunk2"),
("control-unet", "ControledUnet"),
Collaborator:

NIT: Could we please change this to ControlledUnet?

var destinationG = try vImage_Buffer(width: Int(width), height: Int(height), bitsPerPixel: 8 * UInt32(MemoryLayout<Float>.size))
var destinationB = try vImage_Buffer(width: Int(width), height: Int(height), bitsPerPixel: 8 * UInt32(MemoryLayout<Float>.size))

var minFloat: [Float] = [-1.0, -1.0, -1.0, -1.0]
Collaborator:

The diff in this file looks unexpectedly large, could you please verify that the only changes are related to minFloat and maxFloat vars?

@atiorh (Collaborator) left a comment:

Amazing work @ryu38! I left a few comments that you could hopefully address. Do you mind adding the new CLI args (Python and Swift) in the README?

for n in 0..<results.count {
let result = results.features(at: n)
if currentOutputs.count < results.count {
let initOutput = result.featureNames.reduce(into: [String: MLMultiArray]()) { output, k in
Collaborator:

Let's use MLShapedArray instead of MLMultiArray

let result = results.features(at: n)
if currentOutputs.count < results.count {
let initOutput = result.featureNames.reduce(into: [String: MLMultiArray]()) { output, k in
output[k] = MLMultiArray(
Collaborator:

This would be a lot faster if we could pre-allocate the output with the expected size.

Contributor Author:

Is this suggesting that we should pre-allocate an MLShapedArray with a specific shape in the output dictionary? If we do this before the model results are available, would we create an MLShapedArray filled with zero values?

Collaborator:

Yes, create it with the right size and fill with zeros.
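The pattern the reviewer suggests can be illustrated in Python (the keys, shapes, and values below are hypothetical, and plain lists stand in for MLShapedArray):

```python
# Pre-allocate zero-filled output buffers of the expected size once, then
# accumulate each ControlNet's residuals into them, instead of creating the
# output containers lazily inside the results loop.
expected_shapes = {"down_block_res_0": 4, "mid_block_res": 2}

# Zero-filled buffers (the analogue of an MLShapedArray of zeros).
outputs = {k: [0.0] * n for k, n in expected_shapes.items()}

# Residuals from two ControlNet models, summed element-wise.
results = [
    {"down_block_res_0": [1.0, 2.0, 3.0, 4.0], "mid_block_res": [0.5, 0.5]},
    {"down_block_res_0": [1.0, 1.0, 1.0, 1.0], "mid_block_res": [0.5, 1.5]},
]

for result in results:
    for k, values in result.items():
        for i, v in enumerate(values):
            outputs[k][i] += v
```

Because the buffers exist up front with the right size, the loop only writes into them, which is the speedup the review comment is after.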

let fileName = model + ".mlmodelc"
return urls.controlNetDirURL.appending(path: fileName)
}
if (!controlNetURLs.isEmpty) {
Collaborator:

Suggested change
if (!controlNetURLs.isEmpty) {
if !controlNetURLs.isEmpty {

let unetURL: URL, unetChunk1URL: URL, unetChunk2URL: URL

// if ControlNet available, Unet supports additional inputs from ControlNet
if (controlNet == nil) {
Collaborator:

Suggested change
if (controlNet == nil) {
if controlNet == nil {

"timestep" : MLMultiArray(t),
"encoder_hidden_states": MLMultiArray(hiddenStates)
]
additionalResiduals?[$0.offset].forEach { (k, v) in
Collaborator:

Suggested change
additionalResiduals?[$0.offset].forEach { (k, v) in
for (k, v) in additionalResiduals?[$0.offset] {

@@ -29,6 +33,10 @@ public extension StableDiffusionPipeline {
safetyCheckerURL = baseURL.appending(path: "SafetyChecker.mlmodelc")
vocabURL = baseURL.appending(path: "vocab.json")
mergesURL = baseURL.appending(path: "merges.txt")
controlNetDirURL = baseURL.appending(path: "Controlnet")

Since torch2coreml seems to export to the controlnet directory, it seems like a good idea to start with lower case here as well.

Thanks for your great contribution!

@ryu38
Contributor Author

ryu38 commented Apr 15, 2023

Thank you for your reviews! I'll check or fix them one by one. I'll also update the README to cover the new args.

@atiorh
Collaborator

atiorh commented Apr 17, 2023

@ryu38 I see that you have pushed some commits addressing the feedback. Please let me know when you would like me to re-review :)

@atiorh
Collaborator

atiorh commented Apr 18, 2023

Update: I am running the final tests and will merge this PR when they pass. The latest commit seems to have addressed all the feedback, but I will do one more visual pass just in case.

@atiorh atiorh merged commit 7f65e1c into apple:main Apr 18, 2023
@ryu38
Contributor Author

ryu38 commented Apr 18, 2023

@atiorh I apologize for pushing a new commit just before the branch was merged. It addresses the remaining feedback and improves inference speed in ControlNet.swift.

@pj4533

pj4533 commented Apr 18, 2023

Just wow! Well done all. Can't wait to dig into this! 🎉🎉🎉🎉

@atiorh
Collaborator

atiorh commented Apr 19, 2023

Just realized the extra commit landed before the merge; that is my bad too! I have no concerns with the diff, though. Thanks for the contribution @ryu38!

@ryu38
Contributor Author

ryu38 commented Apr 19, 2023

@atiorh Thank you for your confirmation!
I'm happy that we were able to incorporate ControlNet in this project! 🙌

@TimYao18
Copy link

Excuse me,

I ran the following command:

python -m python_coreml_stable_diffusion.torch2coreml \
    --convert-vae-decoder --convert-vae-encoder --convert-unet \
    --unet-support-controlnet --convert-text-encoder \
    --model-version runwayml/stable-diffusion-v1-5 \
    --bundle-resources-for-swift-cli \
    --quantize-nbits 6 \
    --attention-implementation SPLIT_EINSUM_V2 \
    -o ~/MochiDiffusion/models && \
    python -m python_coreml_stable_diffusion.torch2coreml \
    --convert-unet --unet-support-controlnet \
    --model-version runwayml/stable-diffusion-v1-5 \
    --bundle-resources-for-swift-cli \
    --quantize-nbits 6 \
    --attention-implementation SPLIT_EINSUM_V2 \
    -o ~/MochiDiffusion/models

but only the following files are generated, with no Unet.mlmodelc:
ControlledUnet.mlmodelc
TextEncoder.mlmodelc
VAEDecoder.mlmodelc
VAEEncoder.mlmodelc
merges.txt
vocab.json

If I want a runnable model that supports ControlNet, what commands should I run?

@jrittvo
Copy link

jrittvo commented Aug 11, 2023

The files you ended up with are a working model, when used along with a ControlNet model. But they won't work without a ControlNet model. That is, they won't work for regular inference, or for Image2Image.

To also get the Unet.mlmodelc so that the base model will work with and without a ControlNet in the pipeline, remove --unet-support-controlnet from the second command (the one after the &&). That pass will now add the Unet.mlmodelc to the files from the first pass.

The --unet-support-controlnet option modifies the type of Unet created by the --convert-unet argument. With just --convert-unet you get a Unet.mlmodelc. With --convert-unet and --unet-support-controlnet together, you get a ControlledUnet.mlmodelc. With just --unet-support-controlnet you do not get any Unet.

Note: I believe that you will also need to use the --quantize-nbits 6 argument when converting the ControlNet model in order for it to work with a 6-bit base model.
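The flag combinations described above can be summarized in a small sketch. This is a hypothetical helper encoding the behavior as explained in this thread, not actual converter code:

```python
def unet_artifacts(convert_unet: bool, unet_support_controlnet: bool) -> list[str]:
    """Which UNet artifact a conversion pass produces, per the thread above."""
    if convert_unet and unet_support_controlnet:
        return ["ControlledUnet.mlmodelc"]
    if convert_unet:
        return ["Unet.mlmodelc"]
    # --unet-support-controlnet alone converts no UNet at all.
    return []
```

So to get both UNets, run two passes: one with --convert-unet alone and one with --convert-unet plus --unet-support-controlnet, as suggested above.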
