Add SD3.5-large 4bit quantized #43

EduardoPach · 2024-10-28T13:54:10Z

What does this PR do?

This pull request adds support for a 4-bit quantized variant of SD3.5-large (argmaxinc/mlx-stable-diffusion-3.5-large-4bit-quantized). The changes update the model configuration tables, the loading functions, and the script parameters so that the new variant can be selected and loaded.

Model Configuration Updates:

python/src/diffusionkit/mlx/__init__.py: Added entries for the 4-bit quantized model in MMDIT_CKPT and T5_MAX_LENGTH.
python/src/diffusionkit/mlx/model_io.py: Updated _MODELS, _PREFIX, _CONFIG, DEPTH, and MAX_LATENT_RESOLUTION to include the new model.
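
As a rough illustration of this kind of change, the sketch below registers one model key in several per-model lookup tables and checks that no table was missed. The table names come from the list above, but their shapes and every value shown are placeholders assumed for illustration, not the contents of the actual files; `missing_tables` is a hypothetical helper, not part of DiffusionKit.

```python
MODEL_KEY = "argmaxinc/mlx-stable-diffusion-3.5-large-4bit-quantized"

# Hypothetical stand-ins for the per-model tables named in the PR.
# All values are placeholders, not the real configuration.
MMDIT_CKPT = {MODEL_KEY: "placeholder-checkpoint-name"}
T5_MAX_LENGTH = {MODEL_KEY: 512}
DEPTH = {MODEL_KEY: 38}
MAX_LATENT_RESOLUTION = {MODEL_KEY: 192}

TABLES = {
    "MMDIT_CKPT": MMDIT_CKPT,
    "T5_MAX_LENGTH": T5_MAX_LENGTH,
    "DEPTH": DEPTH,
    "MAX_LATENT_RESOLUTION": MAX_LATENT_RESOLUTION,
}


def missing_tables(model_key, tables):
    """Return the sorted names of tables that have no entry for model_key."""
    return sorted(name for name, table in tables.items() if model_key not in table)


# A new model variant is fully registered only when no table is missing it.
unregistered = missing_tables(MODEL_KEY, TABLES)
```

A check like this is one way to catch a variant that was added to some tables but not others, which would otherwise surface as a KeyError at load time.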

Model Loading Adjustments:

python/src/diffusionkit/mlx/model_io.py: Modified load_mmdit, load_vae_decoder, and load_vae_encoder to handle the specific requirements of the 4-bit quantized model, including prefix adjustments and quantization handling. The VAE loading functions were changed because the checkpoint was pushed with the modifications that these functions normally apply already in place.
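
The two ideas mentioned here, prefix adjustment and quantization handling, can be sketched in plain Python as follows. This is not the code from the PR: `strip_prefix` and `looks_quantized` are hypothetical helpers, the key names are made up, and the `.scales` suffix check rests on the assumption that quantized layers store a `scales` tensor next to their `weight`.

```python
def strip_prefix(weights, prefix):
    """Return a new dict with `prefix` removed from every key that starts with it."""
    return {
        (key[len(prefix):] if key.startswith(prefix) else key): value
        for key, value in weights.items()
    }


def looks_quantized(weights):
    """Guess whether a checkpoint is quantized from its key names.

    Assumption: quantized layers carry a `scales` entry alongside `weight`.
    """
    return any(key.endswith(".scales") for key in weights)


# Made-up checkpoint keys; integers stand in for tensors.
checkpoint = {
    "prefix.layer.weight": 1,
    "prefix.layer.scales": 2,
    "other.bias": 3,
}

renamed = strip_prefix(checkpoint, "prefix.")
needs_quantized_layers = looks_quantized(renamed)
```

A loader following this pattern would decide from `needs_quantized_layers` whether to convert the model's layers to their quantized form before assigning the renamed weights.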

arda-argmax

LGTM!

EduardoPach added 2 commits October 28, 2024 14:39

add: sd-3.5 4-bit quantized

5f8a185

remove: unnecessary arg in generate_image

1d05fce

EduardoPach requested review from arda-argmax and atiorh October 28, 2024 13:54

arda-argmax approved these changes Oct 29, 2024

arda-argmax merged commit d737473 into argmaxinc:main Oct 29, 2024
1 check passed
