The New Stable Audio Open 1.0 Sampler In a ComfyUI Node. Make some beats!
In this workflow, I have random `cfg_scale`, `sigma_min` and `steps` values making variations on the same 16 beats, with the same `prompt` and `seed`. VOLUME WARNING!
output-2.mp4
- The longer your audio, the more VRAM you need to stitch it together
- On a 3060, we've tried up to 10 seconds so far
- Go to Stable Audio Open on HuggingFace and download the `model.safetensors` and `model.config.json` files.
- Place the files in the `models/audio_checkpoints` folder. If you don't have one, make one in your Comfy folder.
- Open Comfy and `StableAudioLoader` will see your model and config.
- Make sure you have your `HF_TOKEN` environment variable set for Hugging Face, because model loading doesn't yet work directly from a saved file.
- Go ahead and download the model from Stable Audio Open on HuggingFace for when we fix that.
- Make sure to run `pip install -r requirements.txt` inside the repo folder if you're not using Manager.
- It should just run if you've got your environment variable set up.
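As a quick sanity check before opening Comfy, the setup steps above can be verified with a few lines of Python. This is only a sketch and not part of the repo: `check_setup` is a hypothetical helper, and the `"ComfyUI"` root path is an assumption you should replace with your own install location.

```python
import os
from pathlib import Path


def check_setup(comfy_root, env=os.environ):
    """Report whether HF_TOKEN is set and which files sit in models/audio_checkpoints."""
    folder = Path(comfy_root) / "models" / "audio_checkpoints"
    # List the folder only if it exists; otherwise report nothing found.
    files = sorted(p.name for p in folder.iterdir()) if folder.is_dir() else []
    return {
        "hf_token_set": bool(env.get("HF_TOKEN")),
        "folder_exists": folder.is_dir(),
        "weights": [f for f in files if f.endswith(".safetensors")],
        "configs": [f for f in files if f.endswith(".json")],
    }


if __name__ == "__main__":
    # "ComfyUI" is an assumed install location; change it to your own path.
    print(check_setup("ComfyUI"))
```

If `hf_token_set` is `False` or the `weights` and `configs` lists are empty, revisit the steps above before loading the workflow.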
There will definitely be issues because this is so new and it was coded quickly, so we couldn't fully test it.
This is not an official StableAudioOpen repository.
- Load your own models!
- Optionally runs in half precision
- Nodes
  - A Sampler Node: now with seed control, positive and negative prompts
  - A Pre-Conditioning Node: kind of like empty latent audio with batch option
  - A Prompt Node: Pipes conditioning
  - A Model Loading Node: Includes repo options and scans `models/audio_checkpoints` for models and config.json files
- `control_after_generate` option
- Audio to Audio (like in the Gradio Example). Still working on a fix for this
- Can still use HF env key if you want
- Generates audio and outputs raw bytes and a sample rate for use with VHS
- Includes all of the original Stable Audio Open parameters
- Sampler outputs a Spectrogram image (experimental)
- Can save audio to file
- New Prefix Templates for save file naming
- Outputs a temporary `wav` to `temp/stableaudiosampler.wav` that you can use for looping, like in this video.
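The looping idea from the last item can also be tried outside Comfy. The sketch below uses Python's standard `wave` module to write several back-to-back copies of a file. It assumes the temporary file is plain PCM (the `wave` module does not read float-encoded wavs), and `loop_wav` and the output path are hypothetical names, not something the node provides.

```python
import wave


def loop_wav(src_path, dst_path, repeats):
    """Write `repeats` back-to-back copies of the audio in src_path to dst_path."""
    with wave.open(src_path, "rb") as src:
        params = src.getparams()
        frames = src.readframes(src.getnframes())
    with wave.open(dst_path, "wb") as dst:
        # Reuse channel count, sample width and sample rate from the source;
        # the frame count in the header is corrected when the file is closed.
        dst.setparams(params)
        dst.writeframes(frames * repeats)


# Example call (paths are assumptions):
# loop_wav("temp/stableaudiosampler.wav", "temp/looped.wav", 4)
```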
The part I use AnyNode for is just getting random values within a range for `cfg_scale`, `steps` and `sigma_min`. Thanks to feedback from the community and some tinkering, I think I found a way in this workflow to get endless sequences from the same seed/prompt in any key (because I mentioned what key the synth lead needed to be in).
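For anyone who wants the same randomization without AnyNode, here is a minimal Python sketch of drawing the three values within a range. The ranges are hypothetical and chosen only for illustration, and `random_settings` is not a function from this repo.

```python
import random

# Hypothetical ranges, chosen only for illustration; tune them by ear.
RANGES = {
    "cfg_scale": (4.0, 9.0),
    "sigma_min": (0.1, 0.5),
}
STEP_RANGE = (50, 150)


def random_settings(rng):
    """Draw one set of sampler settings; the prompt and seed stay fixed elsewhere."""
    settings = {
        name: round(rng.uniform(lo, hi), 2) for name, (lo, hi) in RANGES.items()
    }
    settings["steps"] = rng.randint(*STEP_RANGE)  # inclusive on both ends
    return settings


rng = random.Random()
for _ in range(3):
    print(random_settings(rng))
```

Each printed dict is one variation; feeding these into the sampler while holding the prompt and seed constant is what produces the variations on the same beat.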
The new save prefix templating makes it easy to just look at the file name and know what settings were used (since wav doesn't have PNGinfo).
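To show the idea of encoding settings in the file name, here is a small Python sketch. The placeholder syntax is plain `str.format` and is an assumption for illustration; it is not necessarily the syntax the node's prefix templates use, and `build_filename` is a hypothetical helper.

```python
def build_filename(template, **settings):
    """Fill a filename prefix template with the settings used for a render."""
    return template.format(**settings) + ".wav"


# Hypothetical template; the placeholder syntax is an assumption, not the node's own.
template = "beat_seed{seed}_cfg{cfg_scale:.2f}_steps{steps}_smin{sigma_min:.2f}"
print(build_filename(template, seed=42, cfg_scale=7.0, steps=100, sigma_min=0.3))
# prints: beat_seed42_cfg7.00_steps100_smin0.30.wav
```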
Keeping track of requests and ideas as they come in:
- Stereo output
- Nodes
  - A Mixer Node (mix your audio outputs with some sort of mastering)
  - A Tiling Sampler (concatenate the audios)
- More Sampler Node Options
  - Gain
  - Possibly Clipping at some dB
  - Cleaning up some of the current options with selectors, etc.
- Upfi (upscaling fidelity)
- Audio Preview Node
If you get the `progressbar` error, you can use our new utility from the latest update.
cd ComfyUI/custom_nodes/ComfyUI-StableAudioSampler
python util_discrepancies.py progressbar
You will see something like this...
In this screenshot, you see `protobuf`, but that is only because I don't have version issues with `progressbar`.
Note: If you install one of the suggested versions, StableAudioSampler should work, but at the same time, it might break other packages.
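Before changing any versions, it can help to see what is currently installed. The sketch below is not the repo's `util_discrepancies.py`; it is a generic check using Python's standard `importlib.metadata`. The distribution names are assumptions (for example, your environment may have `progressbar2` rather than `progressbar`).

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string for `package`, or None if it is missing."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Distribution names here are assumptions; adjust them to match your environment.
for name in ("progressbar", "protobuf"):
    print(name, installed_version(name))
```

Comparing these against the versions the utility suggests shows which packages would actually change if you follow a suggestion.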
We are very open to anyone who wants to contribute from the open source community. Make your forks and pull requests. We will build something cool. If it's already on the roadmap, chances are we're already working on it, so check in with us. We will start a dev branch.
If you have a request for a feature, open an issue about it and it will be seen.
Happy Diffusing!