"real-time"? #6

Open
SoftologyPro opened this issue Nov 22, 2024 · 8 comments
Comments

SoftologyPro commented Nov 22, 2024

Running locally on Windows with a 24GB 4090.
python inference.py --ckpt_dir "D:\Tests\LTX-Video\LTX-Video\models" --prompt "roses in the rain" --height 512 --width 768 --num_frames 257 --seed 12345

Stats show
| 2/40 [04:14<1:19:10, 125.02s/it]

I do have the CUDA build of PyTorch installed, and Task Manager shows the GPU at 100%, with 23.4/24.0 GB VRAM used.
What hardware did you use to get it to generate "faster than it takes to watch them"?

Any tips for speeding up local generation on a 4090?

@StatusReport (Member)

If you go to the LTXV fal playground, which runs on an H100, you'll see it runs at around 9 it/s, which is definitely real time (assuming 20-30 steps per generation).

Memory usage grows with the number of frames you generate. Until we optimize VRAM usage further, try a lower frame count (such as 129). On my 4090 I'm getting around 1.3 it/s, and the community has already managed to optimize it even further.
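
For example, rerunning the command from the first comment with only the frame count lowered (all other flags left as they were):

python inference.py --ckpt_dir "D:\Tests\LTX-Video\LTX-Video\models" --prompt "roses in the rain" --height 512 --width 768 --num_frames 129 --seed 12345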


SpaceCowboy850 commented Nov 23, 2024

Can confirm I'm getting the same thing as SoftologyPro, even after setting --num_frames to 129 or reducing the resolution to 256x256.
VRAM sits at 23.5/24 GB used on a 4090.

Windows 11
Python 3.10.15
CUDA 12.4
PyTorch 2.5.1

This:
python inference.py --ckpt_dir ../weights/hf/LTX_Video --prompt "roses in the rain" --height 256 --width 256 --num_frames 65 --seed 42

Gets me 11.14 s/it.

A 2-second video took 7 minutes to generate.


SoftologyPro commented Nov 23, 2024

> If you go to the LTXV fal playground, which runs on an H100, you'll see it runs at around 9 it/s, which is definitely real time (assuming 20-30 steps per generation).

Maybe add that to the readme, i.e. it needs an H100 80 GB GPU to run at any decent speed.

I tried my command line above on a 3090 and it has been running for 3 hours so far, with 24 hours remaining.

> Memory usage grows with the number of frames you generate. Until we optimize VRAM usage further, try a lower frame count (such as 129). On my 4090 I'm getting around 1.3 it/s, and the community has already managed to optimize it even further.

Can you give me the command line you use to get that performance on your 4090?

@ootsuka-repos

[Screenshot attached: 2024-11-23 11-05-24]

The inference was run on a 48GB GPU.
The results are smooth and successful.

python inference.py --ckpt_dir '/home/user/Desktop/git/LTX-Video' --prompt "Dog resting on grass." --input_image_path /home/user/Downloads/Dog_Breeds.jpg --height 480 --width 720 --num_frames 72 --seed 42

It seems that the GPU memory requirements are quite high.


SoftologyPro commented Nov 23, 2024

If you have a normal consumer GPU, use this ComfyUI workflow.
https://comfyanonymous.github.io/ComfyUI_examples/ltxv/?ref=blog.comfy.org
All you need to do is download the 2 models and import the workflow. It generates in 30 seconds on a 4090.

@jpgallegoar

> The inference was run on a 48GB GPU. The results are smooth and successful.
>
> python inference.py --ckpt_dir '/home/user/Desktop/git/LTX-Video' --prompt "Dog resting on grass." --input_image_path /home/user/Downloads/Dog_Breeds.jpg --height 480 --width 720 --num_frames 72 --seed 42
>
> It seems that the GPU memory requirements are quite high.

What's the difference between the Comfy workflow and inference.py? In Comfy the 4090 took 1 min, and with inference.py it took 2 h (probably OOM), but the inference.py version was better quality.

@Hangsiin

> If you have a normal consumer GPU, use this ComfyUI workflow: https://comfyanonymous.github.io/ComfyUI_examples/ltxv/?ref=blog.comfy.org All you need to do is download the 2 models and import the workflow. It generates in 30 seconds on a 4090.

Thanks! It works very well on a 4090.

@loretoparisi

> https://comfyanonymous.github.io/ComfyUI_examples/ltxv/?ref=blog.comfy.org

Hmm, there are a number of optimizations within ComfyUI; see https://github.com/comfyanonymous/ComfyUI/blob/master/comfy/model_management.py

But they are hard to extract into a separate script.
For example, one approach would be to add pipeline support for VAE slicing and tiling, which is not enabled by default:

pipe.vae.enable_slicing()
pipe.vae.enable_tiling()

as is done in the CogVideoXPipeline via enable_slicing and enable_tiling.
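
For reference, here is a minimal sketch of how those same two calls are used with diffusers' CogVideoXPipeline; an equivalent hook would still have to be wired into LTX-Video's own inference.py, and the model ID and prompt below are only placeholders:

import torch
from diffusers import CogVideoXPipeline

# Placeholder checkpoint; any CogVideoX model is loaded the same way.
pipe = CogVideoXPipeline.from_pretrained(
    "THUDM/CogVideoX-2b", torch_dtype=torch.float16
).to("cuda")

# Decode the latent video in slices/tiles to cap peak VRAM during the
# VAE decode step, trading a little speed for a much smaller memory spike.
pipe.vae.enable_slicing()
pipe.vae.enable_tiling()

video = pipe(prompt="roses in the rain", num_frames=49).frames[0]

Whether the LTX-Video VAE can expose the same slicing/tiling switches is the open question here.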
