TensorRT extension is building engine with wrong parameters #333

Open
Sana-A-E opened this issue Jun 14, 2024 · 0 comments

Sana-A-E commented Jun 14, 2024

I built an SDXL engine via the TensorRT Exporter tab as follows:

  1. I selected "768x768 - 1024x1024|Batch Size 1-4" from the dropdown
  2. I opened Advanced settings and changed them to the following:
    - Min/Optimal/Max Batch Size: all set to 4
    - Min/Optimal/Max Width & Height: all set to 1024
    - Min/Optimal prompt token count: 75
    - Max prompt token count: 300

After building the engine with the parameters above, the resulting engine had all batch sizes set to 2 instead of 4, as reported when inspecting the engine through the TensorRT tab. (Note that the console output during the build reported the correct parameters.)
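To make the mismatch concrete, here is a minimal sketch that compares the batch sizes I requested with the ones the engine reports. `BatchRange` and `describe_mismatch` are hypothetical helpers written for this report, not part of the extension; the numbers are the ones from this issue.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BatchRange:
    """Min/optimal/max batch sizes of an engine profile (hypothetical helper)."""
    min: int
    opt: int
    max: int

    def accepts(self, batch_size: int) -> bool:
        # A batch size is usable only if it lies inside [min, max].
        return self.min <= batch_size <= self.max


def describe_mismatch(requested: BatchRange, reported: BatchRange) -> dict:
    """Return {field: (requested, reported)} for every field that differs."""
    fields = ("min", "opt", "max")
    return {
        name: (getattr(requested, name), getattr(reported, name))
        for name in fields
        if getattr(requested, name) != getattr(reported, name)
    }


requested = BatchRange(min=4, opt=4, max=4)  # values entered in Advanced settings
reported = BatchRange(min=2, opt=2, max=2)   # values shown for the built engine

print(describe_mismatch(requested, reported))
print("batch 4 usable:", reported.accepts(4))
print("batch 2 usable:", reported.accepts(2))
```

With a reported range of 2/2/2, batch size 4 falls outside the profile and batch size 2 falls inside it, which matches the behavior described in the rest of this report.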

To check whether this was only a UI bug, I tried generating images with the newly built engine. With batch size 4, generation failed and the UI reported the error "RuntimeError: The size of tensor a (4) must match the size of tensor b (8) at non-singleton dimension 0".

Console reports exception:

  File "\extensions\Stable-Diffusion-WebUI-TensorRT\scripts\trt.py", line 70, in forward
    self.engine.allocate_buffers(feed_dict)
  File "\extensions\Stable-Diffusion-WebUI-TensorRT\utilities.py", line 304, in allocate_buffers
    tuple(shape), dtype=numpy_to_torch_dtype_dict[dtype]
ValueError: __len__() should return >= 0
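For context on the message itself: `__len__() should return >= 0` is what CPython raises when an object's `__len__` method returns a negative number. The plain-Python sketch below reproduces that message with a hypothetical `BrokenShape` class. It is only an illustration of where such a message can come from; whether the `shape` object in `allocate_buffers` actually behaves this way is an assumption I have not verified.

```python
class BrokenShape:
    """Hypothetical stand-in for a shape object that reports an invalid length."""

    def __len__(self):
        # A negative length is never valid; CPython rejects it in len().
        return -1


def length_error_message(obj):
    """Return the ValueError message raised by len(obj), or None if len() succeeds."""
    try:
        len(obj)
    except ValueError as exc:
        return str(exc)
    return None


print(length_error_message(BrokenShape()))
print(length_error_message([1, 2, 3]))
```

If the assumption holds, the exception would point at an invalid shape for the requested batch size rather than at a dtype problem.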

As expected, generating images with batch size 2 worked without issue, which confirms that the generated engine really does have the wrong parameters.

I then tried to rebuild the engine with batch size 4, but the exporter reported that the engine I was trying to build already exists. Forcing a rebuild produced an engine with batch size 2 again instead of the batch size 4 I wanted.

I made another attempt at producing an engine with batch size 4, this time changing the min batch size to 2.
The resulting engine had min/optimal/max batch size set to 1/4/4. Still not correct, but close enough to be usable while I wait for a fix.
