Compilation flags for dispatch formation for deeplabv3 #515
@MaheshRavishankar With these flags, it now generates 83 dispatches. Most dispatches look reasonable to me, but there are still 7 standalone transpose dispatches.

All the dispatches can be found here: https://github.com/nod-ai/npu-benchmark/blob/main/processed_dispatches.zip
Can you provide the dump of the IR after …?
Yes, here is the IR dump: https://gist.github.com/yzhang93/456640440608e48550308bf87245523c
There are some obvious things here that could help more:
- This probably requires the depthwise convs also to be converted to NHWC; then the transposes should fold away.
- These two should be the same dispatch. I don't know why they aren't. Something is going wrong with …

Fixing these 3 things should get us into much better shape.
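To illustrate why propagating the NCHW-to-NHWC conversion through the depthwise convs makes the standalone transpose dispatches disappear: a transpose followed by its inverse permutation is the identity, so once both the producer and consumer ops agree on NHWC layout, the back-to-back transposes between them can fold to a no-op. A minimal NumPy sketch of that folding property (not IREE code):

```python
import numpy as np

# NCHW tensor with distinct values so any layout mistake is visible.
x = np.arange(2 * 3 * 4 * 5).reshape(2, 3, 4, 5)

to_nhwc = (0, 2, 3, 1)  # NCHW -> NHWC
to_nchw = (0, 3, 1, 2)  # NHWC -> NCHW (inverse permutation)

# Transpose to NHWC, then immediately back: the pair is the identity,
# which is the folding opportunity the transpose-propagation flags expose.
y = x.transpose(to_nhwc).transpose(to_nchw)
assert np.array_equal(y, x)
```

If one op in the chain (e.g. a depthwise conv) stays NCHW, the transposes around it no longer pair up as inverses, and each survives as its own dispatch.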
@MaheshRavishankar I tried to pad the conv ops, and this is the IR afterwards: https://gist.github.com/yzhang93/d0b09b559800f74314eb2d95c0aa2b7d. Here I modified the code to pad not only the intrinsic dimensions (OW, OC, IC) but also the OH dimension, which has to be padded in order to distribute inputs evenly across the 4 AIE cores. After padding I noticed some issues: …
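The OH-padding requirement above is just rounding the dimension up to a multiple of the core count so each of the 4 AIE cores gets an equal slice. A small sketch of that arithmetic (the function name `pad_to_multiple` is mine for illustration, not from the repo):

```python
def pad_to_multiple(dim: int, multiple: int) -> int:
    """Padding needed so `dim` becomes evenly divisible by `multiple`."""
    return (multiple - dim % multiple) % multiple

# e.g. an output height of 65 rows distributed over 4 AIE cores:
# pad by 3 so each core processes (65 + 3) / 4 = 17 rows.
assert pad_to_multiple(65, 4) == 3
# Already divisible: no padding needed.
assert pad_to_multiple(64, 4) == 0
```

The trade-off is the usual one with padding: the extra rows are wasted compute, so the smaller the remainder, the cheaper the even distribution.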
@MaheshRavishankar Well, it's not just a simple fold; it goes from … . This separation of conv and elementwise ops still appears in the quantized model.
The DeeplabV3 i8 model compilation by default might not create efficient dispatches. Some flags to try:

To start with we need

```
--iree-flow-enable-aggressive-fusion --iree-opt-data-tiling=off
```

To fuse padding with consumer convolutions we would need to add

```
--iree-flow-enable-fuse-padding-into-linalg-consumer-ops
```

To enable conversion of NCHW convolutions to NHWC we would need

```
--iree-preprocessing-pass-pipeline=builtin.module(iree-preprocessing-transpose-convolution-pipeline)
--iree-global-opt-propagate-transposes=true --iree-opt-aggressively-propagate-transposes=true
```
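Putting the pieces above together, a combined invocation might look like the sketch below. The flags are exactly the ones listed above; the tool name `iree-compile` is IREE's standard compiler driver, but the input and output filenames are placeholders, and any additional target-selection flags the AIE backend needs are omitted:

```shell
# Hypothetical combined invocation; filenames are placeholders.
iree-compile \
  --iree-flow-enable-aggressive-fusion \
  --iree-opt-data-tiling=off \
  --iree-flow-enable-fuse-padding-into-linalg-consumer-ops \
  --iree-preprocessing-pass-pipeline="builtin.module(iree-preprocessing-transpose-convolution-pipeline)" \
  --iree-global-opt-propagate-transposes=true \
  --iree-opt-aggressively-propagate-transposes=true \
  deeplabv3_i8.mlir -o deeplabv3_i8.vmfb
```

Note the pass-pipeline value is quoted because it contains parentheses, which most shells would otherwise try to interpret.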