Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[CMSIS-NN] Fix memory alignment bug in CMSIS-NN demo #11221

Merged
merged 1 commit into from
May 9, 2022

Conversation

grant-arm
Copy link
Contributor

There is a latent bug in the CMSIS-NN demo app where the input and output tensors generated by the create_image.py script are not 16-byte aligned in memory. Although this does not cause an issue in the demo using the current person_detect model, if a different model is substituted with a larger output tensor, it causes the FVP to hang in certain cases.

This PR updates create_image.py to correct the issue.

@Mousius @ashutosh-arm @areusch

@ashutosh-arm
Copy link
Contributor

@grant-arm Thanks for the PR. Just for my understanding, why do we need to align .rodata to 16 bytes? I have seen this for constants in the code generated C files as well. Also, does it hold good for all variations of Arm® Cortex®-M processors?

@Mousius
Copy link
Member

Mousius commented May 6, 2022

@grant-arm Thanks for the PR. Just for my understanding, why do we need to align .rodata to 16 bytes? I have seen this for constants in the code generated C files as well. Also, does it hold good for all variations of Arm® Cortex®-M processors?

Hi @ashutosh-arm, the reason is because we default the workspace allocation alignment to 16 bytes in a few places, for example:

Integer workspace_byte_alignment =
executor_config->GetAttr<Integer>("workspace-byte-alignment").value_or(16);

Which means we have to ensure we start on a 16 byte boundary to avoid overflows when we round up the allocations in memory planning. I believe we chose this because it's the max alignment required across all devices (including microNPUs), we could actually detect this and align as per the architecture at some point but the savings are minimal.

@ashutosh-arm
Copy link
Contributor

@grant-arm Thanks for the PR. Just for my understanding, why do we need to align .rodata to 16 bytes? I have seen this for constants in the code generated C files as well. Also, does it hold good for all variations of Arm® Cortex®-M processors?

Hi @ashutosh-arm, the reason is because we default the workspace allocation alignment to 16 bytes in a few places, for example:

Integer workspace_byte_alignment =
executor_config->GetAttr<Integer>("workspace-byte-alignment").value_or(16);

Which means we have to ensure we start on a 16 byte boundary to avoid overflows when we round up the allocations in memory planning. I believe we chose this because it's the max alignment required across all devices (including microNPUs), we could actually detect this and align as per the architecture at some point but the savings are minimal.

Thanks for the detailed explanation @Mousius.

* Updates convert_image.py to include memory alignment

Change-Id: I41c9a1f86a73450d64fb196a27994929ad16366f
@grant-arm grant-arm force-pushed the fix_cmsis_nn_demo branch from 7971d0d to 8bc9fe3 Compare May 6, 2022 12:44
@Mousius Mousius merged commit 9e404f0 into apache:main May 9, 2022
mehrdadh pushed a commit to mehrdadh/tvm that referenced this pull request May 16, 2022
* Updates convert_image.py to include memory alignment
shtinsa pushed a commit to Deelvin/tvm that referenced this pull request May 17, 2022
* Updates convert_image.py to include memory alignment
shingjan pushed a commit to shingjan/tvm that referenced this pull request May 17, 2022
* Updates convert_image.py to include memory alignment
SebastianBoblest pushed a commit to SebastianBoblest/tvm that referenced this pull request May 27, 2022
* Updates convert_image.py to include memory alignment
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants