Alias allocations with same size elements despite different dtypes #665
Conversation
Assuming it is safe to do this, I am a bit surprised NVRTC doesn't find this opportunity by itself.
!build
Are you saying my claim in #221 about "reusing registers are useless" is wrong? This is 🤯.
I think so! I was planning to try this for smem tensors, then noticed it has an effect for registers too. Maybe I am missing some reason this is unsafe, which would explain why it is not done automatically.
See #2934 (comment). PR #665 allowed us to re-use allocations that have different dtypes. We already check that aliased tensors do not have vectorized accesses larger than those of the original tensors. However, when the dtypes differ we `reinterpret_cast` the buffer to a different `Array` type. Previously we did not specify any alignment in that type's template args, so an alignment of 1 was assumed. Since the actual addresses are all still aligned, this did not cause misaligned accesses at runtime. This PR sets the alignment template arg to the vectorized access width of the alias tensor, so that the compiler can hypothetically perform optimizations knowing the address is aligned.
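To make the alignment point concrete, here is a minimal sketch assuming an `Array` type modeled loosely on nvFuser's runtime helper; the exact definition, the `align_size` default, and the function name are assumptions for illustration:

```cpp
// Sketch only: an aligned array type modeled on nvFuser's runtime
// helper. The real definition lives in nvFuser's runtime headers.
template <typename scalar_t, int size, int align_size = 1>
struct alignas(sizeof(scalar_t) * align_size) Array {
  scalar_t array[size];
};

// In a real kernel this would be device code; plain C++ suffices to
// show the casts.
void alias_example(float* buf) {
  // Suppose buf points into an aliased allocation that is in fact
  // 16-byte aligned.

  // Before this PR: no alignment template arg, so align_size defaults
  // to 1 and the compiler may only assume single-element (4-byte)
  // alignment through this pointer.
  auto* before = reinterpret_cast<Array<float, 4>*>(buf);

  // After this PR: the vectorized access width (4) is passed as
  // align_size, so the compiler knows the address is 16-byte aligned
  // and can, for example, emit a single 128-bit load.
  auto* after = reinterpret_cast<Array<float, 4, 4>*>(buf);

  (void)before;
  (void)after;
}
```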
Currently the `reuseMemoryAllocations` lowering pass reuses `TensorView`s if they have compatible shapes and vectorization, and the same dtype. This PR replaces the same-dtype condition with a check that the size of each element is the same (see the sketch below).

It is somewhat rare in our test suite that we find opportunities to re-use one buffer for another of a different dtype, but it does happen. For example, the case in `FusionWelfordShmoo_CUDA` corresponding to `testWelford(DataType::ComplexFloat, 1, 320, 256);` changes with this PR: we now re-use two register `TensorView`s whose dtype changes from `int64_t` to `std::complex<float>`.
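A hedged sketch of the relaxed condition; the struct and helper name below are illustrative assumptions rather than nvFuser's actual code, but the element-size comparison is the substance of the change:

```cpp
#include <complex>
#include <cstddef>
#include <cstdint>

// Illustrative stand-in for the per-tensor info the pass compares; the
// real pass also checks shape and vectorization compatibility.
struct AllocInfo {
  std::size_t element_size;  // sizeof one element of the tensor's dtype
};

// Before: the dtypes had to match exactly.
// After: only the element sizes must match (hypothetical helper name).
bool sameElementSize(const AllocInfo& candidate, const AllocInfo& original) {
  return candidate.element_size == original.element_size;
}

// The Welford example above is exactly such a pair: int64_t and
// std::complex<float> are both 8 bytes, so their buffers can alias.
static_assert(sizeof(std::int64_t) == sizeof(std::complex<float>),
              "same element size despite different dtypes");
```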
Re-using shared memory in these cases is likely always beneficial, but the effect of explicitly re-using registers is hard to analyze since the compiler pipeline is so complicated. It is likely rarely harmful, so this behavior is on by default. However, it can be disabled using `DisableOption::ReuseMismatchedTypeRegisters` or `NVFUSER_DISABLE=reuse_mismatched_type_registers`.
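For illustration, a self-contained sketch of the opt-out pattern; the `getenv`-based parsing below is an assumption for demonstration (nvFuser's real options parser handles comma-separated lists and validation):

```cpp
#include <cstdlib>
#include <cstring>
#include <iostream>

// Sketch only: check whether the user disabled register reuse across
// mismatched dtypes via the environment. Substring matching is a
// simplification of real NVFUSER_DISABLE parsing.
static bool reuseMismatchedTypeRegistersDisabled() {
  const char* v = std::getenv("NVFUSER_DISABLE");
  return v != nullptr &&
      std::strstr(v, "reuse_mismatched_type_registers") != nullptr;
}

int main() {
  std::cout << "register reuse across dtypes: "
            << (reuseMismatchedTypeRegistersDisabled() ? "disabled"
                                                       : "enabled")
            << "\n";
  return 0;
}
```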