Default to 16-byte alignment in malloc #14456

sbc100
Collaborator

@sbc100 sbc100 commented Jun 15, 2021

By default we now use alignof(max_align_t) in both emmalloc and
dlmalloc. This is a requirement of the C and C++ standards.

Developers can opt into the old behaviour using -s MALLOC=dlmalloc-align8 or -s MALLOC=emmalloc-align8.

Based on #10110 which was authored by @Akaricchi.

Fixes: #10072
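
For readers who want to check the property this PR is about, here is a minimal sketch (not part of the patch) that counts `malloc` results falling short of `alignof(max_align_t)`. The helper names `is_aligned` and `count_misaligned_mallocs` are made up for illustration. Request sizes start at `sizeof(max_align_t)` because some allocators only promise full alignment for requests at least that large.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Returns 1 if p is a multiple of `alignment` (which must be a power of two). */
static int is_aligned(const void *p, size_t alignment) {
  assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
  return ((uintptr_t)p & (alignment - 1)) == 0;
}

/* Hypothetical helper: performs `count` allocations of growing size and
   returns how many results were not aligned to alignof(max_align_t). */
static size_t count_misaligned_mallocs(size_t count) {
  size_t misaligned = 0;
  for (size_t i = 0; i < count; i++) {
    void *p = malloc(sizeof(max_align_t) + i);
    if (p != NULL && !is_aligned(p, alignof(max_align_t))) {
      misaligned++;
    }
    free(p);
  }
  return misaligned;
}
```

On a conforming toolchain `count_misaligned_mallocs` should return 0. A non-zero result is the symptom described in #10072.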

@sbc100 sbc100 requested review from kripken and juj June 15, 2021 18:38
@sbc100 sbc100 force-pushed the fix_malloc_align branch 2 times, most recently from c7750f3 to 5c01997 Compare June 15, 2021 18:43
correct (since `alignof(max_align_t)` is 16 for the WebAssembly clang target)
and fixes several issues we have seen in the wild. Since some programs can
benefit having a lower alignment we have added `dlmalloc-align8` and
`emmalloc-align8` variants which can use used with the `-s MALLOC` option to
Contributor

which can use used

typo

@kripken
Member

kripken commented Jun 15, 2021

We had some hope of avoiding the speed and size regressions this causes, by changing the alignment of long double to be just 8 and not 16 (which would have no harmful effects) and by ignoring SIMD (which is not mentioned in the C/C++ specs AFAIK - but may have concerns in practice). Has there been more investigation there?

@sbc100
Collaborator Author

sbc100 commented Jun 15, 2021

We had some hope of avoiding the speed and size regressions this causes, by changing the alignment of long double to be just 8 and not 16 (which would have no harmful effects) and by ignoring SIMD (which is not mentioned in the C/C++ specs AFAIK - but may have concerns in practice). Has there been more investigation there?

I considered that option but I don't think it's a reasonable path right now (literally today) for a couple of reasons:

  1. It would require an ABI change upstream in llvm which could take time and be controversial in and of itself.
  2. There seems to be a lot of opinion on #10110 (Make dlmalloc and emmalloc align to max_align_t) suggesting that going down that path and allowing allocations that are not aligned to simd128 would be a pain/regression/not a good idea.

The idea behind this change is to get something that we can land today and keep everyone mostly happy. We can always try to change our max_align_t separately, but I think that can be a separate discussion.

@sbc100 sbc100 force-pushed the fix_malloc_align branch 3 times, most recently from 5a4f642 to 62c6ab6 Compare June 16, 2021 03:18
@sbc100 sbc100 force-pushed the fix_malloc_align branch from 62c6ab6 to 2769963 Compare June 16, 2021 05:18
@Akaricchi
Contributor

There seems to be a lot of opinion on #10110 suggesting that going down that path and allowing allocations that are not aligned to simd128 would be a pain/regression/not a good idea.

Emscripten had always behaved this way with its 8-byte aligning malloc, so it can't possibly be a regression. Personally I don't think it should be malloc's responsibility to hold SIMD programmers' hands, so I'd be in favor of reducing long double/max_align_t alignment to 8 bytes. But I don't care all that much, as long as the behavior is conformant, thus I support explicitly aligning to max_align_t.

@kripken
Member

kripken commented Jun 16, 2021

Focusing on the practical side, where have the bug reports on this happened? I might change my mind here if it's on SIMD code, in particular. But if it's on float128 or something else, I'd prefer to not regress.

About the ABI aspect, do you think it would be controversial? float128 doesn't need to be aligned for any reason I can see, as no hardware actually loads it and no wasm spec refers to it. Or has that changed?

@sbc100
Collaborator Author

sbc100 commented Jun 16, 2021

Focusing on the practical side, where have the bug reports on this happened? I might change my mind here if it's on SIMD code, in particular. But if it's on float128 or something else, I'd prefer to not regress.

As far as I can tell the bug reports relate to max_align_t not being honored by malloc. They don't relate to SIMD per se, just our spec non-conformance breaking folks' assumptions.

About the ABI aspect, do you think it would be controversial? float128 doesn't need to be aligned for any reason I can see, as no hardware actually loads it and no wasm spec refers to it. Or has that changed?

@sbc100
Collaborator Author

sbc100 commented Jun 16, 2021

Focusing on the practical side, where have the bug reports on this happened? I might change my mind here if it's on SIMD code, in particular. But if it's on float128 or something else, I'd prefer to not regress.

Speaking of the practical side, do we have any evidence that real-world programs are actually affected by this increase in the malloc alignment? It's hard to imagine a performance-sensitive program that also happens to contain millions of micro allocations. Isn't looking at allocation strategies one of the first things anyone optimizing their code would do?

@kripken
Member

kripken commented Jun 16, 2021

I admit the evidence is not from real-world programs, but just from benchmarks like Havlak that do focus on mallocing many small things.

As far as I can tell the bug reports relate to max_align_t not being honored by malloc. They don't relate to SIMD per se, just our spec non-conformance breaking folks' assumptions.

I do wish I understood the issue here better though. The only things we misalign are float128 and SIMD, so it seems like every place a user notices this has to be one of those? And, how do they notice this - false positives in UBSan? (I think I remember such a report) Or runtime errors on misaligned SIMD loads (which trap?) If it's hard to collect that data then I wouldn't insist on it, but I thought maybe you already knew the answer.

@sbc100
Collaborator Author

sbc100 commented Jun 16, 2021

I admit the evidence is not from real-world programs, but just from benchmarks like Havlak that do focus on mallocing many small things.

As far as I can tell the bug reports relate to max_align_t not being honored by malloc. They don't relate to SIMD per se, just our spec non-conformance breaking folks' assumptions.

I do wish I understood the issue here better though. The only things we misalign are float128 and SIMD, so it seems like every place a user notices this has to be one of those? And, how do they notice this - false positives in UBSan? (I think I remember such a report) Or runtime errors on misaligned SIMD loads (which trap?) If it's hard to collect that data then I wouldn't insist on it, but I thought maybe you already knew the answer.

IIUC it's not about hardware or trapping, it's just about the assumptions and assertions made in higher-level user code. See #10072 (comment) for example.

@kripken
Member

kripken commented Jun 16, 2021

If it's just for asserting on max_align_t as mentioned there, then I agree with @Akaricchi in that comment, we should instead lower max_align_t to 8. Is that defined in a system header, or would that require an ABI change?

Note that even if benchmarks like Havlak show a large effect, there will be smaller effects in less extreme cases, possibly even real-world ones. It's overhead for no good reason, if the only reason is float128. I think this would be more than good enough reason to reopen the issue of float128, if it comes to that (we only compromised on float128 because we found technical solutions to its overhead in printf, and we thought that with that work we avoided overhead for our users).

@sbc100
Collaborator Author

sbc100 commented Jun 16, 2021

If it's just for asserting on max_align_t as mentioned there, then I agree with @Akaricchi in that comment, we should instead lower max_align_t to 8. Is that defined in a system header, or would that require an ABI change?

It looks like max_align_t is sometimes defined by musl and sometimes by the compiler:

#if __STDC_VERSION__ >= 201112L || __cplusplus >= 201103L
#define __NEED_max_align_t
#endif

#if defined(__NEED_max_align_t) && !defined(__DEFINED_max_align_t)
typedef struct { long long __ll; long double __ld; } max_align_t;
#define __DEFINED_max_align_t
#endif

I will look into the llvm-side change now.

Note that even if benchmarks like Havlak show a large effect, there will be smaller effects in less extreme cases, possibly even real-world ones. It's overhead for no good reason, if the only reason is float128. I think this would be more than good enough reason to reopen the issue of float128, if it comes to that (we only compromised on float128 because we found technical solutions to its overhead in printf, and we thought that with that work we avoided overhead for our users).

@sbc100
Collaborator Author

sbc100 commented Jun 16, 2021

If it's just for asserting on max_align_t as mentioned there, then I agree with @Akaricchi in that comment, we should instead lower max_align_t to 8. Is that defined in a system header, or would that require an ABI change?

It looks like max_align_t is sometimes defined by musl and sometimes by the compiler:

#if __STDC_VERSION__ >= 201112L || __cplusplus >= 201103L
#define __NEED_max_align_t
#endif

#if defined(__NEED_max_align_t) && !defined(__DEFINED_max_align_t)
typedef struct { long long __ll; long double __ld; } max_align_t;
#define __DEFINED_max_align_t
#endif

Oops I didn't read that correctly, it looks like it is only defined for certain versions of C/C++ and always defined by musl. So we do have more control here than I thought. There may also be some builtin defines in clang that need updating though.

@Akaricchi
Contributor

Although you could hack max_align_t in that way, you probably shouldn't, as I believe you'd still be violating the spec albeit in a more subtle/pedantic way. max_align_t is supposed to be "suitably aligned for any object" (read: any standard C data type, including long double; this does not include e.g. non-standard SIMD types and other overaligned objects, mind you). Therefore if alignof(max_align_t) < alignof(long double), then your definition of max_align_t is still non-conforming, even though it does match the alignment of malloc (that just means your malloc is non-conforming, too).
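
The conformance condition described here can be written down as compile-time checks. This is only a sketch of the argument, not code from Emscripten or musl; the function name `max_align_covers_long_double` is invented for illustration.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>

/* max_align_t has to be at least as strictly aligned as every standard
   scalar type; otherwise its definition is non-conforming even when it
   happens to match what malloc returns. */
static_assert(alignof(max_align_t) >= alignof(long double),
              "max_align_t is weaker than long double");
static_assert(alignof(max_align_t) >= alignof(long long),
              "max_align_t is weaker than long long");
static_assert(alignof(max_align_t) >= alignof(double),
              "max_align_t is weaker than double");
static_assert(alignof(max_align_t) >= alignof(void *),
              "max_align_t is weaker than a pointer");

/* Runtime form of the first check, for use in a test suite. */
static int max_align_covers_long_double(void) {
  return alignof(max_align_t) >= alignof(long double);
}
```

Lowering only the `max_align_t` typedef to 8 while `long double` stays 16-byte aligned would make the first assertion fail, which is the subtle violation pointed out above.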

I still think the best way to go is to fix this on the LLVM side by reducing long double alignment requirements, but also make Emscripten's malloc explicitly use max_align_t for robustness, independently of that change.

@Akaricchi
Contributor

Akaricchi commented Jun 16, 2021

we only compromised on float128 because we found technical solutions to its overhead in printf, and we thought that with that work we avoided overhead for our users

Does that solution have anything to do with alignment of float128/long double, though? I'm pretty sure you can keep the 16-byte size and just reduce the alignment requirement (llvm-side) with no fallout whatsoever.

@sbc100
Collaborator Author

sbc100 commented Jun 16, 2021

we only compromised on float128 because we found technical solutions to its overhead in printf, and we thought that with that work we avoided overhead for our users

Does that solution have anything to do with alignment of float128/long double, though? I'm pretty sure you can keep the 16-byte size and just reduce the alignment requirement (llvm-side) with no fallout whatsoever.

The issue Alon is referring to is the age-old discussion/debate/fight we had about if/why long double should be 128 bits on WebAssembly at all. We went back and forth for a long time on it and eventually ended up agreeing to keep it at 128 rather than reduce it to 64 because we found a (kind of nasty) way to work around the printf-bloat issue that it caused. Now that we have found another issue that would not exist if we had just made it 64, we could reconsider once again whether it was a wise move.

@Akaricchi
Contributor

The issue Alon is referring to is the age-old discussion/debate/fight we had about if/why long double should be 128 bits on WebAssembly at all. We went back and forth for a long time on it and eventually ended up agreeing to keep it at 128 rather than reduce it to 64 because we found a (kind of nasty) way to work around the printf-bloat issue that it caused. Now that we have found another issue that would not exist if we had just made it 64, we could reconsider once again whether it was a wise move.

Yeah, but that's just about the size, not alignment. I've said it multiple times in the previous thread(s), you can keep the 128 bits size while reducing just the alignment requirement. That's still an ABI change but a somewhat less significant one, and shouldn't break anyone who actually wants super high precision float math for some reason.

@sbc100
Collaborator Author

sbc100 commented Jun 16, 2021

The issue Alon is referring to is the age-old discussion/debate/fight we had about if/why long double should be 128 bits on WebAssembly at all. We went back and forth for a long time on it and eventually ended up agreeing to keep it at 128 rather than reduce it to 64 because we found a (kind of nasty) way to work around the printf-bloat issue that it caused. Now that we have found another issue that would not exist if we had just made it 64, we could reconsider once again whether it was a wise move.

Yeah, but that's just about the size, not alignment. I've said it multiple times in the previous thread(s), you can keep the 128 bits size while reducing just the alignment requirement. That's still an ABI change but a somewhat less significant one, and shouldn't break anyone who actually wants super high precision float math for some reason.

I think we understand that. IIUC the point is that reducing the size of long double to 64bit would fix both of these issues, and potentially more that we have not yet hit.

@kripken
Member

kripken commented Jun 16, 2021

Ok, sounds like changing max_align_t to 8 would move us from one form of incorrectness (the current issue where malloc can emit things of lower alignment than max_align_t) to another (where the alignment of long doubles would be larger than max_align_t). That still feels like a good intermediate step, as the former problem seems worse. But maybe it would be safer to wait on this for a decision on the alignment of long double.

I think we should change the alignment (but not the size, as mentioned) of long double to 8. There is no benefit to a higher alignment for it. I hope we can do that across the entire wasm ecosystem for consistency. Does that sound good to work towards?

@dschuff
Member

dschuff commented Jun 17, 2021

I like the idea of reducing the alignment of long double (since it would allow proper use of max_align_t without any regression), but I wish we had a better sense of the impact of that ABI breakage. It could still change the layout or alignment of structures or unions containing long double elements.
One possible way to mitigate that might be to have a compiler flag opt-in to the old behavior (e.g. -malign-longdouble analogous to the existing -malign-double) which would set the alignment of long double to 16. If we can audit libc/libcxx and be sure that no public data structures or code there are dependent on that alignment (even indirectly via max_align_t) then we could in theory even avoid compiling a separate version of those libraries for that ABI (I'm not sure how likely that is; another option would be just to allow users to pass arbitrary flags to the library builds and let them build their own system libs). I guess it also helps that we don't have good dynamic loading support or a stable C++ ABI yet so maybe there aren't too many users relying on ABI stability for 3rd-party libraries yet. Even if we try to provide such an escape hatch, it would be good if we could offer some more clear advice on when to use it.
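
To make the layout concern concrete, the following sketch computes where a `long double` member lands for a given alignment. `struct sample` and the helper names are hypothetical, and the arithmetic is just the usual padding rule, not anything specific to the LLVM change.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>

/* Hypothetical struct whose layout depends on alignof(long double). */
struct sample {
  char tag;
  long double value;
};

/* Rounds n up to the next multiple of `alignment` (a power of two). */
static size_t round_up(size_t n, size_t alignment) {
  assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
  return (n + alignment - 1) & ~(alignment - 1);
}

/* Offset of `value` if long double required `long_double_align` bytes. */
static size_t expected_value_offset(size_t long_double_align) {
  return round_up(sizeof(char), long_double_align);
}

/* Total size of struct sample under the same assumption. */
static size_t expected_sample_size(size_t long_double_align) {
  return round_up(expected_value_offset(long_double_align) + sizeof(long double),
                  long_double_align);
}
```

Assuming a 16-byte `long double`, dropping its alignment from 16 to 8 moves `value` from offset 16 to offset 8 and shrinks the struct from 32 to 24 bytes, so objects compiled under the two ABIs would disagree about the layout.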

That still leaves the issue that slots aligned to max_align_t are no longer suitable for SIMD (which is not a standards conformance issue but is a potential practical issue). I guess that boils down to how often SIMD users rely on max_align_t rather than some other mechanism to ensure their alignment needs.

@Akaricchi
Contributor

Akaricchi commented Jun 18, 2021

I guess that boils down to how often SIMD users rely on max_align_t rather than some other mechanism to ensure their alignment needs.

Nobody should be doing that. aligned_alloc(alignof(max_align_t), 42069) is just a funny way of spelling malloc(42069) in a conformant environment; this is essentially equivalent to not caring about over-aligning anything at all. If such code even exists, it's just super broken, and can easily cause problems on less "exotic" platforms than emscripten. If someone is relying on this instead of actually spelling out the specific alignment they require, I think they deserve the consequences.
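
As an illustration of "spelling out the specific alignment", a SIMD user could request 16 bytes explicitly instead of leaning on `max_align_t`. This is a sketch only; `alloc_simd128` and `SIMD128_ALIGNMENT` are names invented here, and the 16-byte figure assumes 128-bit vectors.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Assumed natural alignment of a 128-bit SIMD vector. */
#define SIMD128_ALIGNMENT 16

/* Allocates at least `size` bytes aligned to SIMD128_ALIGNMENT.
   The size is rounded up to a multiple of the alignment because C11
   aligned_alloc requires that. Release the result with free(). */
static void *alloc_simd128(size_t size) {
  size_t rounded =
      (size + SIMD128_ALIGNMENT - 1) & ~(size_t)(SIMD128_ALIGNMENT - 1);
  if (rounded < size) {
    return NULL; /* size + padding overflowed */
  }
  if (rounded == 0) {
    rounded = SIMD128_ALIGNMENT; /* avoid a zero-sized request */
  }
  return aligned_alloc(SIMD128_ALIGNMENT, rounded);
}
```

Code written this way keeps working whether `alignof(max_align_t)` is 8 or 16.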

@Akaricchi
Contributor

I like the idea of reducing the alignment of long double (since it would allow proper use of max_align_t without any regression), but I wish we had a better sense of the impact of that ABI breakage. It could still change the layout or alignment of structures or unions containing long double elements.
One possible way to mitigate that might be to have a compiler flag opt-in to the old behavior (e.g. -malign-longdouble analogous to the existing -malign-double) which would set the alignment of long double to 16. If we can audit libc/libcxx and be sure that no public data structures or code there are dependent on that alignment (even indirectly via max_align_t) then we could in theory even avoid compiling a separate version of those libraries for that ABI (I'm not sure how likely that is; another option would be just to allow users to pass arbitrary flags to the library builds and let them build their own system libs). I guess it also helps that we don't have good dynamic loading support or a stable C++ ABI yet so maybe there aren't too many users relying on ABI stability for 3rd-party libraries yet. Even if we try to provide such an escape hatch, it would be good if we could offer some more clear advice on when to use it.

Please, let's drop the pretense of stability. You guys break my Taisei builds literally every release, either with your inability to commit to a stable CLI, or random runtime regressions, or JS API breaks, or a combination of the three. A tiny ABI change is literally nothing in comparison. I would never expect linking code compiled with different versions of emscripten to actually work correctly, if at all. It wouldn't be a good practice even if you did care about stability, anyway.

@kripken
Member

kripken commented Jun 18, 2021

@Akaricchi

Personally, it makes my day a little bit worse to read a comment like yours. I think the comment is not productive. I'm sorry to hear that we've broken your project that often, and I'd like to look into ways to avoid that in the future - in particular, perhaps there are more automatic tests that we need. But it's not helpful or fair to say that we "do not care about stability" - that's inaccurate and unnecessarily harsh.

It is true that we sometimes break backwards compatibility intentionally, but we try to do that only when strongly justified, and to mention it in the changelog. (Maybe we've gotten that wrong sometimes, of course.)

In addition, the comment you are responding to did not claim we had a stable ABI: "I guess it also helps that we don't have good dynamic loading support or a stable C++ ABI yet". So your comment is not a direct contradiction of it.

Again, I'm not dismissing your concerns. Let's find ways to improve stability - it may be worth filing issues for specific things. But we should do so in a way that's pleasant for everyone here.

Returning to the topic, I don't think there is disagreement between you, me, and that comment. We do not have a stable C or C++ ABI, and so an ABI change here is possible for us to do. But, we should do so only if it's justified, which I think in this case it might be for the reasons discussed earlier.

@Akaricchi
Contributor

We do not have a stable C or C++ ABI, and so an ABI change here is possible for us to do

Exactly — we do not have a stable ABI, so why does it take us over a year to agree on a barely significant, clearly beneficial ABI change? Why are we even talking about stability and performance when we don't have correctness in this instance? I would shut up if you approached the arguably more important stability concerns (such as CLI stability) with half as much deliberation, but it is quite frustrating to engage in this nebulous "ABI breakage" discussion when in reality, whenever I unwisely decide to update emscripten for my project, I end up having to do at least one of:

  • Wrestle with the rat's nest of ever-changing Emscripten-specific flags to get the thing to even build
  • Figure out the mystery regression of the week
  • Write a JS shim for some removed API that SDL uses
  • Fix the obligatory obscure Closure minification bug and wonder why do I even keep that garbage enabled

Yes, it is harsh, and perhaps unwarranted, but I hope you can see how it's possible to arrive at the "you don't care about stability" conclusion from my perspective.

Anyway, rant over.

The way I see it, this PR can and should land ASAP. It's not blocked on the long double ABI change. This would finally give us correct max_align_t/malloc behavior. The new alignment will be slightly suboptimal, but this is fine, because correctness trumps performance. When/if the ABI change happens, we'll automatically get the 8-byte alignment back. Personally, I don't think it's worth providing the dlmalloc-align8 and emmalloc-align8 options in this scenario, since the ABI change would obsolete them soon.

@kripken
Member

kripken commented Jun 21, 2021

I agree this PR should land. My only suggestion at this point is that, since I think we all agree that it is ok to change the ABI to reduce the alignment of float128, that if we do that first then we would avoid raising the malloc alignment and then reducing it. Raising and reducing it could be confusing for users, and might be annoying in bisections as well. So if we can lower the float128 alignment quickly, we can finish all of this properly.

@sbc100 are there concerns on the LLVM side about changing the float128 alignment down?

@sbc100
Collaborator Author

sbc100 commented Jun 23, 2021

Ok, let's see how the llvm-side change goes: https://reviews.llvm.org/D104808

@jaykrell

Fyi, obscure, but Windows heap does this:
winnt.h:

#if defined(_WIN64) || defined(_M_ALPHA)
#define MEMORY_ALLOCATION_ALIGNMENT 16
#else
#define MEMORY_ALLOCATION_ALIGNMENT 8
#endif

so 16 is a good number imho, given the prevalence of Win64 (I realize wasm is 32bit for now).
It might aid compatibility in obscure cases (using the lower bits of malloc result).
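
The "lower bits" case usually means pointer tagging. A rough sketch, with `TAG_BITS` and the function names made up for illustration: with 8-byte alignment the low 3 bits of a `malloc` result are always zero and can carry a small tag, while 16-byte alignment would free up a fourth bit.

```c
#include <assert.h>
#include <stdint.h>

/* Number of low pointer bits assumed to be zero (3 bits = 8-byte alignment). */
#define TAG_BITS 3
#define TAG_MASK ((uintptr_t)((1u << TAG_BITS) - 1))

/* Packs a small tag into the unused low bits of an aligned pointer. */
static uintptr_t tag_pointer(void *p, unsigned tag) {
  uintptr_t bits = (uintptr_t)p;
  assert((bits & TAG_MASK) == 0); /* pointer must be aligned enough */
  assert(tag <= TAG_MASK);        /* tag must fit in the free bits */
  return bits | tag;
}

/* Recovers the original pointer. */
static void *untag_pointer(uintptr_t tagged) {
  return (void *)(tagged & ~TAG_MASK);
}

/* Recovers the tag. */
static unsigned pointer_tag(uintptr_t tagged) {
  return (unsigned)(tagged & TAG_MASK);
}
```

Code that assumes 4 free bits because it was written for a 16-byte-aligned heap would trip the first assertion on an 8-byte-aligned allocator.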

#ifndef MALLOC_ALIGNMENT
#include <stddef.h>
/* `malloc`ed pointers must be aligned at least as strictly as max_align_t. */
#define MALLOC_ALIGNMENT (__alignof__(max_align_t))

Static assert that it is 16?

sbc100 added a commit to llvm/llvm-project that referenced this pull request Jul 2, 2021
This means `max_align_t` is 8 bytes which also sets the alignment of
malloc.  Since this is technically an ABI breaking change we have
limited it to just the emscripten OS target.  It is also relatively low
impact breakage since it will only affect the alignment of structs that
contain `long double`s (extremely rare I imagine).

Emscripten's malloc implementation already uses 8 byte alignment
(dlmalloc uses an alignment of 2*sizeof(void*) == 8 rather than
checking max_align_t) so will not be affected by this change.  By
bringing the ABI in line with the current malloc code this will fix
several issues we have seen in the wild.

See: emscripten-core/emscripten#14456

Differential Revision: https://reviews.llvm.org/D104808
@sbc100
Collaborator Author

sbc100 commented Jul 12, 2021

Fixed instead on the llvm side. See #14634

@sbc100 sbc100 closed this Jul 12, 2021
@sbc100 sbc100 reopened this Jul 12, 2021
@sbc100 sbc100 closed this Jul 12, 2021
@sbc100 sbc100 deleted the fix_malloc_align branch July 12, 2021 20:40
arichardson pushed a commit to arichardson/llvm-project that referenced this pull request Sep 13, 2021
mem-frob pushed a commit to draperlaboratory/hope-llvm-project that referenced this pull request Oct 7, 2022
vgvassilev pushed a commit to vgvassilev/clang that referenced this pull request Dec 28, 2022

llvm-monorepo: d1a96e906cc03a95cfd41a1f22bdda92651250c7
@kg

kg commented Apr 2, 2024

Is there current guidance on what to do if I need 16-byte alignment? Right now I allocate with extra padding and manually align my pointer inside of the allocation, but it's a little awkward, and more importantly that makes it impossible to use realloc. I was going to ask "is there some way to make emscripten's dlmalloc use 16-byte alignment" but from reading over this and some of the related issues, it sounds like dlmalloc is tuned around 8-byte alignment, and there are concerns that changing it would negatively impact software.

My motivation for this is 128-bit SIMD, for which natural alignment is 16 bytes. My understanding is that SIMD is part of why other ABIs have standardized on 16-byte alignment for stack frames, i.e. Mac OS X and (iirc) Windows x64. (Incidentally, I'm not sure how I would tell what the alignment of emscripten's stack frames is, and how that interacts with JS code invoking stackAlloc. That could be a wrinkle here.)

Maybe the guidance should just be 'use posix_memalign since emscripten's dlmalloc supports that', and people have to give up on realloc?

It looks like the current guidance in the official docs re:alignment is from the asm.js days, and talks about not having support for x86 unaligned memory ops.

@sbc100
Collaborator Author

sbc100 commented Apr 2, 2024

If you want allocations aligned higher than alignof(max_align_t) then you can either use memalign or posix_memalign (I think either one should work).
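
A sketch of what that can look like, including one way to keep a `realloc`-like operation (which the question above worried about losing). `aligned_malloc` and `aligned_realloc` are hypothetical wrappers, not Emscripten APIs; only `posix_memalign`, `memcpy` and `free` are real library calls. The caller has to track the old size because there is no portable way to query it.

```c
/* Ask for the POSIX declarations even under a strict -std=c11 build. */
#define _POSIX_C_SOURCE 200112L

#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Allocates `size` bytes aligned to `alignment` (a power of two that is a
   multiple of sizeof(void *)). Returns NULL on failure. */
static void *aligned_malloc(size_t alignment, size_t size) {
  void *p = NULL;
  if (posix_memalign(&p, alignment, size) != 0) {
    return NULL;
  }
  return p;
}

/* realloc substitute that preserves alignment: allocate, copy, free.
   On failure the old block is left untouched, as with realloc. */
static void *aligned_realloc(void *old, size_t old_size, size_t new_size,
                             size_t alignment) {
  void *fresh = aligned_malloc(alignment, new_size);
  if (fresh == NULL) {
    return NULL;
  }
  if (old != NULL) {
    memcpy(fresh, old, old_size < new_size ? old_size : new_size);
    free(old);
  }
  return fresh;
}
```

The trade-off is that this always copies, whereas a real `realloc` can sometimes grow a block in place.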

If you want to change the default you would need to modify the allocator (either emmalloc.c, dlmalloc.c or mimalloc). I imagine it's just a case of changing a single line in each one, e.g.:

// Configuration: specifies the minimum alignment that malloc()ed memory outputs. Allocation requests with smaller alignment
// than this will yield an allocation with this much alignment.
#define MALLOC_ALIGNMENT alignof(max_align_t)
static_assert(alignof(max_align_t) == 8, "max_align_t must be correct");

I don't think we will be changing the default upstream since using 8-byte alignment has been shown to have an impact on some benchmarks.

For stack alignment we do use 16 bytes, as does llvm, so that should be fine.

yamt added a commit to yamt/tool-conventions that referenced this pull request Apr 12, 2024
dschuff pushed a commit to WebAssembly/tool-conventions that referenced this pull request Apr 12, 2024
Development

Successfully merging this pull request may close these issues.

malloc results are not aligned to alignof(max_align_t)