[WIP] Always use bulk memory at compile time #22873

Draft
wants to merge 19 commits into
base: main

Conversation

dschuff
Member

@dschuff dschuff commented Nov 6, 2024

  • Remove libbulkmemory _emscripten_memcpy_js and fold memcpy and memset into libc
    • Use bulk memcpy/memset for Oz builds, but keep ASan behavior the same.
  • Move the zero-length check in memcpy from C into assembly, and add one for memset
  • Remove the use of -mno-bulk-memory at compile time (enabling it in object files)
  • Temporarily set the Safari version required to use bulk memory to 14.1 (which has the effect of enabling it by default without enabling the other 14.1 features by default). This will be reverted when nontrapping-fptoint and bigint are also enabled by default.

Collaborator

@sbc100 sbc100 left a comment


Nice! So awesome to see this land!

system/lib/libc/emscripten_memcpy.c (outdated, resolved)
system/lib/libc/emscripten_memcpy_bulkmem.S (resolved)
system/lib/libc/emscripten_memset.c (outdated, resolved)
Member Author

dschuff commented Nov 7, 2024

@sbc100 @tlively @aheejin @kripken for opinions:
The lowering pass currently refuses to run on a module that has atomics enabled, because we shouldn't be lowering away copy/fill if atomics are being used since we need passive segments anyway. The current logic runs the lowering pass anytime the link command line doesn't ask for a new enough browser version. That combination causes any test that links a library built with pthreads to fail. For example, libfreetype is always built with -pthread (I'm not sure why), so if you link a non-pthreads binary with libfreetype, the lowering pass will error out.

Previous to this PR I think the result of linking any atomics-using library into a non-atomic output is that it will silently create a binary that uses atomics even though it wasn't asked for. Is that the behavior we want? If so, I guess the easiest thing would be just to make the lowering pass not do anything if atomics are enabled. Another would be to make the behavior stricter so that you can't link atomics-using code into a non-atomics-using output (to avoid the surprise behavior), but that could possibly be annoying. A third option would be to make it explicit inside emcc and not run the lowering pass if atomics are enabled.

I like the idea of the second option, but it would require fixing the libraries that are compiled with threads (e.g. would we need a freetype-mt? Maybe not, I don't actually see any atomic instructions inside libfreetype). And it could break users who are doing this linking (the fix would be to explicitly enable atomics or threads at link time; would that have bad consequences? I guess they'd have more stuff linked into their binary?).

edit: I should clarify that the case that would fail here with the current logic is when the build targets Safari 14.1 (the current default, but would be done manually in the future). The default behavior going forward would actually be the same as it is now, except that bulk memory would be enabled (i.e. the lowering pass would not be run, and linking an atomics-using library would cause the resulting binary to have atomics).

More generally speaking, our feature-enabling code in emscripten is a little inconsistent. For example, with the current logic, enabling WASM_BIGINT will automatically cause bulk memory and nontrapping-fp to be enabled, because it implicitly causes Safari 15 to be selected, and then that selection determines whether the lowering passes run. I find that a little surprising, and there currently isn't a way to override it other than selecting a different browser version manually. Also, the -mbulk-memory et al. feature flags don't work at link time to select features at a fine grain; only the browser versions work. That seems kind of bad to me, but might also be a pain to fix, for not much benefit.
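To make the coupling concrete, here is a toy model of it. This is only a sketch, not emscripten's actual code: the names `MIN_SAFARI_FOR`, `LOWERABLE`, `resolve_min_safari` and `lowering_passes` are hypothetical, the 14.1 and 15 values for bulk memory and bigint are the ones mentioned in this thread, and the value for nontrapping-fptoint is an assumption made for illustration.

```python
# Toy model only: names and table are hypothetical, not emscripten's API.
MIN_SAFARI_FOR = {
    'bigint': (15, 0),               # per the comment above
    'bulk-memory': (14, 1),          # temporary value set by this PR
    'nontrapping-fptoint': (15, 0),  # assumed for illustration
}
# Features that a link-time lowering pass can remove for older targets.
LOWERABLE = ('bulk-memory', 'nontrapping-fptoint')


def resolve_min_safari(requested, enabled_settings):
    """Bump the target version so it satisfies every explicitly enabled setting."""
    version = requested
    for feature in enabled_settings:
        version = max(version, MIN_SAFARI_FOR[feature])
    return version


def lowering_passes(min_safari):
    """Return the lowerable features that the target version is too old for."""
    return [f for f in LOWERABLE if min_safari < MIN_SAFARI_FOR[f]]


old_target = (12, 0)
# Nothing enabled explicitly: both lowering passes run for the old target.
print(lowering_passes(resolve_min_safari(old_target, [])))
# Enabling bigint silently raises the target, so no lowering pass runs.
print(lowering_passes(resolve_min_safari(old_target, ['bigint'])))
```

In this model the only input that decides whether lowering runs is the resolved browser version, which is why a per-feature flag at link time has no effect.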

Collaborator

sbc100 commented Nov 7, 2024

The lowering pass currently refuses to run on a module that has atomics enabled, because we shouldn't be lowering away copy/fill if atomics are being used since we need passive segments anyway.

Why not just allow the pass to run in this case? The pass relates only to memory.copy and memory.fill, right? The passive segments and memory.init would be unaffected, no?

Collaborator

sbc100 commented Nov 7, 2024

Previous to this PR I think the result of linking any atomics-using library into a non-atomic output is that it will silently create a binary that uses atomics even though it wasn't asked for. Is that the behavior we want?

That's a good question. I can't remember the conclusion, but I do remember that we updated the spec so that browsers would allow atomic instructions even in single-threaded builds, so maybe this was why we did that. But if we want to support older browsers, we would still want to be able to lower those atomics away at link time, I think. How many such libraries do we have? It's certainly nice to be able to build libraries just once and avoid the -mt variants if we can.

Member Author

dschuff commented Nov 7, 2024

I can't remember the conclusion, but I do remember that we updated the spec so that browsers would allow atomic instructions even in single-threaded builds, so maybe this was why we did that. But if we want to support older browsers, we would still want to be able to lower those atomics away at link time, I think. How many such libraries do we have? It's certainly nice to be able to build libraries just once and avoid the -mt variants if we can.

Ah, right, I forgot about allowing atomics in single-threaded builds. It makes sense, then, that the default behavior for the default targets (which support bulk memory and atomics) would be to just allow atomics to pass through anytime. So for older browsers we currently do not support lowering away atomics at link time (only at compile time). Our current default target (Safari 14.1) actually does support atomics, but I guess we don't support targeting even older browsers while linking in atomics-using libraries. There aren't many emscripten builtin library variants built this way (freetype might be the only one, actually), so maybe this isn't a big problem. If we want to lean further in this direction, we could potentially even remove some '-mt' variants and just always link atomic versions.

Why not just allow the pass to run in this case? The pass relates only to memory.copy and memory.fill, right? The passive segments and memory.init would be unaffected, no?

We could do it, it would just be a pessimization for no reason; there are no targets AFAIK that support passive segments but not memory.copy/fill.

Member Author

dschuff commented Nov 7, 2024

After some discussion with @sbc100 we might do the following:

  1. Keep the atomics passthrough linking behavior as-is. The consequence will (continue to) be that users targeting browsers older than Safari 14.1 will fail to load their module if they link libraries with atomics. So far this hasn't been a problem.
  2. Make the lowering pass not do anything (or just run as normal) if atomics are enabled. This would make it harmless to run the pass when there are atomics (even if it's not useful).
  3. Make the lowering pass error out if there are any other uses of bulk memory, e.g. passive segments or table operations. The reason is that it would be incorrect to remove the bulk-memory feature if there are still uses of bulk memory (and it would definitely be a bug for someone to run this pass on a module with other bulk memory usage).
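
Points 2 and 3 can be sketched as a small decision function. This is an illustration under stated assumptions, not the real pass: `ModuleInfo` and `lower_bulk_memory` are hypothetical names, and the sketch picks the "do nothing" variant of point 2 rather than "run as normal".

```python
# Hypothetical sketch of the plan above; not the actual lowering pass.
from dataclasses import dataclass


@dataclass
class ModuleInfo:
    has_atomics: bool = False
    uses_copy_or_fill: bool = False
    uses_other_bulk_memory: bool = False  # passive segments, table operations


def lower_bulk_memory(module: ModuleInfo) -> str:
    # Point 2: with atomics enabled the pass is a harmless no-op, because
    # such modules need passive segments and keep bulk memory anyway.
    if module.has_atomics:
        return 'skipped'
    # Point 3: removing the feature while other bulk-memory uses remain
    # would produce an invalid module, so fail loudly instead.
    if module.uses_other_bulk_memory:
        raise ValueError('module has bulk memory uses other than copy/fill')
    return 'lowered' if module.uses_copy_or_fill else 'nothing to lower'


print(lower_bulk_memory(ModuleInfo(uses_copy_or_fill=True)))
print(lower_bulk_memory(ModuleInfo(has_atomics=True, uses_other_bulk_memory=True)))
```

The atomics check comes first so that a threaded module with passive segments is passed through rather than rejected.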

Member Author

dschuff commented Nov 8, 2024

OK, I think the bulk memory part of this patch is ready; the tests are working and the comments so far are addressed.

As written, this PR does not update the default features; it only builds with bulk memory and turns on the lowering. This means the lowering will run by default. This isn't a state we want to release with, but it means we can land the changes for nontrapping-fp and updating the default separately. The other option is to add to this PR to enable the nontrapping-fp lowering (once we have test_sse fixed), and then update the defaults.
I maybe lean toward landing separately, but don't have a strong opinion.

@@ -156,6 +163,8 @@ def apply_min_browser_versions():
   if settings.PTHREADS:
     enable_feature(Feature.THREADS, 'pthreads')
     enable_feature(Feature.BULK_MEMORY, 'pthreads')
   if settings.WASM_WORKERS or settings.SHARED_MEMORY:
Collaborator


Maybe else here? Otherwise the feature will be added twice in pthread mode (which also enables SHARED_MEMORY).

Member Author


Done, although I don't think it actually matters; it should be safe to enable it more than once.
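
A minimal sketch of why the two forms should be equivalent, assuming that enabling a feature is idempotent. Everything here is a hypothetical stand-in: the real `enable_feature` is not shown in this thread, and the body of the second branch is elided in the diff, so the sketch assumes it enables bulk memory.

```python
# Hypothetical stand-in; emscripten's real enable_feature may differ.
def apply_features(pthreads, shared_memory, use_else):
    enabled = {}

    def enable_feature(feature, reason):
        # Keep the first recorded reason; repeated calls change nothing.
        enabled.setdefault(feature, reason)

    if pthreads:
        enable_feature('threads', 'pthreads')
        enable_feature('bulk-memory', 'pthreads')
    # With use_else=True this branch is skipped in pthread mode, mirroring
    # the suggested "else"; with use_else=False it runs in addition.
    if shared_memory and not (use_else and pthreads):
        enable_feature('bulk-memory', 'shared-mem')  # assumed branch body
    return enabled


# pthreads implies shared memory, so both flags are set in pthread mode.
print(apply_features(True, True, use_else=False))
print(apply_features(True, True, use_else=True))
```

Under that assumption the feature set is the same either way; the else only avoids the redundant second call.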

Collaborator

@sbc100 sbc100 left a comment


lgtm!
