Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[mono] Don't generate safepoints inside write barrier wrapper #58346

Merged
merged 3 commits into from
Aug 31, 2021

Conversation

BrzVlad
Copy link
Member

@BrzVlad BrzVlad commented Aug 30, 2021

A ref store and its associated write barrier need to happen atomically relative to garbage collections. No GCs should happen between the ref store and the card mark.

There is no need for a m2n transition since the method just marks the card table. No exception can be thrown, no marshalling is required, no gc can happen during its execution etc. In addition to the redundancy of the wrapper, it also contains gc safepoints which is wrong. Having a GC between the ref store and the corresponding card table marking can in theory crash the GC.
This wrapper is called immediately after a store to a ref field. The ref store and the card table marking should be atomic in relation to GCs. Therefore we shouldn't have a safepoint in the wrapper.
@BrzVlad
Copy link
Member Author

BrzVlad commented Aug 30, 2021

@imhameed @EgorBo I think I might be missing a place where safe points are added. I think there is some logic in llvm itself adding safepoints. Any observations ?

@lambdageek
Copy link
Member

lambdageek commented Aug 30, 2021

Do we have a list of LLVM passes that we use somewhere? It should be the https://llvm.org/docs/Statepoints.html#placesafepoints PlaceSafepoints pass that adds calls to gc.safepoint_poll.

I think the decision of which functions should get safepoints is controlled by PlaceSafepoints::shouldRewriteFunction https://github.com/dotnet/llvm-project/blob/release/11.x/llvm/lib/Transforms/Scalar/PlaceSafepoints.cpp#L454-L456

I guess the way to mark a Function not to do a safepoint is to make sure that Function::getGC returns something other than "mono" (or "coreclr" or "safepoint-example"), right? Do we have that exposed to the aot-compiler?

@imhameed

@imhameed
Copy link
Contributor

@imhameed @EgorBo I think I might be missing a place where safe points are added. I think there is some logic in llvm itself adding safepoints. Any observations ?

Calls to gc.safepoint_poll are automatically inserted by the PlaceSafepoints pass, which we always use with LLVM as an AOT backend and LLVM as a JIT backend.

@imhameed
Copy link
Contributor

imhameed commented Aug 30, 2021

Do we have a list of LLVM passes that we use somewhere? It should be the https://llvm.org/docs/Statepoints.html#placesafepoints PlaceSafepoints pass that adds calls to gc.safepoint_poll.

We don't have a centralized list of all the LLVM passes we use, and the passes we use differ between LLVM AOT and LLVM as a JIT. For AOT, the flags we pass to opt and llc are here. For JIT, the passes we use are listed here and here.

I think the decision of which functions should get safepoints is controlled by PlaceSafepoints::shouldRewriteFunction https://github.com/dotnet/llvm-project/blob/release/11.x/llvm/lib/Transforms/Scalar/PlaceSafepoints.cpp#L454-L456

I guess the way to mark a Function not to do a safepoint is to make sure that Function::getGC returns something other than "mono" (or "coreclr" or "safepoint-example"), right? Do we have that exposed to the aot-compiler?

You could call LLVMSetGC(method, NULL) to make PlaceSafepoints::runOnFunction leave your function alone (or simply not call LLVMSetGC at all).

@lambdageek
Copy link
Member

Do we have a list of LLVM passes that we use somewhere? It should be the https://llvm.org/docs/Statepoints.html#placesafepoints PlaceSafepoints pass that adds calls to gc.safepoint_poll.

We don't have a centralized list of all the LLVM passes we use, and the passes we use differ between LLVM AOT and LLVM as a JIT. For AOT, the flags we pass to opt and llc are here. For JIT, the passes we use are listed here and here.

I think the decision of which functions should get safepoints is controlled by PlaceSafepoints::shouldRewriteFunction https://github.com/dotnet/llvm-project/blob/release/11.x/llvm/lib/Transforms/Scalar/PlaceSafepoints.cpp#L454-L456
I guess the way to mark a Function not to do a safepoint is to make sure that Function::getGC returns something other than "mono" (or "coreclr" or "safepoint-example"), right? Do we have that exposed to the aot-compiler?

You could call LLVMSetGC(method, NULL) to make PlaceSafepoints::runOnFunction leave your function alone (or simply not call LLVMSetGC at all).

@BrzVlad So add the barrier wrapper here:

if (!cfg->llvm_only && mono_threads_are_safepoints_enabled () && requires_safepoint) {
if (!cfg->compile_aot && cfg->method->wrapper_type != MONO_WRAPPER_ALLOC) {
LLVMSetGC (method, "coreclr");
emit_gc_safepoint_poll (ctx->module, ctx->lmodule, cfg);
} else if (cfg->compile_aot) {
LLVMSetGC (method, "coreclr");
}

@imhameed Why do we bother with a "mono" GC type when the runtime actually uses "coreclr" 🤪

@BrzVlad
Copy link
Member Author

BrzVlad commented Aug 30, 2021

Refactored the code a little bit so we disable safepoints in GC wrappers both on aot and !aot.

@lambdageek
Copy link
Member

@BrzVlad do we want this for 6.0? How risky is it?

@BrzVlad
Copy link
Member Author

BrzVlad commented Aug 30, 2021

I think the risk is low but not negligible.

I don't know if this issue ever lead to a crash, so I don't think it is a mandatory backport.

This change might have some noticeable perf gains on AOT since we avoid wrapper + safepoints for stores into ref fields, so that could be a motivating factor. Maybe if we see some crazy perf gains in the benchmark runs that could justify it.

@BrzVlad BrzVlad merged commit b721f6d into dotnet:main Aug 31, 2021
@ghost ghost locked as resolved and limited conversation to collaborators Sep 30, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants