-
-
Notifications
You must be signed in to change notification settings - Fork 262
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Cross-module inlining: Enable emission into multiple object files #3442
Conversation
fd->ir->setDefined(); | ||
return; | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I haven't been aware of this additional 'culling'.
Duplicate function literals without culling, now missing ones with culling... |
c071d88
to
a8cf528
Compare
tests/codegen/inlining_gh3126.d
Outdated
// files. | ||
|
||
// Generate unoptimized IR for 2 source files as separate compilation units. | ||
// RUN: %ldc -c -output-ll %s %S/inputs/inlinables.d -od=%t |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe add testing of the reversed order of the files too? (IIRC we codegen in reverse order to also codegen stuff that has been sema3'ed while doing codegen).
gen/functions.cpp
Outdated
// some non-inlined call. | ||
// Prevent undefined symbol errors by defining literals in | ||
// available_externally parents as regular non-available_externally | ||
// functions (with linkonce_odr linkage). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If a function is available_externally
then why is a literal referenced by that externally available function not available from that external object file either? Is it because we define literals as private symbols of that module?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure I understand this either. Emitting any symbols as non-avilable_externally risks accidentally duplicating symbols with slightly different mangles, etc. in case of frontend bugs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The problematic thing for the Phobos unittests was https://github.com/ldc-developers/phobos/blob/3371f469af84adf5b4cff8b74066b60b0460454d/std/experimental/logger/core.d#L327-L340. The delegate literals in there are for each lazy arg; they are apparently treated as literals inside that function, while they are referenced from the call sites (to populate the lazy arg delegates, putting the function pointer into those).
My 1st approach was to make sure all literals (in available_externally
parents) are emitted as available_externally too and to apply pragma(inline, true)
to those as well. Didn't work, and looking at the .ll revealed the indirect usage via function pointer for the delegates.
As to why the log
template parent is treated as available_externally
, I'm not sure. I tried returning false for each function inside some template instance where needsCodegen()
returns false, but that didn't help (something like if (auto ti = DtoIsTemplateInstance(fdecl)) { if (!ti.needsCodegen()) /* not available_externally */ }
).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Isn't this the old problem of __FILE__
as default template argument? Depending on compilation command line, __FILE__
will be different and thus the mangling will be different. Then, what can happen is:
- phobos source somewhere uses the
log
template with specific template args A1 in functionf
-->log(A1)()
is emitted into libphobos - in a separate compilation, the user's code is compiled that calls
log
indirectly through functionf
in phobos (user code -->f
-->log
). - Now what the compiler sees is that
f
should already be codegenned inlibphobos
, thusf
-->log
does not need codegen. All is good. - With cross module inlining enabled,
f
is codegenned as an inlining candidate, andlog
aswell, both marked asavailable_externally
. - But now the
__FILE__
bites us: in step 1, the commandline was such that__FILE__
ended up as relative path, but in step 4. (with the user's code and commandline) it ends up as absolute path, or some different relative path. So actually the call islog(A2)()
and the call will be to that symbol (mangling includes A2 instead of A1): the compiler sees that that call should already have been codegenned (hence available_externally), but because__FILE__
was different, that is not true. ---> Missing symbols during linking when LLVM decides to inlinef
but not to inline theavailable_externally log
(or literals in it, in this case).
Ideas for solution:
- When crossmodule codegenning, always_inline when
__FILE__
is a template parameter. - Never crossmodule codegen and never apply
externally_available
when__FILE__
is template parameter - Make
__FILE__
much more predictable. For example by forcing it toone/two/three.d
whenmodule one.two.three;
is at the top of the file (and nottwo/three.d
orzero/one/two/three.d
when that would also work with the set of-I=...
cmdline options passed).
Ideas 1 and 2 could not be implemented at the time because the D frontend discards that information about string template arguments. This is very bad anyway, and needs to be fixed in the frontend.
Idea 3, I didn't think of at the time. Don't know how many corner cases there will be.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(so we went we option 1, with explicit pragma(inline,true)
needed everywhere where __FILE__
is used :(. Clearly, that cannot be done with function literals. Static local variables will probably have the same problem.)
Some printf debugging showed the actual issue, see last commit. The test now covers both orders, includes a little I'm sure there are still more than enough issues with |
Previously, the logic gave up early if semantic3 was run for the function to be inlined, either because the function is going to be codegen'd in some root module, or because sema3 was already run for an available_externally 'copy' in a previous compilation unit. This restricted a function to being defined in at most one compilation unit per ldc2 invocation. So e.g. a little `pragma(inline, true)` wrapper in druntime wasn't inlined into other druntime modules, because the whole lib is built in a single cmdline by default, and sema3 was obviously run for the actual emission in the corresponding object file. When later compiling Phobos in a single cmdline, only the first object file referencing the wrapper got lucky, running sema3 manually and getting an available_externally 'copy'.
…ernally This is exactly what happened for the 4 problematic delegate literals for the Phobos unittests - DtoDefineFunction(fd) could result in that definition ending up as available_externally - all done as part of DtoResolveFunction(), which declared it, and DtoDeclareFunction() defined it as available_externally, and the outer DtoDefineFunction() returned early in that case without fixing up the linkage.
Previously, the logic gave up early if semantic3 was run for the function to be inlined, either because the function is going to be codegen'd in some root module, or because sema3 was already run for an
available_externally
'copy' in a previous compilation unit. This restricted a function to being defined in at most one compilation unit perldc2
invocation.So e.g. a little
pragma(inline, true)
wrapper in druntime wasn't inlined into other druntime modules, because the whole lib is built in a single cmdline by default, and sema3 was obviously run for the actual emission in the corresponding object file.When later compiling Phobos in a single cmdline, only the first object file referencing the wrapper got lucky, running sema3 manually and getting an
available_externally
'copy'.