Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Dynamic linking & indirect calls #8268

Closed
kripken opened this issue Mar 8, 2019 · 11 comments · Fixed by #9393 or #9431
Closed

Dynamic linking & indirect calls #8268

kripken opened this issue Mar 8, 2019 · 11 comments · Fixed by #9393 or #9431

Comments

@kripken
Copy link
Member

kripken commented Mar 8, 2019

Currently in emscripten dynamic linking we share functions by exporting and importing them. This should be faster than indirect calls, except for the case of calling a function from a later-linked module (since we don't have anything to import yet, so we import a JS indirection helper thunk).

This has a problem, though - pointer identity is not preserved. The reason is that if module A exports a function foo, and module B imports it, and module B wants to call foo indirectly, then in the current model we have received the function itself. To make it indirectly callable, we create a little helper function and put that in the table. But it has a different index than the original (if the original was also in the table). So two function pointers may be different even though they refer to the same actual function.

Options:

  1. Do all dynamic linking cross-module calls using indirect calls. This would be simplest - when a module exports functions, it would just export the function pointer indexes, and not the actual functions. But the overhead of indirect calls is high (on the other hand this would avoid the JS indirection helpers.)
  2. Also export the function pointer indexes, in addition to the functions themselves. Then another module can use either the function for a direct call or the index for an indirect one.

In both of those, a downside is that we'd need to place every exported function in the table, and also export those indexes.

cc @sbc100 @dschuff @awtcode

@sbc100
Copy link
Collaborator

sbc100 commented Mar 8, 2019

I also realized this problem recently.

I like the idea of (1). This means that the wasm table becomes the equivalent of the ELF PLT, where all calls go through a level of indirection. This also allows for things like symbol interposition and runtime redirection via table.set().

I think the way it would work is that DLLs that import external functions would instead import a global holding the "address" (table index) of a function. So all calls to non-local-bound symbols would look like:

 import global plt.$foo
 ...
 get_global $foo
 call_indirect

The JS loader would when construct plt as a map from function name to table index.

As an optimization to allow for at least some cross-DLL-direct-calling, we could distinguish between address-take functions (which we import and call by table index), and non-address-taken function which we can import directly and call directly (but never put in a table).

@kripken
Copy link
Member Author

kripken commented Mar 11, 2019

Yeah, perhaps the importing module would import both the function and its index in the table, and then use the former for direct calls, and the latter for indirect ones. Then we can just remove the current wrapper code.

This all does mean that modules must place all exports in the table, and must export both the function and the table index, which seems annoying, but perhaps unavoidable.

@sbc100
Copy link
Collaborator

sbc100 commented Mar 11, 2019

Yeah, perhaps the importing module would import both the function and its index in the table, and then use the former for direct calls, and the latter for indirect ones. Then we can just remove the current wrapper code.

This all does mean that modules must place all exports in the table, and must export both the function and the table index, which seems annoying, but perhaps unavoidable.

Does it?

Couldn't modules just export the functions, and then import all the table address for any function they export? Then the dynamic loader could be in charge or building the table from the combined exports of all the modules?

I guess each DLL might still have its own private table section for "hidden" (non-dll-exported) symbols.

@awtcode
Copy link
Contributor

awtcode commented Mar 12, 2019

Just to clarify, when we have code like the following in the main module:

globalFnPtr == sidey

where sidey is a function imported from a side module. Are we comparing the function's address or the table index? If it is the former, does it mean that the address needs to be imported from the side module as well?

@sbc100
Copy link
Collaborator

sbc100 commented Mar 12, 2019

With wasm a function's address is its table index, and least that is how we model the function address in C/C++ code today. So that operation is a comparison of two table indexes.

One approach being suggested here is that we (1) is that we only import the table index and then all call's become indirect calls (at least all calls across the DLL boundary).

@kripken
Copy link
Member Author

kripken commented Mar 12, 2019

Couldn't modules just export the functions, and then import all the table address for any function they export? Then the dynamic loader could be in charge or building the table from the combined exports of all the modules?

Good point. Yeah, talking with @dschuff, it does seem like a good approach would be to have the dynamic loader do it. So modules export the functions, and the loader tracks function name => wasm function. Then we have a JS Proxy object for the imports, and when we see someone imports __address$foo (or some other way to say "the index of foo"), then we add that function to the table if it isn't there yet, and now have an index we can give the import. In other words, lazy adding to the table.

sbc100 added a commit that referenced this issue Mar 22, 2019
This test currently is currently expected to fail, until we fix
this issue;

See #8268.
@sbc100
Copy link
Collaborator

sbc100 commented Mar 22, 2019

I'd love to see this fixed. I added a failing test case for it to confirm how it currently fails.

@kripken do you have time to look into this or should I?

@sbc100
Copy link
Collaborator

sbc100 commented Mar 22, 2019

I like the idea of assign the table index's lazily, however that does mean that the defining module for foo will also need to call __address$foo for to find its address. i.e. nobody will know the address of foo until runtime.

Another approach would be for modules to export __address$foo for every function foo that they export? Would double the number of exports though.

@awtcode
Copy link
Contributor

awtcode commented Mar 25, 2019

@kripken and @sbc100 , I found another issue last week which might be related. The problem happens when we pass a derived pointer to a function in another module. The dynamic_cast fails in the other module.

main.cpp
`EMSCRIPTEN_KEEPALIVE extern "C"
void mainy(base * arg) {
derived * temp = dynamic_cast < derived * > (arg);
printf("temp:%p\n", temp);
}

extern "C" void sidey(base* arg);

int main() {
derived* temp = new derived();
printf("main: temp:%p\n", temp);
sidey(temp);
return 0;
}`

side.cpp
`EMSCRIPTEN_KEEPALIVE extern "C"
void sidey(base* arg) {
derived * temp1 = dynamic_cast < derived * > (arg);
printf("sidey: arg:%p temp1:%p\n", arg, temp1);

base* temp = new derived();
mainy(temp);
}`

@kripken
Copy link
Member Author

kripken commented Mar 25, 2019

@sbc100 - I don't think I'll have time for this, but in any case, we should probably just focus on the LLVM wasm backend anyhow, which you're doing anyway (that is, I don't see a point to fixing this in fastcomp, since we hope to deprecate it).

@kripken
Copy link
Member Author

kripken commented Mar 25, 2019

I think we have a reasonable plan here, where the dynamic loader can assign the final fixed index in the table to any function, even of a function in a module not yet loaded (it reserves the spot, and will fill it in later).

Issues left:

  • It seems like for globals we do need mutable imported globals, as the dynamic loader can't actually figure them out ahead of time (each function takes one slot; a global is part of a module's memory allocation, whose size we only see later, unless we parse it out of the modules).
  • For direct calls, we can hook them up directly, but for cycles (a module not yet loaded) it seems we have to add a stub + an indirect call.

sbc100 added a commit that referenced this issue Mar 25, 2019
This test currently is currently expected to fail, until we fix
this issue;

See #8268.
sbc100 added a commit to emscripten-core/emscripten-fastcomp that referenced this issue Apr 27, 2019
This means that the dynamic linker then assigned function pointers
and both main module and side module use `fp$foo` accessors to lookup
function addresses, just like they do for global addresses.

This allows the test_dylink_function_pointer_equality test in emscripten
to pass.

See emscripten-core/emscripten#8268.
sbc100 added a commit to emscripten-core/emscripten-fastcomp that referenced this issue Apr 27, 2019
…via `fp$foo` functions

This means that the dynamic linker then assigned function pointers
and both main module and side module use `fp$foo` accessors to lookup
function addresses, just like they do for global addresses.

This allows the test_dylink_function_pointer_equality test in emscripten
to pass.

See emscripten-core/emscripten#8268.
VirtualTim pushed a commit to VirtualTim/emscripten that referenced this issue May 21, 2019
This test currently is currently expected to fail, until we fix
this issue;

See emscripten-core#8268.
sbc100 added a commit to emscripten-core/emscripten-fastcomp that referenced this issue Sep 4, 2019
…via `fp$foo` functions

This means that the dynamic linker then assigned function pointers
and both main module and side module use `fp$foo` accessors to lookup
function addresses, just like they do for global addresses.

This allows the test_dylink_function_pointer_equality test in emscripten
to pass.

See emscripten-core/emscripten#8268.
sbc100 added a commit that referenced this issue Sep 4, 2019
There is upstream fastcomp change coming which addresses function
pointer equality between wasm modules (i.e. SIDE_MODULE/MAIN_MODULE):
emscripten-core/emscripten-fastcomp#259

This change prepares for that change by handling the new
externFunctions metadata attribute.  This new key represents a list
of all non-static functions in a relocatable module which are address
taken.  The module will call fp$<name>() in order to find out the
runtime address (table entry) for such functions.

Once both these changes land they they will fix #8268.
sbc100 added a commit to emscripten-core/emscripten-fastcomp that referenced this issue Sep 4, 2019
…via `fp$foo` functions

This means that the dynamic linker then assigned function pointers
and both main module and side module use `fp$foo` accessors to lookup
function addresses, just like they do for global addresses.

This allows the test_dylink_function_pointer_equality test in emscripten
to pass.

See emscripten-core/emscripten#8268.
sbc100 added a commit that referenced this issue Sep 5, 2019
There is upstream fastcomp change coming which addresses function
pointer equality between wasm modules (i.e. SIDE_MODULE/MAIN_MODULE):
emscripten-core/emscripten-fastcomp#259

This change prepares for that change by handling the new
externFunctions metadata attribute.  This new key represents a list
of all non-static functions in a relocatable module which are address
taken.  The module will call fp$<name>() in order to find out the
runtime address (table entry) for such functions.

Once both these changes land they they will fix #8268.
sbc100 added a commit that referenced this issue Sep 5, 2019
There is upstream fastcomp change coming which addresses function
pointer equality between wasm modules (i.e. SIDE_MODULE/MAIN_MODULE):
emscripten-core/emscripten-fastcomp#259

This change prepares for that change by handling the new
externFunctions metadata attribute.  This new key represents a list
of all non-static functions in a relocatable module which are address
taken.  The module will call fp$<name>() in order to find out the
runtime address (table entry) for such functions.

Once both these changes land they they will fix #8268.
sbc100 added a commit that referenced this issue Sep 5, 2019
)

There is upstream fastcomp change coming which addresses function
pointer equality between wasm modules (i.e. SIDE_MODULE/MAIN_MODULE):
emscripten-core/emscripten-fastcomp#259

This change prepares for that change by handling the new
externFunctions metadata attribute.  This new key represents a list
of all non-static functions in a relocatable module which are address
taken.  The module will call fp$<name>() in order to find out the
runtime address (table entry) for such functions.

Once both these changes land they they will fix #8268.
sbc100 added a commit to emscripten-core/emscripten-fastcomp that referenced this issue Sep 12, 2019
…via `fp$foo` functions

This means that the dynamic linker then assigned function pointers
and both main module and side module use `fp$foo` accessors to lookup
function addresses, just like they do for global addresses.

This allows the test_dylink_function_pointer_equality test in emscripten
to pass.

See emscripten-core/emscripten#8268.
sbc100 added a commit to emscripten-core/emscripten-fastcomp that referenced this issue Sep 12, 2019
…via `fp$foo` functions (#259)

This means that the dynamic linker then assigned function pointers
and both main module and side module use `fp$foo` accessors to lookup
function addresses, just like they do for global addresses.

This allows the test_dylink_function_pointer_equality test in emscripten
to pass.

See emscripten-core/emscripten#8268.
sbc100 added a commit that referenced this issue Sep 12, 2019
Recent changes to fastcomp mean that this test now passes.

Fixes: #8268
sbc100 added a commit that referenced this issue Sep 13, 2019
Recent changes to fastcomp mean that this test now passes.

Fixes: #8268
belraquib pushed a commit to belraquib/emscripten that referenced this issue Dec 23, 2020
…scripten-core#9393)

There is upstream fastcomp change coming which addresses function
pointer equality between wasm modules (i.e. SIDE_MODULE/MAIN_MODULE):
emscripten-core/emscripten-fastcomp#259

This change prepares for that change by handling the new
externFunctions metadata attribute.  This new key represents a list
of all non-static functions in a relocatable module which are address
taken.  The module will call fp$<name>() in order to find out the
runtime address (table entry) for such functions.

Once both these changes land they they will fix emscripten-core#8268.
belraquib pushed a commit to belraquib/emscripten that referenced this issue Dec 23, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
3 participants