Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow linking custom LLVM optimization passes #429

Open
wants to merge 93 commits into
base: main
Choose a base branch
from

Conversation

alendit
Copy link
Contributor

@alendit alendit commented Dec 4, 2018

This PR is related to the dicussion in numba/numba#3531, concretely to the fact that numba can't optimize ref counting across block borders right now. The underlying reason is the split between C++/Python module representation and inability to easily convert from C++ to Python. As a consequence the only reasonable way to implement complex optimization passes is to write a pass in C++ and execute it using llvmlite.

This PR adds following things:

  • An example hello pass, pretty much a copy of the one in LLVM source code. It acts as the basic test case and as an example for the developers who wish to implement an own pass.

  • Adjust exported symbols in libllvmlite.so by including the symbols from libLLVMCore. This along with using RTLD_GLOBAL is required to subsequently load the DSO containing passes. Unfortunatelly it increases the size of libllvmlite.so by around 5% 2% from 53MB to 56MB 54MB on my machine. Any suggestions how to do it more efficiently are, of course, welcome.

  • Extend PassManager interface with load_shared_lib, list_registered_passes and add_pass_by_arg. arg is used in LLVM context to refer to the unique pass identifier, as opposed to name which is its human-readable form. I chose to keep this distinction, but we can adjust it.

  • Add method load_pass_plugin to llvmlite.bindings which takes a path to a shared library and loads it in the address space.

This PR is the prerequisite to another planed extension which would enhance remove_redundant_nrt_refct to work across basic blocks.

@codecov-io
Copy link

codecov-io commented Dec 4, 2018

Codecov Report

❗ No coverage uploaded for pull request base (master@6044afe). Click here to learn what that means.
The diff coverage is 93.9%.

@@            Coverage Diff            @@
##             master     #429   +/-   ##
=========================================
  Coverage          ?   92.35%           
=========================================
  Files             ?       33           
  Lines             ?     5283           
  Branches          ?      370           
=========================================
  Hits              ?     4879           
  Misses            ?      324           
  Partials          ?       80

@alendit
Copy link
Contributor Author

alendit commented Dec 5, 2018

@sklam @stuartarchibald I would really appreciate it if one of you could give me a hand at figuring out the compilation issues on osx and windows.

On osx the build process seem to fail on the basic cmake sanity check because of 32/64-bit mismatch. Is this an issue with CI? Should I configure cmake differently?

On windows the linker complains about unresolved external symbols. Isn't it possible to create a DLL with unresolved symbols which will be resolved at the load time? I thought cmake's add_library(... MODULE) does this, but apparently it fails.

@alendit
Copy link
Contributor Author

alendit commented Dec 10, 2018

Ok, I think I got it for osx, but Windows being Windows is wholly another matter. Apparently it doesn't support weak dynamic linking(http://lists.llvm.org/pipermail/llvm-dev/2015-November/091960.html) which makes the whole approach hard if not impossible. I'll have to think about it.

@alendit
Copy link
Contributor Author

alendit commented Dec 11, 2018

It doesn't look good on windows at all.

The problem is the following: you can't create a DLL with undefined symbols which would be resolved at load time. But linking the whole libllvmlite.so or LLVM itself into pass plugin DLL leads to two sets of global static variables being created. This is a problem most prominently in case of PassRegistry::getPassRegistry, which is used a lot, but returns a different address depending on from which DLL it's called.

I'm baffled that something so simple isn't possible under Windows (though LLVM carries a part of the blame with their excessive use of global statics). And I'm as always open to suggestions, because I think llvmlite and numba could benefit greatly from the ability to load custom optimization passes.

@alendit
Copy link
Contributor Author

alendit commented Feb 1, 2019

So, with Windows support being highly unlikely, how about hiding this feature behind a flag, i.e. make only available to Linux and Mac users?

As I've mentioned, the reason why I'd like this feature merged is that it would allow to add custom optimization on a ModuleRef instance. Concretely, numba right now has suboptimal handling of incref/decref pairs, which are only optimized await on a basic block level. On the other hand, it's trivial to create an optimization pass in C++ which would remove the call based on the CFG information. It would also eliminate the need for ModuleRef->str->ModuleRef round trip for optimization and thus improve compilation performance.

Of course, hiding this feature behind a flag would mean that Windows users would be left out, which is very unfortunate. Another option would be to "hard compile" an additional set of numba-specific optimization passes into libllvmlite.so which would be not great for modularity, but cover all users.

@gmarkall
Copy link
Member

@alendit Many thanks for your efforts on this PR so far, and I'm sorry that we haven't been able to make more progress with it so far.

I'm spending some time going through the llvmlite PR backlog. Now that #634 is merged, which adds a custom pass to prune refcounts on an intra-block basis, my understanding is that what this PR would still add is the ability to load / link a custom pass at runtime from outside llvmlite. Is that understanding correct? If that is correct, do you think there is sufficient value in pursuing this PR to completion?

My guess would be that given it seems problematic on Windows, that there's probably a non-trivial amount of work involved in updating this patch for current master, and that we can now at least include custom passes within the llvmlite build, it might seem better to close this PR - however, I'm happy to help push this forward if you see value here. Additionally I'm keen to be corrected if there are any misunderstandings in any of the above :-)

@esc esc removed this from the PR Backlog milestone Jul 13, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants