Allow linking custom LLVM optimization passes #429

alendit · 2018-12-04T09:17:19Z

This PR is related to the discussion in numba/numba#3531, specifically to the fact that numba currently can't optimize reference counting across basic block boundaries. The underlying reason is the split between the C++ and Python module representations and the inability to easily convert from the C++ one to the Python one. As a consequence, the only reasonable way to implement complex optimization passes is to write them in C++ and execute them through llvmlite.

This PR adds the following:

- An example hello pass, pretty much a copy of the one in the LLVM source code. It acts as a basic test case and as an example for developers who wish to implement their own pass.
- Adjusted exported symbols in libllvmlite.so, which now include the symbols from libLLVMCore. This, along with using RTLD_GLOBAL, is required to subsequently load the DSO containing passes. Unfortunately it increases the size of libllvmlite.so by around ~~5%~~ 2%, from 53MB to ~~56MB~~ 54MB on my machine. Any suggestions on how to do this more efficiently are, of course, welcome.
- An extension of the PassManager interface with ~~load_shared_lib~~, list_registered_passes and add_pass_by_arg. In LLVM, arg refers to the unique pass identifier, as opposed to name, which is its human-readable form. I chose to keep this distinction, but we can adjust it.
- A method load_pass_plugin in llvmlite.bindings, which takes a path to a shared library and loads it into the address space.
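
To illustrate why RTLD_GLOBAL matters for the loading order described above, here is a minimal sketch using Python's ctypes. It is not the code from this PR: the helper names `load_global` and `load_plugin_after_core` and both path parameters are hypothetical, and it assumes a POSIX platform where `dlopen` semantics apply.

```python
import ctypes
import ctypes.util


def load_global(path):
    """dlopen `path` so its exported symbols are visible to libraries loaded later."""
    # RTLD_GLOBAL adds the library's symbols to the process-wide lookup scope.
    # ctypes normally defaults to RTLD_LOCAL, which keeps them private.
    return ctypes.CDLL(path, mode=ctypes.RTLD_GLOBAL)


def load_plugin_after_core(core_path, plugin_path):
    """Load the core library globally first, then a plugin that depends on it."""
    core = load_global(core_path)
    # Undefined symbols in the plugin can now be resolved against `core`.
    plugin = ctypes.CDLL(plugin_path)
    return core, plugin
```

In the setting of this PR, `core_path` would point at libllvmlite.so and `plugin_path` at the DSO containing the passes; the reverse order would leave the plugin's LLVM symbols unresolved.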

This PR is a prerequisite for another planned extension, which would enhance remove_redundant_nrt_refct to work across basic blocks.

codecov-io · 2018-12-04T11:01:47Z

Codecov Report

❗ No coverage uploaded for pull request base (master@6044afe).
The diff coverage is 93.9%.

@@            Coverage Diff            @@
##             master     #429   +/-   ##
=========================================
  Coverage          ?   92.35%           
=========================================
  Files             ?       33           
  Lines             ?     5283           
  Branches          ?      370           
=========================================
  Hits              ?     4879           
  Misses            ?      324           
  Partials          ?       80

alendit · 2018-12-05T09:06:19Z

@sklam @stuartarchibald I would really appreciate it if one of you could give me a hand figuring out the compilation issues on osx and windows.

On osx the build process seems to fail at the basic cmake sanity check because of a 32/64-bit mismatch. Is this an issue with CI? Should I configure cmake differently?

On windows the linker complains about unresolved external symbols. Isn't it possible to create a DLL with unresolved symbols which will be resolved at load time? I thought cmake's add_library(... MODULE) does this, but apparently it fails.

alendit · 2018-12-10T13:41:13Z

Ok, I think I got it for osx, but Windows being Windows is another matter entirely. Apparently it doesn't support weak dynamic linking (http://lists.llvm.org/pipermail/llvm-dev/2015-November/091960.html), which makes the whole approach hard if not impossible. I'll have to think about it.

alendit · 2018-12-11T12:32:31Z

It doesn't look good on windows at all.

The problem is the following: you can't create a DLL with undefined symbols which would be resolved at load time. But linking the whole libllvmlite.so or LLVM itself into the pass plugin DLL leads to two sets of global static variables being created. This is most prominently a problem for PassRegistry::getPassRegistry, which is used a lot but returns a different address depending on which DLL it is called from.
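
The duplicated-statics problem can be shown with a small Python analogy. This is only a model, not LLVM code: executing the same source in two separate modules stands in for statically linking the same library into two DLLs, and the names `REGISTRY_SOURCE`, `load_copy`, and `get_pass_registry` are hypothetical.

```python
import types

# Source of a tiny "library" with one piece of global state.
REGISTRY_SOURCE = '''
_registry = []          # stands in for a global static

def get_pass_registry():
    return _registry
'''


def load_copy(name):
    """Execute the source in a fresh module, as if statically linked into `name`."""
    module = types.ModuleType(name)
    exec(REGISTRY_SOURCE, module.__dict__)
    return module


host = load_copy("libllvmlite")    # copy embedded in the host library
plugin = load_copy("pass_plugin")  # second copy embedded in the plugin DLL

plugin.get_pass_registry().append("hello")  # the plugin registers its pass

print(host.get_pass_registry() is plugin.get_pass_registry())  # False
print(host.get_pass_registry())  # []
```

The pass ends up in the plugin's private registry, so the host never sees it, which mirrors the behaviour described above.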

I'm baffled that something so simple isn't possible under Windows (though LLVM carries a part of the blame with their excessive use of global statics). And I'm as always open to suggestions, because I think llvmlite and numba could benefit greatly from the ability to load custom optimization passes.

alendit · 2019-02-01T09:40:09Z

So, with Windows support being highly unlikely, how about hiding this feature behind a flag, i.e. making it available only to Linux and Mac users?

As I've mentioned, the reason I'd like this feature merged is that it would allow adding custom optimizations on a ModuleRef instance. Concretely, numba currently has suboptimal handling of incref/decref pairs, which are only optimized away at the basic block level. On the other hand, it's trivial to create an optimization pass in C++ which would remove the calls based on the CFG information. It would also eliminate the need for the ModuleRef->str->ModuleRef round trip for optimization and thus improve compilation performance.
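
The limitation of block-level pruning can be sketched on a toy IR. This is not numba's actual pass: the instruction tuples, `prune_block`, and `prune_function` are hypothetical, and the model ignores aliasing and any instruction that could make removing a pair unsafe.

```python
def prune_block(instrs):
    """Remove incref/decref pairs on the same value inside one basic block."""
    pending = {}  # value name -> indices of increfs not yet matched
    dead = set()  # indices of instructions to drop
    for i, (op, val) in enumerate(instrs):
        if op == "incref":
            pending.setdefault(val, []).append(i)
        elif op == "decref" and pending.get(val):
            dead.add(pending[val].pop())
            dead.add(i)
    return [ins for i, ins in enumerate(instrs) if i not in dead]


def prune_function(blocks):
    """Apply the per-block pruning to every block independently."""
    return {name: prune_block(body) for name, body in blocks.items()}


func = {
    "entry": [("incref", "a"), ("use", "a"), ("decref", "a"),
              ("incref", "b"), ("br", "exit")],
    "exit": [("use", "b"), ("decref", "b"), ("ret", None)],
}
pruned = prune_function(func)
```

Here the pair on `a` is removed, but the incref of `b` in `entry` and its decref in `exit` both survive, because no single block sees the whole pair. Removing it needs the CFG information mentioned above.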

Of course, hiding this feature behind a flag would mean that Windows users would be left out, which is very unfortunate. Another option would be to "hard compile" an additional set of numba-specific optimization passes into libllvmlite.so, which would not be great for modularity but would cover all users.

gmarkall · 2021-12-21T11:37:51Z

@alendit Many thanks for your efforts on this PR so far, and I'm sorry that we haven't been able to make more progress with it.

I'm spending some time going through the llvmlite PR backlog. Now that #634 is merged, which adds a custom pass to prune refcounts on an intra-block basis, my understanding is that what this PR would still add is the ability to load / link a custom pass at runtime from outside llvmlite. Is that understanding correct? If that is correct, do you think there is sufficient value in pursuing this PR to completion?

Given that it seems problematic on Windows, that there's probably a non-trivial amount of work involved in updating this patch for current master, and that we can now at least include custom passes within the llvmlite build, my guess is that it might be better to close this PR. However, I'm happy to help push this forward if you see value here. Additionally, I'm keen to be corrected if there are any misunderstandings in any of the above :-)

alendit mentioned this pull request Dec 5, 2018

Investigate exposing Arrow StringBuilder to numba xhochy/fletcher#2

Closed

alendit force-pushed the allow_linking_passes branch from 489e804 to 9e7a246 Compare December 6, 2018 08:15

Dimitri Vorona added 26 commits December 7, 2018 09:19

Basic implementation

2ff1221

Make the test more robust

1d63213

Reorganize passes

1c97e27

Add pass listing

591fa04

Add docs and minor api changes

83877a3

Remove extra new line

e1f8ec6

Remov unnecessary changes from transforms.cpp

421602e

Add new line at the end of the file

0894899

Fix test

72201d2

Lower required cmake version

f099f08

Don't use nonlocal

a4888d7

Avoid named arguments for python2's sake

cdec2d6

Correct extensions for different platforms

cfc49ed

cd using an absolute path

3fb8e9e

Check correctly raised exception and adjust docs

460f6b3

More adjustments for windows

a88b8b8

Prevent double loading of shared libraries

70f6cd3

Remove unnecessary no-rtti option

f5876b7

Rework loading demo pass

38027a0

Use add definitions

b252fee

Use the correct form for add definitions

4e01749

Add windows specific definitions

5bc48ad

Modify linking

1a73def

Remov clutter

91c72a4

Check correct arg name

6acbfa4

Restructure pass plugin loading

53f4e0b

Dimitri Vorona added 14 commits December 10, 2018 10:08

Do something drastic

dde89b9

Do something drastic

2ea83bf

something even more drastic

3a57e48

Manually set llvm flags

c879b3b

Link to passes explicitly

6720cec

BUild dylib

9ebbc2c

Don't link passes explicitly

f1a449a

Use platform independent path

9ec52cc

Build a module on osx

e1f2cd4

Add dynamic lookup

c9f94ed

Revert part of last change

37aae81

Adjust build script

f1cb4f6

Fix utils

792ac93

Fix utils

1265e97

Dimitri Vorona added 7 commits December 10, 2018 16:03

Adjust to link on windows

4142064

Adjust build dir on windows

254afe9

Rework plugin dir loading

409764b

Add api export

38b3d2f

Use load library

7b6a015

Check for mscver

d40b9f6

Fix doc reference

19f17f0

stuartarchibald mentioned this pull request Jul 9, 2019

Numba nogil + dask threading backend results in no speed up (computation is slower) numba/numba#4261

Closed

gmarkall added this to the PR Backlog milestone Dec 21, 2021

gmarkall added the 4 - Waiting on author label Dec 21, 2021

ludgerpaehler mentioned this pull request Mar 10, 2023

Add Support for Compiler Plugins #915

Draft

esc removed this from the PR Backlog milestone Jul 13, 2023
