This repository has been archived by the owner on Nov 17, 2023. It is now read-only.

dynamic load libnvrtc.so #17858

Closed
leezu opened this issue Mar 17, 2020 · 8 comments


@leezu
Contributor

leezu commented Mar 17, 2020

Description

Compiling MXNet with nvrtc support dynamically links libcuda.so. However, libcuda.so is part of the CUDA driver and is typically not available on CPU-only machines. As a result, on CPU-only machines libmxnet.so has missing dependencies and can't be loaded.

Unfortunately, parts of the MXNet build need to dlopen libmxnet.so, in particular OpWrapperGenerator.py for the cpp-package. This breaks compiling MXNet with nvrtc + cpp-package on CPU-only machines (as used by our CI).

As a workaround, the stub driver can be added to LD_LIBRARY_PATH. A better solution may be to dlopen libnvrtc and thereby drop the dependency on libcuda.so.

CC: @ptrendx

@ptrendx
Member

ptrendx commented Mar 17, 2020

I agree it makes sense to dlopen it.

@ChaiBapchya
Contributor

@mxnet-label-bot add [bug]
The workaround (adding the stub driver to LD_LIBRARY_PATH) is great, but without it MXNet breaks. So I guess this needs to be fixed and is a bug rather than a feature request.

@ptrendx
Member

ptrendx commented May 11, 2020

Hi @mseth10, do you think the utilities that you introduced in the dynamic library loading support PR could be used for this?

@mseth10
Contributor

mseth10 commented May 11, 2020

Hi @ptrendx, if you need a generic dlopen for this, I'd suggest using ctypes CDLL.
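
For illustration only, here is a minimal sketch of what probing for the driver with ctypes CDLL could look like from Python. The helper name `try_load` is hypothetical and not part of MXNet, and the candidate names assume the usual Linux soname of the CUDA driver library (`libcuda.so.1`).

```python
import ctypes


def try_load(names):
    """Return a ctypes.CDLL for the first name that loads, else None.

    `try_load` is a hypothetical helper for illustration, not an MXNet API.
    """
    for name in names:
        try:
            return ctypes.CDLL(name)
        except OSError:
            # Library not found or not loadable; try the next candidate.
            continue
    return None


# Assumption: the CUDA driver library is installed under one of these names.
driver = try_load(["libcuda.so.1", "libcuda.so"])
print("CUDA driver available:", driver is not None)
```

On a machine without the CUDA driver, `try_load` returns `None` instead of raising, so the caller can fall back to a CPU-only path.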

@ptrendx
Member

ptrendx commented May 11, 2020

But that would be just for Python then, right? I would need it to call functions from C++ and the load_lib function from your PR seems like it already handles all the differences in OSs and stuff.

@mseth10
Contributor

mseth10 commented May 11, 2020

Sorry, I thought you were referring to the Python API introduced by the PR. That has a lot of additional functionality.
We can use the C++ utility function lib_load: https://github.com/apache/incubator-mxnet/blob/9d440868603ad26b702e12ddd2587e5c4b56e42b/src/initialize.cc#L113

@ptrendx
Member

ptrendx commented May 3, 2021

I thought about this, and the real problem actually seems to be libcuda.so, right? libnvrtc.so does not depend on libcuda.so directly; it is just that actually using the kernels compiled with libnvrtc.so requires the driver API. So it should be fine to link normally to libnvrtc.so and just dlopen libcuda.so, right @leezu?
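
The idea of deferring the libcuda.so load until the driver API is actually needed can be sketched as follows. This is a Python illustration of the pattern only, not the C++ change discussed in this thread; the class `LazyLibrary` and its `loader` parameter are hypothetical names introduced for the example, and `cuInit` is used merely as a well-known driver API symbol.

```python
import ctypes


class LazyLibrary:
    """Defer loading a shared library until a symbol is first requested.

    Hypothetical helper for illustration; not part of MXNet.
    """

    def __init__(self, names, loader=ctypes.CDLL):
        self._names = list(names)
        self._loader = loader  # injectable so the pattern can be tested
        self._handle = None

    def _load(self):
        # Load once, on first use, and cache the handle.
        if self._handle is None:
            errors = []
            for name in self._names:
                try:
                    self._handle = self._loader(name)
                    break
                except OSError as exc:
                    errors.append(f"{name}: {exc}")
            else:
                # No candidate could be loaded.
                raise RuntimeError("could not load any of: " + "; ".join(errors))
        return self._handle

    def symbol(self, name):
        """Resolve a symbol, loading the library first if necessary."""
        return getattr(self._load(), name)


# Constructing the object never touches the driver, so this line is safe
# on CPU-only machines. Assumption: these are the driver's library names.
cuda_driver = LazyLibrary(["libcuda.so.1", "libcuda.so"])
```

With this shape, a missing driver only surfaces as an error when a driver function such as `cuda_driver.symbol("cuInit")` is requested, not when the surrounding module is loaded.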

@leezu
Contributor Author

leezu commented May 4, 2021

@ptrendx yes
