Skip to content
This repository has been archived by the owner on Nov 17, 2023. It is now read-only.

Consider MXNET_BUILD_SHARED_LIBS flag #18365

Merged
merged 3 commits into from
May 22, 2020
Merged

Consider MXNET_BUILD_SHARED_LIBS flag #18365

merged 3 commits into from
May 22, 2020

Conversation

cbalint13
Copy link
Contributor

@cbalint13 cbalint13 commented May 19, 2020

Description

  1. Consider MXNET_BUILD_SHARED_LIBS build flag

Comments

Building pure shared version is distro's releng favorite way.

Thank You !

@cbalint13 cbalint13 requested review from leezu and szha as code owners May 19, 2020 15:07
@mxnet-bot
Copy link

Hey @cbalint13 , Thanks for submitting the PR
All tests are already queued to run once. If tests fail, you can trigger one or more tests again with the following commands:

  • To trigger all jobs: @mxnet-bot run ci [all]
  • To trigger specific jobs: @mxnet-bot run ci [job1, job2]

CI supported jobs: [edge, centos-cpu, unix-gpu, sanity, windows-gpu, miscellaneous, unix-cpu, website, clang, windows-cpu, centos-gpu]


Note:
Only following 3 categories can trigger CI :PR Author, MXNet Committer, Jenkins Admin.
All CI tests must pass before the PR can be merged.

@leezu
Copy link
Contributor

leezu commented May 19, 2020

Thanks @cbalint13 . Actually we may improve the defaults for upcoming MXNet 2. I suggest to make the "build shared" the default and optionally allow to build only the static library. (ie following the CMAKE_BUILD_SHARED) approach.

What do you think?

@cbalint13
Copy link
Contributor Author

cbalint13 commented May 19, 2020

@leezu ,

Let's try, will see how will pass CI.

@cbalint13 cbalint13 changed the title Add optional MXNET_BUILD_SHARED Introduce MXNET_BUILD_STATIC=OFF May 19, 2020
@cbalint13
Copy link
Contributor Author

@leezu ,

UPDATE: Followed your proposal, updated the code.

  • The way CI fails is beyond of this PR cause (noticed others fail too, i.e cuda.h not found).
  • Is it still acceptable feature for trunk ?

Let me know if need more work on it.

@leezu
Copy link
Contributor

leezu commented May 19, 2020

Thanks @cbalint13. I wonder if we should only build the static library if MXNET_BUILD_STATIC=ON, instead of building both static and shared libraries. If yes, would MXNET_BUILD_STATIC even be needed or should we simply use the value of CMAKE_BUILD_SHARED (but default to CMAKE_BUILD_SHARED=ON). What do you think?

@cbalint13 cbalint13 changed the title Introduce MXNET_BUILD_STATIC=OFF Introduce MXNET_BUILD_STATIC=ON May 19, 2020
@cbalint13
Copy link
Contributor Author

cbalint13 commented May 19, 2020

@leezu ,

Uhh, Ohh.

  • Can't make pure shared build without hurting CI very badly.

  • cmake's internal CMAKE_BUILD_SHARED would be perfect (but it is ON by default).

  • I ended up using MXNET_BUILD_STATIC=ON to keep current behaviour, but also:

    • CMAKE_BUILD_SHARED internal by default is ON (too big change for CI state).
    • In current static mode a large static object and a small shared surrogate is build, that's it now.
    • CI isn't hurt for now, by switching MXNET_BUILD_STATIC=OFF one can fix CI side.
    • distro's releng can handle MXNET_BUILD_STATIC=OFF for now if they want on linux.
    • When CI is solved we can replace MXNET_BUILD_STATIC with CMAKE_BUILD_SHARED internal.

@leezu
Copy link
Contributor

leezu commented May 19, 2020

In current static mode a large static object and a small shared surrogate is build, that's it now.

Actually not. There is a hack further down in the cmake file where the complete static object is included as is into the shared object. So there is not much difference between the two besides the format.

Can't make pure shared build without hurting CI very badly.

What do you mean?

@cbalint13 cbalint13 changed the title Introduce MXNET_BUILD_STATIC=ON Consider CMAKE_BUILD_SHARED flag May 20, 2020
@cbalint13
Copy link
Contributor Author

@mxnet-bot run ci [all]

@mxnet-bot
Copy link

Jenkins CI successfully triggered : [windows-cpu, edge, sanity, website, centos-gpu, clang, unix-cpu, miscellaneous, unix-gpu, centos-cpu, windows-gpu]

@cbalint13
Copy link
Contributor Author

cbalint13 commented May 20, 2020

In current static mode a large static object and a small shared surrogate is build, that's it now.

Actually not. There is a hack further down in the cmake file where the complete static object is included as is into the shared object. So there is not much difference between the two besides the format.

I am aware.

Can't make pure shared build without hurting CI very badly.

What do you mean?

Will not pass CI.

@leezu
Copy link
Contributor

leezu commented May 20, 2020

Will not pass CI.

Why do you think so. The CI does not rely on the libmxnet.a.

@cbalint13
Copy link
Contributor Author

@leezu

Will not pass CI.

Why do you think so. The CI does not rely on the libmxnet.a.

  • It seems, had the impression that all test units links libmxnet.a

Can advice me on MacOS CI failure ? Dont get the meaning of error (can't see what symbol):

E   AttributeError: dlsym(0x7fdb866c3810, MXGetLastError): symbol not found

Thank You !

CMakeLists.txt Show resolved Hide resolved
tests/CMakeLists.txt Outdated Show resolved Hide resolved
CMakeLists.txt Outdated Show resolved Hide resolved
@leezu
Copy link
Contributor

leezu commented May 21, 2020

It seems, had the impression that all test units links libmxnet.a

That's part of the problem. It comes with huge link, disk space and CI network traffic overheads due to the number of unittests.

E AttributeError: dlsym(0x7fdb866c3810, MXGetLastError): symbol not found

I'm not yet sure about this error. Could you address above comments first (or state that you disagree) and then let's look more closely into this issue.

@szha
Copy link
Member

szha commented May 21, 2020

@leezu static object should be useful in building standalone C++ inference programs. If the size of test is a problem, we can strip the binary after compilation.

@leezu
Copy link
Contributor

leezu commented May 21, 2020

@szha sure. In that case just set BUILD_SHARED_LIBS=0.

This is only about simplifying the cmake file to follow standard practice, reducing the amount of codes and hacks used by mxnet build system.

@cbalint13
Copy link
Contributor Author

@leezu static object should be useful in building standalone C++ inference programs. If the size of test is a problem, we can strip the binary after compilation.

  • we did't get rid, still can build static objects if one desire
  • in my opinion, too big for normal releng purposes (even stripped).

@cbalint13
Copy link
Contributor Author

@leezu ,

  • Proposed leap is big, someone need to adjust CI (out of my scope).
  • I can re-propose PR it in earlier form (agree not elegant) and see that single one MacOS-CI issue.

@cbalint13 cbalint13 changed the title Consider CMAKE_BUILD_SHARED flag Consider BUILD_SHARED_LIBS flag May 21, 2020
@leezu
Copy link
Contributor

leezu commented May 21, 2020

Adopting the standard BUILD_SHARED_LIBS may take additional work to make sure all dependencies + the CI logic correctly handle the result. It would however result into a more "distro-friendly" build you quote, as the dependencies will also be built dynamically.

But to keep things simple in this PR, it's fine to use an MXNet specific setting for now. I modified your PR slightly to reintroduce the MXNet specific option.

@cbalint13
Copy link
Contributor Author

Adopting the standard BUILD_SHARED_LIBS may take additional work to make sure all dependencies + the CI logic correctly handle the result. It would however result into a more "distro-friendly" build you quote, as the dependencies will also be built dynamically.

Wow someone circumvent CI 🥇

But to keep things simple in this PR, it's fine to use an MXNet specific setting for now. I modified your PR slightly to reintroduce the MXNet specific option.

Thank you very much @leezu !

@cbalint13 cbalint13 changed the title Consider BUILD_SHARED_LIBS flag Consider MXNET_BUILD_SHARED_LIBS flag May 22, 2020
@leezu leezu merged commit 497bf7e into apache:master May 22, 2020
@leezu
Copy link
Contributor

leezu commented May 22, 2020

Thanks @cbalint13

AntiZpvoh pushed a commit to AntiZpvoh/incubator-mxnet that referenced this pull request Jul 6, 2020
…pache#18365)

* Consider BUILD_SHARED_LIBS flag

* Use MXNET_BUILD_SHARED_LIBS
bartekkuncer added a commit to bartekkuncer/incubator-mxnet that referenced this pull request Jun 14, 2021
szha pushed a commit that referenced this pull request Jul 13, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants