gcc8+ memory usage regression for compiling indexing_op.o #18501
Comments
Fixing this would be a welcome improvement. Did you investigate if the high memory consumption is consistent among gcc and clang, as well as still present on gcc 9 (or 10) / clang 10?
Hi @leezu, the compiler I used is the latest version of gcc, namely gcc 10.1.0.
I believe I could help, since we are clearly having the same problem. I don't know much about cross-compiling, but I have access to a computer with 16GB+ of memory.
I remember that it takes less than 8GB of memory to build older versions of MXNet. If we can reduce the memory cost, it will help with building MXNet on laptops and edge machines, which have less than 8GB/16GB of memory.
Only 4 files take more than 8GB.
@leezu sorry that I did not check the 1.7 and 1.x branches.
There seem to be some more issues. In certain build configurations with llvm 7, many of the numpy object files blow up
Hi @leezu, @wkcn, since this is only a build issue when building MXNet from source on certain machines (those with less than 16GB of memory), I suggest not tagging it as a blocking issue for 1.7.0, and including the fix if it is available before the release happens.
It would be great if you could make a prebuilt package that works on a Raspberry Pi with armv7, because I tried to build all versions from 1.2.1 to 1.6.0 and failed.
Hi @ciyongch, I agree that we don't need to tag it as a blocking issue, and the issue can be fixed after MXNet 1.7 is released. After the problem is addressed, we can backport the PR to the 1.7.x branch.
Thanks for confirming, @wkcn :)
@woreom It seems that the pre-built MXNet 1.5 package will not be uploaded because of ASF licensing policy, but pre-built MXNet 1.7 and 2.0+ on ARM may be uploaded. Before that, you can try the native build or cross-compiling, following the instructions: https://mxnet.apache.org/get_started?platform=devices&iot=raspberry-pi&
I disagree. Official MXNet releases are source releases. At this point in time, there exist 0 compliant binary releases. I didn't check if this is present in 1.7, but if it is, it certainly is a release blocker in my opinion. Note that this is probably a regression due to the work on MXNet 2. It's not acceptable to introduce such regressions in the 1.x series.
I measured the overall memory consumption during compilation using the Linux control group feature, via https://github.com/gsauthof/cgmemtime. Results are for v1.7.x, v1.6.x and v1.5.x.
This is preliminary in that it measures parallel compilation, thus memory usage is very high. Overall there's a 44% increase from 1.5.
Doing a single-process build of 1.7.x branch ( Child user: 4167.479 s
I'm trying to use cgmemtime. I run
cmake -GNinja -DUSE_CUDA=0 ..
cgmemtime ninja
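For readers without cgmemtime, the measurement described above can be roughly approximated with Python's standard library. This is only a sketch, not the method used in this thread: the function name peak_child_rss_kib is hypothetical, and the number it reports is the peak resident set size of the single largest child process that has been waited for, not the cgroup-wide high-water mark that cgmemtime reports. On Linux the value is in KiB (macOS reports bytes).

```python
import resource
import subprocess


def peak_child_rss_kib(cmd):
    """Run cmd to completion and return the peak RSS of the largest
    waited-for child process so far (KiB on Linux).

    Note: this is the maximum over individual child processes, not the
    sum across processes running at the same time, so it differs from
    a cgroup-based tool such as cgmemtime.
    """
    subprocess.run(cmd, check=True)
    return resource.getrusage(resource.RUSAGE_CHILDREN).ru_maxrss


# Hypothetical usage for a single-process build:
#   peak = peak_child_rss_kib(["ninja", "-j1"])
#   print(f"peak child RSS: {peak / 1024 / 1024:.2f} GiB")
```

With a single-process build, the largest child is typically the one compiler invocation that uses the most memory, which is the quantity this issue is about.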
Thanks @wkcn. I'll report the same with gcc7. You are using gcc10, right?
Single process build of MXNet master with gcc7 gives the following results:
That's a 24% increase compared to 1.7, but less than 3GB high-water. So I don't think we have any blocking issue here. @wkcn I suggest you reduce the number of parallel builds to stay under 16GB. Also recommend to use
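The suggestion to reduce the number of parallel builds can be turned into a simple estimate. The sketch below is an illustration under a pessimistic assumption, namely that every parallel job reaches its peak memory at the same moment; the function name max_parallel_jobs and the example numbers are hypothetical and not measurements from this thread.

```python
import os


def max_parallel_jobs(memory_budget_gib, peak_per_job_gib, cpu_count=None):
    """Return how many compile jobs fit in the memory budget (at least 1).

    Assumes the worst case in which all jobs hit their peak memory
    usage simultaneously.
    """
    if peak_per_job_gib <= 0:
        raise ValueError("peak_per_job_gib must be positive")
    if cpu_count is None:
        cpu_count = os.cpu_count() or 1
    by_memory = int(memory_budget_gib // peak_per_job_gib)
    # Never go below one job, and never exceed the available cores.
    return max(1, min(cpu_count, by_memory))


# Illustrative numbers only: a 16 GiB machine, 3 GiB assumed per job.
jobs = max_parallel_jobs(16, 3, cpu_count=8)
print(f"ninja -j{jobs}")
```

The result would then be passed to the build tool's job-count option, for example ninja -j followed by the number. If a single file alone exceeds the budget, the function still returns 1, and extra swap or a different compiler version is needed, as discussed in this issue.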
Hi @leezu, I found the cause.
Besides, since the compiler flags are different in different build methods (for example
@wkcn thank you for investigating this. The regression in gcc is quite serious. Would you check if there is a report at https://gcc.gnu.org/bugs/ and potentially open a new bug report? Eventually gcc10 will be shipped by default on many platforms and this issue may affect more users later.
@leezu Sorry, I do not know how to find the bug report at https://gcc.gnu.org/bugs/
@wkcn the bugtracker is linked on the page. It's https://gcc.gnu.org/bugzilla/
@leezu Thank you! I guess that the bug is a memory leak in the compiler, gcc 10.1.0.
According to #15393 (comment) the leak already occurs with gcc8.
Description
Hi there, I am trying to build MXNet 2.0 (CPU only) on my laptop with 16GB of memory. I found that it takes over 16GB of memory to compile a single file, src/operator/tensor/indexing_op.o. I needed to create an extra 8GB of virtual memory to build this file.
Is it possible to divide indexing_op into multiple small files to reduce the memory cost?
Environment
The latest code of MXNet 2.0
Arch Linux
Conclusion
The issue has been solved.
The memory cost depends on the compiler and the build method (ninja or make).
I built indexing_op.o with ninja using different versions of gcc.
In addition, because the compiler flags differ between build methods (for example, the Makefile enables -funroll-loops, which takes more memory), the memory cost differs as well.
The solution is to build MXNet with g++-6 or g++-7.