DArray: memory not fully recovered upon gc() #8912
Starting with one Julia worker, and executing the below multiple times,

…

I notice a leak the first time around but not on subsequent invocations of the …. However, if I change the …, there is additional memory retained in the worker every time around. There is also an increase in the memory size of the master, but only the first time. I suspect there are two issues here: I am pretty sure one is #6597, since both ….
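The original snippets are not quoted here, but a hypothetical sketch of this kind of trial, written against the current Distributed/DistributedArrays.jl API (the sizes, the one-worker setup, and the function structure are all assumptions, not the original code), might look like:

```julia
# Hypothetical sketch, not the original code from this comment.
using Distributed
addprocs(1)                      # one worker, as in the report
@everywhere using DistributedArrays

function trial()
    d = dzeros(10^7)             # DArray whose chunk lives on the worker
    d = nothing                  # drop the only local reference
    @everywhere GC.gc()          # request a collection on every process
end

for _ in 1:5
    trial()                      # per the report: worker memory grows across repeats
end
```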
I noticed another thing, which may be related or may be a separate issue. On Windows 8.1 (64-bit), Julia 0.4.0-dev+1318 7a7110b, when using distribute() to create a DArray over 12 cores from an array of size 100, the full memory appears to be used in each Julia subprocess. That is, the process that calls distribute(A) uses 2.6GB of memory, of which around 1.8GB is the array A, and afterwards all 11 worker processes (from addprocs(11)) use the same amount, 2.6GB each. This was not the case three weeks ago on 0.4.0-dev.

Another symptom is that the distribute() call is much slower than before, and during the several minutes it takes, only a few Julia processes are active. Initially the main Julia process uses 8% CPU (1/12) and one other Julia process also uses 8% (1/12), while the remaining 10 processes show 0% CPU activity. After a minute three processes are each using 8%, after another minute four, and so on, until all processes are active and 100% of the CPU is utilized; then the distribute() call returns. This is very different from what I observed when originally developing the code a few weeks ago, and I have not changed the code since.
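For concreteness, a hedged sketch of the reported setup (the element count is back-calculated from the ~1.8GB figure and is an assumption):

```julia
# Hedged sketch of the reported setup; sizes are approximations, not the original code.
using Distributed
addprocs(11)                          # 12 processes total, as in the report
@everywhere using DistributedArrays

A = rand(225_000_000)                 # 225e6 Float64s * 8 bytes ≈ 1.8 GB
@time dA = distribute(A)              # reported: takes minutes, workers activating one by one
```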
Here are a couple more problem reports for reference. As a workaround, removing all the workers (rmprocs(workers())) and restarting them (addprocs(ncpu)) every iteration seems to work, as sketched below.

I suspect the problem may go deeper than distributed arrays: in my case the distributed array is not that big, but the result I fetch from the workers (via pmap) is. The memory usage is consistent with those fetched values not being properly garbage collected. I will post a simple example to replicate if I can come up with one.
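A minimal sketch of that workaround, using an anonymous pmap job so that freshly added workers need no predefined code (the worker count and the job itself are placeholders):

```julia
# Sketch of the described workaround; `ncpu` and the job are placeholders.
using Distributed

ncpu = 4
for iter in 1:10
    addprocs(ncpu)
    # large fetched results are the suspected leak in the report above
    results = pmap(i -> sum(rand(10^6)), 1:ncpu)
    rmprocs(workers())               # tear down and restart workers every iteration
end
```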
OK, here is the example:

…

and here is the output (10GB of memory is added to the master every iteration):

…
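A hypothetical reproduction in the spirit of the description above, fetching large pmap results and watching the master's resident memory (the worker count and all sizes are assumptions; Sys.maxrss is a high-water mark, so it shows growth but not release):

```julia
# Hypothetical reproduction, not the snippet originally posted in this thread.
using Distributed
addprocs(4)                                   # worker count is an assumption

for iter in 1:20
    big = pmap(i -> rand(10^7), 1:4)          # each task fetches ~80 MB back to the master
    big = nothing                             # drop the references
    @everywhere GC.gc()                       # collect on master and workers
    println("iter $iter: maxrss = $(Sys.maxrss() ÷ 2^20) MiB")
end
```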
What's the status on this? Until we have a fix, this issue seems significant enough to warrant labeling DArrays as an experimental feature. At the very least, I think something should be mentioned about this in the docs.
@samuela DArrays no longer exist in Base.
Oh, cool beans! Should this issue be closed then? What should be used in place of DArrays now?
https://github.com/JuliaParallel/DistributedArrays.jl took over the DArray code. I'm not sure where the actual issue lies, but since it's been cross-referenced by folks who should have some idea, I'll leave this open.
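For reference, minimal usage of the successor package under its current API (a sketch; close() is, as far as I know, the package's explicit mechanism for releasing remote chunks):

```julia
# Minimal sketch against the current DistributedArrays.jl API (not the old Base DArray).
using Distributed
addprocs(4)
@everywhere using DistributedArrays

A  = rand(1000, 1000)
dA = distribute(A)        # chunks are spread across the workers
s  = sum(dA)              # reductions run over the distributed chunks
close(dA)                 # explicitly release the remote chunks
```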
After 3bbc5fc, I find that the master slowly grows to around 10GB of resident memory after 20 iterations, and then holds steady for the next 180 iterations. This 10GB is not released even after all the iterations complete.
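One way to see where the memory actually sits is to sample every process (again, Sys.maxrss is a per-process high-water mark, so it tracks growth rather than release):

```julia
using Distributed

# Sample resident-memory high-water marks on the master and every worker.
for p in procs()
    rss = remotecall_fetch(() -> Sys.maxrss(), p)
    println("process $p: maxrss = $(rss ÷ 2^20) MiB")
end
```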
It seems possible that ….
Why would that impact memory usage on the master node?
With …, the master process varies between 30% and 40% of system memory and the workers between 15% and 30%. It is no longer a leak; the loop runs to completion, but at the end the memory is not released. Does libuv or the malloc implementation cache memory buffers in anticipation of future use? I'll try to test this on OS X later in the day to see whether the behavior is limited to Linux.
libuv tries really hard not to allocate anything, but malloc and the Julia GC will hold onto some amount of memory. 10GB sounds a bit high, although I guess if there was something on every page, it would have trouble releasing the memory fully.
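As an experiment along those lines, one can force a full collection everywhere and, on glibc systems, ask the allocator to return freed pages to the OS (malloc_trim is a glibc call, not a Julia API; this is a diagnostic sketch, not a fix):

```julia
using Distributed

# Diagnostic sketch: full GC everywhere, then ask glibc's allocator to trim.
@everywhere begin
    GC.gc()                                           # full collection on this process
    Sys.islinux() && ccall(:malloc_trim, Cint, (Csize_t,), 0)
end
```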
Closed by 6b94780.
DArrays do not seem to be fully garbage collected.
Context: https://groups.google.com/d/msg/julia-users/O8-Axv7wVZE/Xe-_8LIGKhAJ