
Memory leak with function redefinition #30653

Closed · cstjean opened this issue Jan 8, 2019 · 16 comments · Fixed by #32428

Labels: GC Garbage collector

@cstjean (Contributor) commented Jan 8, 2019

This code takes less than a minute to fill the 64 GB of RAM on my Linux machine:

julia> versioninfo()
Julia Version 1.0.3
Commit 099e826241 (2018-12-18 01:34 UTC)
Platform Info:
  OS: Linux (x86_64-pc-linux-gnu)
  CPU: Intel(R) Core(TM) i7-5820K CPU @ 3.30GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-6.0.0 (ORCJIT, haswell)

julia> for i in 1:100
           @eval function foo()
               [fill(1.0, 2_000_000) for _ in 1:100]
               nothing
           end
           @time @eval foo()
           # GC.gc()   # uncomment this line to fix the issue
       end

EDIT: simplified code

@Keno (Member) commented Jan 8, 2019

Looks like julia does know about this memory because the full GC is able to reclaim it.

@cstjean (Contributor, Author) commented Jan 8, 2019

It seems that it's the function redefinition which somehow causes a... lost GC root? I'm out of my depth here, but if I take out the @evals, then there's no leak. I found this by repeatedly running similar code in an IJulia cell.

@cstjean (Contributor, Author) commented Jan 9, 2019

> Looks like julia does know about this memory because the full GC is able to reclaim it.

FWIW, if I run it for 30 iterations, then gc(), it seems to only reclaim the last foo()'s garbage.

@nalimilan nalimilan added the GC Garbage collector label Jan 29, 2019
@cstjean cstjean changed the title Memory leak with vector of tuple Memory leak with vector of vector Feb 8, 2019
@cstjean cstjean changed the title Memory leak with vector of vector Memory leak with function redefinition Feb 8, 2019
@cstjean (Contributor, Author) commented Feb 8, 2019

Here's code showing that a function hangs on to its temporaries' memory even after it has returned. I define 100 identical functions, each of which allocates a large temporary vector of vectors:

julia> memory_usage() = parse(Int, split(read(`ps -p $(getpid()) -o rss`, String))[2]) / 1024 # in MB
memory_usage (generic function with 1 method)

julia> funs = []
       for i in 1:100    
           f = Symbol(:foo, i)
           @eval function $f()
               [fill(1.0, 2_000_000) for _ in 1:100]
               nothing
           end
           @eval push!(funs, $f)
       end

Calling the first function 20 times does not increase memory usage (in MB):

julia> mem_used = [(funs[1](); memory_usage()) for j in 1:20];

julia> using UnicodePlots; lineplot(mem_used)
        ┌────────────────────────────────────────┐ 
   4000 │⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀│ 
        │⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀│ 
        │⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀│ 
        │⠀⠀⠀⠀⠀⢀⡀⠀⠀⠀⠀⢀⡠⠤⠤⠤⣀⣀⡠⠤⡄⠀⠀⢀⠤⠤⡄⠀⠀⢀⠤⠤⡄⠀⠀⢀⠤⠤⡄⠀│ 
        │⠀⠀⠀⠀⠀⢸⡇⠀⠀⢠⠔⠁⠀⠀⠀⠀⠀⠀⠀⠀⠈⠢⠔⠁⠀⠀⠈⠢⠔⠁⠀⠀⠈⠢⠔⠁⠀⠀⠈⠢│ 
        │⠀⠀⠀⠀⠀⡸⢣⠀⠀⢸⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀│ 
        │⠀⠀⠀⠀⠀⡇⢸⠀⠀⡇⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀│ 
        │⠀⠀⠀⠀⢠⠃⠈⡆⢠⠃⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀│ 
        │⠀⠀⠀⠀⢸⠀⠀⡇⢸⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀│ 
        │⠀⠀⠀⠀⡜⠀⠀⢸⡇⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀│ 
        │⠀⠀⠀⠀⡇⠀⠀⠘⠃⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀│ 
        │⠀⠀⠒⠒⠃⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀│ 
        │⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀│ 
        │⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀│ 
   1000 │⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀│ 
        └────────────────────────────────────────┘ 
        0                                       20

But calling a different function on each iteration (the jth one) does:

julia> mem_used = [(funs[j](); memory_usage()) for j in 1:20];

julia> using UnicodePlots; lineplot(mem_used)
         ┌────────────────────────────────────────┐ 
   20000 │⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀│ 
         │⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀│ 
         │⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⣠│ 
         │⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⣀⣀⠤⠔⠒⠊⠀│ 
         │⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⢀⠔⠉⠉⠉⠀⠀⠀⠀⠀⠀⠀│ 
         │⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⡠⠤⠒⠉⠉⠁⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀│ 
         │⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⣀⡠⠎⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀│ 
         │⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⡠⠔⠉⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀│ 
         │⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⣀⡠⠎⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀│ 
         │⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⣀⠔⠊⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀│ 
         │⠀⠀⠀⠀⠀⡠⠤⠒⠒⠉⠉⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀│ 
         │⠀⠀⠀⢀⠎⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀│ 
         │⠀⠀⠔⠁⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀│ 
         │⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀│ 
       0 │⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀│ 
         └────────────────────────────────────────┘ 
         0                                       20

@staticfloat (Member) commented

This is just me thinking out loud here, but could it literally just be code size? You are generating new functions every time, and those functions take a certain amount of memory, so this could just be due to the fact that you're generating new code in a loop.

@cstjean (Contributor, Author) commented Mar 21, 2019

That code creates 100 simple functions, calls 20 of them, then memory usage goes to 20 GB. That would be excessive for compiled code, no?

@staticfloat (Member) commented

......ah. I did not realize those were the units on your graphs. ;) Yes, that does seem excessive.

@Keno (Member) commented Mar 23, 2019

I spent a couple of hours looking at this, and it turns out this isn't actually a memory leak on the Julia side. The arrays get allocated using malloc, and we appropriately free them on the next GC. However, for some reason glibc refuses to release the memory to the operating system. This can be worked around manually by ccalling malloc_trim, which cuts memory use right back down to acceptable levels. We may want to consider doing that automatically on a full GC, though looking into exactly why glibc doesn't release any of the memory here might be interesting as well.

@cstjean (Contributor, Author) commented Apr 9, 2019

malloc_trim freed 450 MB out of 1020 MB in our application, after GC!

What I don't get is that these are large allocations, which should be done with mmap, according to the man page:

> M_MMAP_THRESHOLD
>     For allocations greater than or equal to the limit specified (in bytes)
>     by M_MMAP_THRESHOLD that can't be satisfied from the free list, the
>     memory-allocation functions employ mmap(2) instead of increasing the
>     program break using sbrk(2).
>     ...
>     Balancing these factors leads to a default setting of 128*1024 for the
>     M_MMAP_THRESHOLD parameter.

And thus malloc_trim should be irrelevant. Is Julia overriding this parameter?

For anyone else like me who had no clue what brk is, this page was instructive. This thread suggests that the Linux man page for malloc_trim is inaccurate (it now scans and frees the whole heap, not just the top).

@tbenst commented Apr 11, 2019

@Keno could you share the line for ccall of malloc_trim? I believe that will solve JuliaImages/Images.jl#670 as well

@Keno (Member) commented Apr 11, 2019

ccall(:malloc_trim, Cvoid, (Cint,), 0)
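
A minimal sketch of how this can be wrapped (hedged: the helper name full_gc_and_trim is not from this thread, and malloc_trim is provided by glibc, so the call is guarded to Linux and would still fail on non-glibc libcs such as musl):

julia> function full_gc_and_trim()
           GC.gc()  # run a full collection so the freed blocks reach glibc's free lists
           # ask glibc to hand unused heap pages back to the OS (glibc-only)
           Sys.islinux() && ccall(:malloc_trim, Cvoid, (Cint,), 0)
           nothing
       end
full_gc_and_trim (generic function with 1 method)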

@ViralBShah (Member) commented Apr 11, 2019

Is this something we can call when users do gc()? Or perhaps periodically?

@cstjean (Contributor, Author) commented Jun 19, 2019

Is there any chance that the proposed fix could be tried? The work-around solves our immediate issue, but having to pepper the code with malloc_trim() is less than ideal.

@ViralBShah (Member) commented

Pinging @JeffBezanson and @vtjnash here.

@JeffBezanson JeffBezanson self-assigned this Jun 26, 2019
JeffBezanson added a commit that referenced this issue Jun 26, 2019
this works around what seems to be a glibc bug
JeffBezanson added a commit that referenced this issue Jun 27, 2019
this works around what seems to be a glibc bug
JeffBezanson added a commit that referenced this issue Jun 27, 2019
this works around what seems to be a glibc bug
(cherry picked from commit f77743c)
@nh2 commented Mar 25, 2020

Hi, I found this via Hacker News.

In the last 2 years I've gained some experience with this topic from debugging the memory usage of C modules in my Haskell applications (I also found a bug in realloc() along the way).

> What I don't get is that these are large allocations, which should be done with mmap, according to the man page:

@cstjean You skipped quoting the relevant part of the man page:

> Note: Nowadays, glibc uses a dynamic mmap threshold by default. The initial
> value of the threshold is 128*1024, but when blocks larger than the current
> threshold and less than or equal to DEFAULT_MMAP_THRESHOLD_MAX are freed,
> the threshold is adjusted upward to the size of the freed block. When
> dynamic mmap thresholding is in effect, the threshold for trimming the heap
> is also dynamically adjusted to be twice the dynamic mmap threshold.
> Dynamic adjustment of the mmap threshold is disabled if any of the
> M_TRIM_THRESHOLD, M_TOP_PAD, M_MMAP_THRESHOLD, or M_MMAP_MAX parameters
> is set.

Where DEFAULT_MMAP_THRESHOLD_MAX defaults to 32 MiB on 64-bit systems:

> The lower limit for this parameter is 0. The upper limit is
> DEFAULT_MMAP_THRESHOLD_MAX: 512*1024 on 32-bit systems or
> 4*1024*1024*sizeof(long) on 64-bit systems.

That means you can easily get into a situation where allocations < 32 MiB are not served with mmap.
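
To put numbers on the original repro (a back-of-the-envelope check, not from the thread): each temporary is fill(1.0, 2_000_000), about 15.3 MiB, above the initial 128 KiB threshold but below the 32 MiB dynamic maximum, so once the dynamic threshold has grown past that size these blocks come from the sbrk heap rather than mmap:

julia> 2_000_000 * sizeof(Float64) / 2^20   # one temporary array, in MiB
15.2587890625

julia> 4 * 1024 * 1024 * sizeof(Int) / 2^20 # DEFAULT_MMAP_THRESHOLD_MAX on 64-bit, in MiB
32.0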

See also the glibc docs on the topic.


@JeffBezanson From #32428 (comment)

> what seems to be a glibc bug

That statement seems unfounded; is there anything that hints at this being a bug?

It seems that Julia is simply discovering the effects of memory fragmentation and the corresponding glibc malloc tunables.

On my deployed application I correspondingly use:

# malloc()s larger than this many bytes are served with their own mmap() that
# can be free()d individually.
# This overrides the glibc default (dynamic threshold growing up to 32MB)
# with a fixed value. Not giving a value keeps glibc's default.
# We found that the given value is best for our use case,
# reducing memory fragmentation by 8x, which is many GB for our use case.
M_MMAP_THRESHOLD=65536

which, as stated, reduces memory fragmentation by 8x for my use case.
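
For completeness, the same tunable can also be set from inside a running process via glibc's mallopt(3). A hedged sketch (the parameter id -3 for M_MMAP_THRESHOLD comes from glibc's <malloc.h>; the Julia constant name is just for this sketch, and the call only affects allocations made after it):

julia> const GLIBC_M_MMAP_THRESHOLD = -3  # parameter id from glibc's <malloc.h>; glibc/Linux-only

julia> Sys.islinux() && ccall(:mallopt, Cint, (Cint, Cint), GLIBC_M_MMAP_THRESHOLD, 65536)
1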

@Keno (Member) commented Mar 25, 2020

That may be, but frankly an allocator not using gigabytes of space that are available to it is pretty egregious no matter what the allocation pattern is.


8 participants