Deadlock in jemalloc when a signal occurs in malloc and hhvm bt_handler() tries to malloc again #4533

wjywbs · 2014-12-26T23:44:21Z

I added a fopen() in Logger::log() if the output file is closed. (I'm not sure if this is related to this issue.) This doesn't happen very often, but all threads deadlock in jemalloc.

#0  0x00007fcc372ea264 in __lll_lock_wait () from /lib64/libpthread.so.0
#1  0x00007fcc372e5508 in _L_lock_854 () from /lib64/libpthread.so.0
#2  0x00007fcc372e53d7 in pthread_mutex_lock () from /lib64/libpthread.so.0
#3  0x00007fcc3a8488dd in je_malloc_mutex_lock (mutex=0x7fcc32800bf8)
    at include/jemalloc/internal/mutex.h:77
#4  je_arena_tcache_fill_small (arena=0x7fcc328000c0, 
    tbin=tbin@entry=0x7fcc10228248, binind=binind@entry=17, 
    prof_accumbytes=prof_accumbytes@entry=0) at src/arena.c:1475
#5  0x00007fcc3a86f8cf in je_tcache_alloc_small_hard (
    tcache=tcache@entry=0x7fcc10228000, tbin=tbin@entry=0x7fcc10228248, 
    binind=binind@entry=17) at src/tcache.c:72
#6  0x00007fcc3a83e5c5 in je_tcache_alloc_small (zero=false, size=640, 
    tcache=0x7fcc10228000) at include/jemalloc/internal/tcache.h:303
#7  je_arena_malloc (try_tcache=true, zero=false, size=<optimized out>, 
    arena=0x0) at include/jemalloc/internal/arena.h:957
#8  je_imalloct (arena=0x0, try_tcache=true, size=<optimized out>)
    at include/jemalloc/internal/jemalloc_internal.h:771
#9  je_imalloc (size=<optimized out>)
    at include/jemalloc/internal/jemalloc_internal.h:780
#10 malloc (size=<optimized out>) at src/jemalloc.c:929
#11 0x00007fcc365cbecb in __fopen_internal () from /lib64/libc.so.6
#12 0x0000000001a1130b in real_fopen (
    filename=filename@entry=0x7fff4455f773 "hhvm", modes=<optimized out>)
    at bfdio.c:120
#13 0x000000000199a708 in bfd_fopen (filename=0x7fff4455f773 "hhvm", 
    target=0x0, mode=0x20f18c5 "r", fd=-1) at opncls.c:219
#14 0x00000000016af7a5 in HPHP::fill_bfd_cache(char const*, HPHP::bfd_cache*)
    ()
#15 0x00000000016b14f2 in HPHP::StackTraceBase::Addr2line(char const*, char const*, HPHP::StackTraceBase::Frame*, void*, void*, unsigned int) ()
#16 0x00000000016b21dc in HPHP::StackTraceBase::Translate(void*, HPHP::StackTraceBase::Frame*, Dl_info&, void*, void*, unsigned int) ()
#17 0x00000000016b22a5 in HPHP::StackTraceNoHeap::Translate(int, void*, int, void*, unsigned int) ()
#18 0x00000000016b2516 in HPHP::StackTraceNoHeap::printStackTrace(int) const ()
#19 0x00000000016b27a2 in HPHP::StackTraceNoHeap::log(char const*, char const*, char const*, char const*, int) const ()
#20 0x0000000000a36e72 in HPHP::bt_handler(int) ()
#21 <signal handler called>
#22 je_bitmap_set (bit=18446744073709551615, 
    binfo=0x7fcc3aa7e9d0 <je_arena_bin_info+1808>, bitmap=0x7fcbc125b010)
    at include/jemalloc/internal/bitmap.h:105
#23 je_bitmap_sfu (binfo=0x7fcc3aa7e9d0 <je_arena_bin_info+1808>, 
    bitmap=0x7fcbc125b010) at include/jemalloc/internal/bitmap.h:140
#24 arena_run_reg_alloc (bin_info=0x7fcc3aa7e9a8 <je_arena_bin_info+1768>, 
    run=0x7fcbc125b000) at src/arena.c:291
#25 je_arena_tcache_fill_small (arena=0x7fcc328000c0, 
    tbin=tbin@entry=0x7fcc10228248, binind=binind@entry=17, 
    prof_accumbytes=prof_accumbytes@entry=0) at src/arena.c:1479
#26 0x00007fcc3a86f8cf in je_tcache_alloc_small_hard (
    tcache=tcache@entry=0x7fcc10228000, tbin=tbin@entry=0x7fcc10228248, 
    binind=binind@entry=17) at src/tcache.c:72
#27 0x00007fcc3a83e5c5 in je_tcache_alloc_small (zero=false, size=640, 
    tcache=0x7fcc10228000) at include/jemalloc/internal/tcache.h:303
#28 je_arena_malloc (try_tcache=true, zero=false, size=<optimized out>, 
    arena=0x0) at include/jemalloc/internal/arena.h:957
#29 je_imalloct (arena=0x0, try_tcache=true, size=<optimized out>)
    at include/jemalloc/internal/jemalloc_internal.h:771
#30 je_imalloc (size=<optimized out>)
    at include/jemalloc/internal/jemalloc_internal.h:780
#31 malloc (size=<optimized out>) at src/jemalloc.c:929
#32 0x00007fcc365cbecb in __fopen_internal () from /lib64/libc.so.6
#33 0x00000000016c9ef2 in HPHP::Logger::Error(std::string const&) ()
#34 0x000000000127cff7 in HPHP::f_error_log(HPHP::String const&, int, HPHP::String const&, HPHP::String const&) ()

So a signal occurred in malloc() with its lock locked, and hhvm tried to print the stack trace. fill_bfd_cache() called bfd_openr(), which tried to fopen() again and malloc tried to acquire lock again.

// In hphp/util/stack-trace.cpp, fill_bfd_cache()
bfd *abfd = bfd_openr(filename, nullptr); // hard to avoid heap here!

I have several questions:

Should I add a VMRegAnchor _; in HHVM_FUNCTION(error_log)? I see that it does not use an anchor when writing to a destination file.
Should the stack trace handler preload some symbols when starting hhvm, at least hhvm itself? Or can it detect if there is a memory deadlock?
I am tracing that the main log file may be closed when ini_set('log_errors', 0) in php, or ini_set('error_log', 'file') (handled by hphp/runtime/base/request-injection-data.cpp). Should this only affect the current script, instead of closing/replacing the main log file?

Thanks.

The text was updated successfully, but these errors were encountered:

paulbiss · 2015-01-05T21:01:21Z

Looks like it's hanging while trying to generate the stacktrace, I'm pretty sure StackTraceNoHeap calls malloc so that's not surprising.

…oid deadlock. jemalloc is most likely crashed and acquired its lock if it's found in the stacktrace. Be conservative and avoid malloc again and deadlock. Part of facebook#4533 and facebook#2408.

…andler. This specifies the maximum number of seconds spent for generating a stack trace when hhvm is crashed. The default is 0 which means no timeout. This can be set to prevent from deadlocks in the backtrace handler. Part of facebook#4533 and facebook#2408.

@markw65

…andler. Summary: This specifies the maximum number of seconds spent for generating a stack trace when hhvm is crashed. The default is 0 which means no timeout. This can be set to prevent from deadlocks in the backtrace handler. Part of #4533 and #2408. Closes #4828 Reviewed By: @markw65 Differential Revision: D1844818 Pulled By: @fredemmott

wjywbs · 2015-08-26T18:21:26Z

Fixed by d43f26b.

paulbiss added the crash label Jan 5, 2015

wjywbs mentioned this issue Jan 26, 2015

Output simple stacktrace if jemalloc is found in the stacktrace to avoid deadlock. #4712

Closed

wjywbs mentioned this issue Feb 12, 2015

Add a StackTraceTimeout option to set the timeout for the backtrace handler. #4828

Closed

wjywbs closed this as completed Aug 26, 2015

wjywbs mentioned this issue Aug 28, 2015

Segmentation fault #6166

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Deadlock in jemalloc when a signal occurs in malloc and hhvm bt_handler() tries to malloc again #4533

Deadlock in jemalloc when a signal occurs in malloc and hhvm bt_handler() tries to malloc again #4533

wjywbs commented Dec 26, 2014

paulbiss commented Jan 5, 2015

wjywbs commented Aug 26, 2015

Deadlock in jemalloc when a signal occurs in malloc and hhvm bt_handler() tries to malloc again #4533

Deadlock in jemalloc when a signal occurs in malloc and hhvm bt_handler() tries to malloc again #4533

Comments

wjywbs commented Dec 26, 2014

paulbiss commented Jan 5, 2015

wjywbs commented Aug 26, 2015