Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

v6 has buffer issues on Fedora 24 #6272

Closed
targos opened this issue Apr 19, 2016 · 17 comments · Fixed by #6544
Closed

v6 has buffer issues on Fedora 24 #6272

targos opened this issue Apr 19, 2016 · 17 comments · Fixed by #6544
Labels
buffer Issues and PRs related to the buffer subsystem. memory Issues and PRs related to the memory management or memory footprint. v8 engine Issues and PRs related to the V8 dependency.

Comments

@targos
Copy link
Member

targos commented Apr 19, 2016

make test output:

/usr/bin/python tools/test.py --mode=release message parallel sequential -J
=== release test-buffer-slow ===                                               
Path: parallel/test-buffer-slow
Command: out/Release/node /home/mzasso/git/forks/node/test/parallel/test-buffer-slow.js
--- CRASHED ---
=== release test-fs-read-buffer-tostring-fail ===                              
Path: parallel/test-fs-read-buffer-tostring-fail
Command: out/Release/node /home/mzasso/git/forks/node/test/parallel/test-fs-read-buffer-tostring-fail.js
--- CRASHED ---
=== release test-fs-readfile-tostring-fail ===                          
Path: parallel/test-fs-readfile-tostring-fail
Command: out/Release/node /home/mzasso/git/forks/node/test/parallel/test-fs-readfile-tostring-fail.js
--- CRASHED ---
=== release test-tick-processor ===                                            
Path: parallel/test-tick-processor
assert.js:90
  throw new assert.AssertionError({
  ^
AssertionError: false == true
    at runTest (/home/mzasso/git/forks/node/test/parallel/test-tick-processor.js:58:3)
    at Object.<anonymous> (/home/mzasso/git/forks/node/test/parallel/test-tick-processor.js:21:1)
    at Module._compile (module.js:532:32)
    at Object.Module._extensions..js (module.js:541:10)
    at Module.load (module.js:447:32)
    at tryModuleLoad (module.js:406:12)
    at Function.Module._load (module.js:398:3)
    at Function.Module.runMain (module.js:566:10)
    at startup (node.js:159:18)
    at node.js:444:3
Command: out/Release/node /home/mzasso/git/forks/node/test/parallel/test-tick-processor.js
[00:48|% 100|+ 1072|-   4]: Done                                               
Makefile:118: recipe for target 'test' failed
make: *** [test] Error 1

test/parallel/test-buffer-slow.js segfaults at SlowBuffer(buffer.kMaxLength)
test/parallel/test-fs-read-buffer-tostring-fail.js at fs.read(fd, kStringMaxLength + 1, ...
test/parallel/test-fs-readfile-tostring-fail.js similarly at fs.readFile(file, ...)

Is there anything I can do to investigate this issue ?

@mscdex mscdex added the buffer Issues and PRs related to the buffer subsystem. label Apr 19, 2016
@mscdex
Copy link
Contributor

mscdex commented Apr 19, 2016

/cc @trevnorris

@bnoordhuis
Copy link
Member

FWIW, I run x86_64 FC24 as well and make test passes for me. How much memory does your machine have and are their ulimits in effect?

@targos
Copy link
Member Author

targos commented Apr 19, 2016

How much memory does your machine have

16 GB

are their ulimits in effect?

$ ulimit -a
-t: cpu time (seconds)              unlimited
-f: file size (blocks)              unlimited
-d: data seg size (kbytes)          unlimited
-s: stack size (kbytes)             8192
-c: core file size (blocks)         unlimited
-m: resident set size (kbytes)      unlimited
-u: processes                       62844
-n: file descriptors                4096
-l: locked-in-memory size (kbytes)  64
-v: address space (kbytes)          unlimited
-x: file locks                      unlimited
-i: pending signals                 62844
-q: bytes in POSIX msg queues       819200
-e: max nice                        0
-r: max rt priority                 0
-N 15:                              unlimited

BTW I don't need to compile it myself. It also fails with the latest RC.

@bnoordhuis
Copy link
Member

That's not too different from my setup. What do the stack traces in gdb look like? What happens with a debug build? (make -j8 -C out BUILDTYPE=Debug)

@targos
Copy link
Member Author

targos commented Apr 19, 2016

#0  0x0000000000b850e8 in v8::internal::IncrementalMarking::ActivateIncrementalWriteBarrier() ()
#1  0x0000000000b85334 in v8::internal::IncrementalMarking::StartMarking() ()
#2  0x0000000000b855bf in v8::internal::IncrementalMarking::Start(char const*) ()
#3  0x0000000000b5b82b in v8::internal::ArrayBufferTracker::RegisterNew(v8::internal::JSArrayBuffer*) ()
#4  0x0000000000c52de3 in v8::internal::JSArrayBuffer::SetupAllocatingData(v8::internal::Handle<v8::internal::JSArrayBuffer>, v8::internal::Isolate*, unsigned long, bool, v8::internal::SharedFlag) ()
#5  0x00000000008d29fb in v8::internal::Builtin_ArrayBufferConstructor_ConstructStub(int, v8::internal::Object**, v8::internal::Isolate*) ()
#6  0x00002cfb9f10959b in ?? ()
#7  0x0000000000000000 in ?? ()

It doesn't crash with a debug build.

@bnoordhuis
Copy link
Member

Is the stack trace always the same? Maybe you can add some printf statements to IncrementalMarking::ActivateIncrementalWriteBarrier() to find out exactly where in the function it's crashing?

Just to be sure, it dies with a SIGSEGV, not e.g. a SIGILL? Did you check the disassembly and the registers at the crash site?

@targos
Copy link
Member Author

targos commented Apr 19, 2016

@bnoordhuis
Copy link
Member

Does that mean lop == nullptr right before the call to SetOldSpacePageFlags? Can you add a printf to check?

@targos
Copy link
Member Author

targos commented Apr 19, 2016

Does that mean lop == nullptr right before the call to SetOldSpacePageFlags? Can you add a printf to check?

Yes

@targos
Copy link
Member Author

targos commented Apr 19, 2016

void IncrementalMarking::ActivateIncrementalWriteBarrier() {
  PrintF("ActivateIncrementalWriteBarrier\n");
  ActivateIncrementalWriteBarrier(heap_->old_space());
  ActivateIncrementalWriteBarrier(heap_->map_space());
  ActivateIncrementalWriteBarrier(heap_->code_space());
  ActivateIncrementalWriteBarrier(heap_->new_space());

  LargePage* lop = heap_->lo_space()->first_page();
  while (lop->is_valid()) {
    PrintF(lop->is_valid() ? "valid\n" : "not valid\n");
    PrintF(lop == nullptr ? "null\n" : "not null\n");
    SetOldSpacePageFlags(lop, true, is_compacting_);
    lop = lop->next_page();
  }
}
% ./node --trace-incremental-marking
> var x = new buffer.SlowBuffer(buffer.kMaxLength)
[IncrementalMarking] Start (external memory allocation limit reached.)
[IncrementalMarking] Start marking
ActivateIncrementalWriteBarrier
valid
null
[1]    1515 segmentation fault (core dumped)  ./node --trace-incremental-marking

@bnoordhuis
Copy link
Member

I think you're hitting undefined behavior in V8. The lop->is_valid() method is essentially a this != nullptr check, which is undefined behavior according to the spec because this is never allowed to be null. I speculate the compiler optimizes away the check completely at -O2 and higher.

The reason you're seeing crashes and I don't is presumably because I haven't upgraded my copy of g++ yet. For the record, here is what I'm currently building with:

$ g++ -v
Using built-in specs.
COLLECT_GCC=/usr/bin/g++
COLLECT_LTO_WRAPPER=/usr/libexec/gcc/x86_64-redhat-linux/5.3.1/lto-wrapper
Target: x86_64-redhat-linux
Configured with: ../configure --enable-bootstrap --enable-languages=c,c++,objc,obj-c++,fortran,ada,go,lto --prefix=/usr --mandir=/usr/share/man --infodir=/usr/share/info --with-bugurl=http://bugzilla.redhat.com/bugzilla --enable-shared --enable-threads=posix --enable-checking=release --enable-multilib --with-system-zlib --enable-__cxa_atexit --disable-libunwind-exceptions --enable-gnu-unique-object --enable-linker-build-id --with-linker-hash-style=gnu --enable-plugin --enable-initfini-array --disable-libgcj --with-isl --enable-libmpx --enable-gnu-indirect-function --with-tune=generic --with-arch_32=i686 --build=x86_64-redhat-linux
Thread model: posix
gcc version 5.3.1 20160406 (Red Hat 5.3.1-6) (GCC) 

@targos
Copy link
Member Author

targos commented Apr 19, 2016

Yeah I was also thinking about a difference of compilers. This is my version:

% g++ -v
Using built-in specs.
COLLECT_GCC=/usr/bin/g++
COLLECT_LTO_WRAPPER=/usr/libexec/gcc/x86_64-redhat-linux/6.0.0/lto-wrapper
Target: x86_64-redhat-linux
Configured with: ../configure --enable-bootstrap --enable-languages=c,c++,objc,obj-c++,fortran,ada,go,lto --prefix=/usr --mandir=/usr/share/man --infodir=/usr/share/info --with-bugurl=http://bugzilla.redhat.com/bugzilla --enable-shared --enable-threads=posix --enable-checking=release --enable-multilib --with-system-zlib --enable-__cxa_atexit --disable-libunwind-exceptions --enable-gnu-unique-object --enable-linker-build-id --with-linker-hash-style=gnu --enable-plugin --enable-initfini-array --disable-libgcj --with-isl --enable-libmpx --enable-gnu-indirect-function --with-tune=generic --with-arch_32=i686 --build=x86_64-redhat-linux
Thread model: posix
gcc version 6.0.0 20160406 (Red Hat 6.0.0-0.20) (GCC) 

So is this a bug on V8's side ? With the following patch, all our tests pass again:

diff --git a/deps/v8/src/heap/incremental-marking.cc b/deps/v8/src/heap/incremental-marking.cc
index ce6f6ee..ab61576 100644
--- a/deps/v8/src/heap/incremental-marking.cc
+++ b/deps/v8/src/heap/incremental-marking.cc
@@ -404,7 +404,7 @@ void IncrementalMarking::DeactivateIncrementalWriteBarrier() {
   DeactivateIncrementalWriteBarrierForSpace(heap_->new_space());

   LargePage* lop = heap_->lo_space()->first_page();
-  while (lop->is_valid()) {
+  while (lop != nullptr) {
     SetOldSpacePageFlags(lop, false, false);
     lop = lop->next_page();
   }
@@ -436,7 +436,7 @@ void IncrementalMarking::ActivateIncrementalWriteBarrier() {
   ActivateIncrementalWriteBarrier(heap_->new_space());

   LargePage* lop = heap_->lo_space()->first_page();
-  while (lop->is_valid()) {
+  while (lop != nullptr) {
     SetOldSpacePageFlags(lop, true, is_compacting_);
     lop = lop->next_page();
   }

@bnoordhuis
Copy link
Member

I think that explains it. I'd report it to V8.

@targos targos added v8 engine Issues and PRs related to the V8 dependency. memory Issues and PRs related to the memory management or memory footprint. labels Apr 19, 2016
@kzc
Copy link

kzc commented Apr 19, 2016

https://gcc.gnu.org/gcc-6/changes.html

Value range propagation now assumes that the this pointer of C++ member functions is non-null. This eliminates common null pointer checks but also breaks some non-conforming code-bases (such as Qt-5, Chromium, KDevelop). As a temporary work-around -fno-delete-null-pointer-checks can be used. Wrong code can be identified by using -fsanitize=undefined.

@targos
Copy link
Member Author

targos commented Apr 20, 2016

@bnoordhuis
Copy link
Member

bnoordhuis commented Apr 20, 2016

Enabling -fno-delete-null-pointer-checks for now is a good idea. We can float that as a patch on top of V8 to unbreak the build but it should be upstreamed at some point.

targos added a commit to targos/node that referenced this issue Apr 27, 2016
V8 erroneously does null pointer checks on `this`.
It can lead to a SIGSEGV crash if node is compiled with GCC 6.
Enable -fno-delete-null-pointer-checks to circumvent this issue.

Ref: nodejs#6272
@jeisinger
Copy link
Contributor

this was fixed in https://codereview.chromium.org/1900423002 (as part of a bigger change).

You could backport the pieces (lop->is_valid() -> LargePage::IsValid(lop))

We should not call methods on this == nullptr as this is undefined behavior, so I'd rather not disable this optimization but fix the places that incorrectly use this == nullptr.

targos added a commit to targos/node that referenced this issue May 4, 2016
V8 erroneously did null pointer checks on `this`.
It can lead to a SIGSEGV crash if node is compiled with GCC 6.
Backport relevant changes from [1] that fix this issue.

[1]: https://codereview.chromium.org/1900423002

Fixes: nodejs#6272
PR-URL: nodejs#6544
Reviewed-By: Ben Noordhuis <[email protected]>
Reviewed-By: Fedor Indutny <[email protected]>
joelostrowski pushed a commit to joelostrowski/node that referenced this issue May 4, 2016
V8 erroneously did null pointer checks on `this`.
It can lead to a SIGSEGV crash if node is compiled with GCC 6.
Backport relevant changes from [1] that fix this issue.

[1]: https://codereview.chromium.org/1900423002

Fixes: nodejs#6272
PR-URL: nodejs#6544
Reviewed-By: Ben Noordhuis <[email protected]>
Reviewed-By: Fedor Indutny <[email protected]>
Fishrock123 pushed a commit to Fishrock123/node that referenced this issue May 4, 2016
V8 erroneously did null pointer checks on `this`.
It can lead to a SIGSEGV crash if node is compiled with GCC 6.
Backport relevant changes from [1] that fix this issue.

[1]: https://codereview.chromium.org/1900423002

Fixes: nodejs#6272
PR-URL: nodejs#6544
Reviewed-By: Ben Noordhuis <[email protected]>
Reviewed-By: Fedor Indutny <[email protected]>
targos added a commit to targos/node that referenced this issue May 10, 2016
V8 erroneously did null pointer checks on `this`.
It can lead to a SIGSEGV crash if node is compiled with GCC 6.
Backport relevant changes from [1] that fix this issue.

[1]: https://codereview.chromium.org/1900423002

Fixes: nodejs#6272
MylesBorins pushed a commit that referenced this issue May 11, 2016
V8 erroneously did null pointer checks on `this`.
It can lead to a SIGSEGV crash if node is compiled with GCC 6.
Backport relevant changes from [1] that fix this issue.

[1]: https://codereview.chromium.org/1900423002

Fixes: #6272

PR-URL: #6669
Reviewed-By: Myles Borins <[email protected]>
Reviewed-By: Ben Noordhuis <[email protected]>
Reviewed-By: James M Snell <[email protected]>
MylesBorins pushed a commit that referenced this issue May 18, 2016
V8 erroneously did null pointer checks on `this`.
It can lead to a SIGSEGV crash if node is compiled with GCC 6.
Backport relevant changes from [1] that fix this issue.

[1]: https://codereview.chromium.org/1900423002

Fixes: #6272

PR-URL: #6669
Reviewed-By: Myles Borins <[email protected]>
Reviewed-By: Ben Noordhuis <[email protected]>
Reviewed-By: James M Snell <[email protected]>
gibfahn pushed a commit to ibmruntimes/node that referenced this issue May 25, 2016
V8 erroneously did null pointer checks on `this`.
It can lead to a SIGSEGV crash if node is compiled with GCC 6.
Backport relevant changes from [1] that fix this issue.

[1]: https://codereview.chromium.org/1900423002

Fixes: nodejs/node#6272

PR-URL: nodejs/node#6669
Reviewed-By: Myles Borins <[email protected]>
Reviewed-By: Ben Noordhuis <[email protected]>
Reviewed-By: James M Snell <[email protected]>
targos added a commit to targos/node that referenced this issue Jun 3, 2016
V8 erroneously did null pointer checks on `this`.
It can lead to a SIGSEGV crash if node is compiled with GCC 6.
Backport relevant changes from [1] that fix this issue.

[1]: https://codereview.chromium.org/1900423002

Fixes: nodejs#6272
PR-URL: nodejs#6544
Reviewed-By: Ben Noordhuis <[email protected]>
Reviewed-By: Fedor Indutny <[email protected]>
targos added a commit to targos/node that referenced this issue Jun 29, 2016
V8 erroneously did null pointer checks on `this`.
It can lead to a SIGSEGV crash if node is compiled with GCC 6.
Backport relevant changes from [1] that fix this issue.

[1]: https://codereview.chromium.org/1900423002

Fixes: nodejs#6272
PR-URL: nodejs#6544
Reviewed-By: Ben Noordhuis <[email protected]>
Reviewed-By: Fedor Indutny <[email protected]>
ofrobots pushed a commit to ofrobots/node that referenced this issue Aug 25, 2016
V8 erroneously did null pointer checks on `this`.
It can lead to a SIGSEGV crash if node is compiled with GCC 6.
Backport relevant changes from [1] that fix this issue.

[1]: https://codereview.chromium.org/1900423002

Fixes: nodejs#6272
PR-URL: nodejs#6544
Reviewed-By: Ben Noordhuis <[email protected]>
Reviewed-By: Fedor Indutny <[email protected]>
This was referenced Aug 29, 2016
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
buffer Issues and PRs related to the buffer subsystem. memory Issues and PRs related to the memory management or memory footprint. v8 engine Issues and PRs related to the V8 dependency.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants