Some test failures with GDB 10.1 #2740

Closed
bernhardu opened this issue Nov 15, 2020 · 6 comments

@bernhardu
Contributor

GDB 10.1 appeared a few days ago in Debian testing.
Some tests fail because the output of the show architecture command changed.
The first two changes below should take care of it.

There is another issue with the mmap_replace_most_mappings test. I could only track it down to a mapping that seems to point to /usr/lib/x86_64-linux-gnu/ld-2.31.so, but for some reason the name is not correctly retrieved.
Installing just GDB 9.2 makes the test succeed with the same build directory.
Returning early in this case made this test succeed too, but I don't know if that would be right.

    gdb_9.2-1_amd64.deb
    (rr) show architecture
    The target architecture is set automatically (currently i386:x86-64)
    
    gdb_10.1-1_amd64.deb
    (rr) show architecture
    The target architecture is set to "auto" (currently "i386:x86-64").
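
The two output formats above can be checked against the pattern from the util.py change in the diff that follows. This is a minimal standalone sketch: the helper name parse_arch and the module-level ARCH_RE are hypothetical and not part of rr's test utilities, which go through expect_gdb instead.

```python
import re

# Pattern copied from the proposed change to src/test/util.py.
# Group 1 distinguishes the old and new wording; group 2 is the architecture.
ARCH_RE = re.compile(
    r'The target architecture is set (automatically|to "auto") '
    r'\(currently "?([0-9a-z:-]+)"?\)\.?'
)


def parse_arch(line):
    """Return the architecture name from a 'show architecture' line, or None.

    Hypothetical helper for illustration only.
    """
    m = ARCH_RE.search(line)
    return m.group(2) if m else None


# The two sample lines quoted in this issue.
gdb_9_2 = 'The target architecture is set automatically (currently i386:x86-64)'
gdb_10_1 = 'The target architecture is set to "auto" (currently "i386:x86-64").'

for line in (gdb_9_2, gdb_10_1):
    print(parse_arch(line))
```

Because the alternation adds a capture group in front, the architecture moves from group 1 to group 2, which is why the patch also changes the group index.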
diff --git a/src/test/util.py b/src/test/util.py
index 3d1c44bd..d7891265 100644
--- a/src/test/util.py
+++ b/src/test/util.py
@@ -89,9 +89,9 @@ def expect(prog, what):
 
 def get_exe_arch():
     send_gdb('show architecture')
-    expect_gdb('The target architecture is set automatically \\(currently ([0-9a-z:-]+)\\)')
+    expect_gdb(r'The target architecture is set (automatically|to "auto") \(currently "?([0-9a-z:-]+)"?\)\.?')
     global gdb_rr
-    return gdb_rr.match.group(1)
+    return gdb_rr.match.group(2)
 
 def get_rr_cmd():
     '''Return the command that should be used to invoke rr, as the tuple
diff --git a/src/test/x86/fxregs.py b/src/test/x86/fxregs.py
index 915bb928..775d23c0 100644
--- a/src/test/x86/fxregs.py
+++ b/src/test/x86/fxregs.py
@@ -18,7 +18,7 @@ for i in range(8):
     expect_gdb(' = %d'%(i + 10))
 
 send_gdb('show architecture')
-have_64 = 0 == expect_list([re.compile('i386:x86-64\)'), re.compile('i386\)')])
+have_64 = 0 == expect_list([re.compile('i386:x86-64"?\)'), re.compile('i386"?\)')])
 
 if have_64:
     for i in range(8,16):
diff --git a/src/GdbServer.cc b/src/GdbServer.cc
index d1896f3b..772a1dd8 100644
--- a/src/GdbServer.cc
+++ b/src/GdbServer.cc
@@ -1908,7 +1908,14 @@ static remote_ptr<void> base_addr_from_rendezvous(Task* t, string fname)
     return nullptr;
   }
   string ld_path = t->vm()->mapping_of(interpreter_base).map.fsname();
+  if (ld_path.length() == 0) {
+    return nullptr;
+    //FATAL() << "Failed to retrieve interpreter name with interpreter_base=" << interpreter_base;
+  }
   ScopedFd ld(ld_path.c_str(), O_RDONLY);
+  if (ld < 0) {
+    FATAL() << "Open failed: " << ld_path;
+  }
   ElfFileReader reader(ld);
   auto syms = reader.read_symbols(".dynsym", ".dynstr");
   static const char r_debug[] = "_r_debug";
904: Thread 1 (Thread 0x7fa2639cdf80 (LWP 1112536) "rr"):
904: #0  __GI___clock_nanosleep (clock_id=clock_id@entry=0, flags=flags@entry=0, req=req@entry=0x7ffe3778c0d0, rem=rem@entry=0x7ffe3778c0d0) at ../sysdeps/unix/sysv/linux/clock_nanosleep.c:79
904: #1  0x00007fa263a9a3a3 in __GI___nanosleep (requested_time=requested_time@entry=0x7ffe3778c0d0, remaining=remaining@entry=0x7ffe3778c0d0) at nanosleep.c:27
904: #2  0x00007fa263a9a2da in __sleep (seconds=0) at ../sysdeps/posix/sleep.c:55
904: #3  0x0000562805a5126f in rr::notifying_abort () at .../rr/src/util.cc:1446
904: #4  0x00005628058d89de in rr::FatalOstream::~FatalOstream (this=0x7ffe3778c296, __in_chrg=<optimized out>) at .../rr/src/log.cc:360
904: #5  0x00005628058a96ab in rr::base_addr_from_rendezvous (t=0x5628060bc430, fname="/tmp/rr-test-mmap_replace_most_mappings-8OhP9pBy5/mmap_replace_most_mappings-8OhP9pBy5-0/mmap_hardlink_3_mmap_replace_most_mappings-8OhP9pBy5") at .../rr/src/GdbServer.cc:1914
904: #6  0x00005628058aa1e2 in rr::GdbServer::open_file (this=0x7ffe3778ebd0, session=..., continue_task=0x5628060bc430, file_name="/tmp/rr-test-mmap_replace_most_mappings-8OhP9pBy5/mmap_replace_most_mappings-8OhP9pBy5-0/mmap_hardlink_3_mmap_replace_most_mappings-8OhP9pBy5") at .../rr/src/GdbServer.cc:2009
904: #7  0x00005628058a1095 in rr::GdbServer::dispatch_debugger_request (this=0x7ffe3778ebd0, session=..., req=..., state=rr::GdbServer::REPORT_NORMAL) at .../rr/src/GdbServer.cc:420
904: #8  0x00005628058a50f5 in rr::GdbServer::process_debugger_requests (this=0x7ffe3778ebd0, state=rr::GdbServer::REPORT_NORMAL) at .../rr/src/GdbServer.cc:1116
904: #9  0x00005628058a5afe in rr::GdbServer::debug_one_step (this=0x7ffe3778ebd0, last_resume_request=...) at .../rr/src/GdbServer.cc:1219
904: #10 0x00005628058a7e83 in rr::GdbServer::serve_replay (this=0x7ffe3778ebd0, flags=...) at .../rr/src/GdbServer.cc:1689
904: #11 0x00005628059a7c62 in rr::replay (trace_dir="", flags=...) at .../rr/src/ReplayCommand.cc:495
904: #12 0x00005628059a85da in rr::ReplayCommand::run (this=0x562805c497a0 <rr::ReplayCommand::singleton>, args=std::vector of length 0, capacity 8) at .../rr/src/ReplayCommand.cc:616
904: #13 0x0000562805a6b551 in main (argc=9, argv=0x7ffe3778f698) at .../rr/src/main.cc:249
@rocallahan
Collaborator

> The first two changes below should take care of it.

Great, please submit those as a PR.

> for some reason the name is not correctly retrieved.

This probably needs more investigation. Can you show us the output of rr dump -b -m -p for the trace? It looks like the ld mapping was perhaps not a mapped file; maybe it was copied into the trace for some reason...

@bernhardu
Contributor Author

The attached file contains the dump, the output of the preceding test run, and some source changes that show the commands in the bash script and print the output of /proc/$pid/maps for the processes in test-monitor.

904: [FATAL /home/bernhard/data/entwicklung/2020/rr/2020-11-15/rr/src/GdbServer.cc:1912:base_addr_from_rendezvous() errno: EIO] Failed to retrieve interpreter name with interpreter_base=0x7f1f8c18a000

And this is the last occurrence of 0x7f1f8c18a000 in the dump file:

{
  real_time:101377.081015 global_time:253, event:`SYSCALL: mmap' (state:EXITING_SYSCALL) tid:4045101, ticks:94714
rax:0x7f1f8c18a000 rbx:0x0 rcx:0xffffffffffffffff rdx:0x0 rsi:0x1000 rdi:0x7f1f8c18a000 rbp:0x7ffd8e7e6100 rsp:0x7ffd8e7e4178 r8:0xffffffffffffffff r9:0x0 r10:0x32 r11:0x246 r12:0x5594a301a0c0 r13:0x0 r14:0x0 r15:0x0 rip:0x5594a301a4ac eflags:0x246 cs:0x33 ss:0x2b ds:0x0 es:0x0 fs:0x0 gs:0x0 orig_rax:0x9 fs_base:0x7f1f8bf81b80 gs_base:0x0
  { map_file:"<ZERO>", addr:0x7f1f8c18a000, length:0x1000, prot_flags:"---p", file_offset:0x0, device:0, inode:0, data_file:"", data_offset:0x0, file_size:0x1000 }
}

rr-ctest_debian_x86_64_mmap_replace_most_mappings_2020-11-16_18-27-16.tar.gz

@rocallahan
Collaborator

According to this the tracee wipes out the mapping of the first page of the interpreter with an inaccessible guard page. This causes rr to be unable to determine the interpreter file path.

We probably should capture the ld_path and interpreter_base when the address space is created during execve, and store them alongside the saved_auxv, so we don't have to look them up here and get confused by later mapping changes. Can you implement that?
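
To illustrate why capturing the values at exec time avoids the problem, here is a toy Python model. It is not rr code: the class AddressSpace and its methods are hypothetical, and a mapping is reduced to a dictionary from page address to backing file name. The address and library path are the ones reported in this issue.

```python
class AddressSpace:
    """Toy model of a tracee address space (hypothetical, not rr's class)."""

    def __init__(self):
        self.mappings = {}            # page start address -> backing file name
        self.interpreter_base = None  # captured at exec time
        self.interpreter_name = None  # captured at exec time

    def map(self, addr, fsname):
        # A new mapping at the same address replaces the old one.
        self.mappings[addr] = fsname

    def on_exec(self, interpreter_base):
        # Capture both values while the interpreter mapping is still intact.
        self.interpreter_base = interpreter_base
        self.interpreter_name = self.mappings.get(interpreter_base, "")

    def lookup_name_late(self):
        # Models the failing path: consult whatever is mapped there now.
        return self.mappings.get(self.interpreter_base, "")


vm = AddressSpace()
vm.map(0x7f1f8c18a000, "/usr/lib/x86_64-linux-gnu/ld-2.31.so")
vm.on_exec(0x7f1f8c18a000)

# The tracee later replaces the first interpreter page with an
# anonymous, inaccessible guard page, which has no file name.
vm.map(0x7f1f8c18a000, "")

print(repr(vm.lookup_name_late()))  # late lookup sees the guard page
print(repr(vm.interpreter_name))    # value captured at exec survives
```

In this model the late lookup yields an empty name once the page is replaced, while the name stored at exec time is unaffected by later mapping changes.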

bernhardu added a commit to bernhardu/rr that referenced this issue Nov 18, 2020
This aids compatibility with GDB 10.1.
Related issue rr-debugger#2740.

Is signalling of "found" needed, or is interpreter_base != nullptr enough?
@bernhardu
Contributor Author

Hello @rocallahan, a first attempt to implement it is 498a016. It saves these values at the same point where the auxv is stored.

@rocallahan
Collaborator

That looks good to me, please submit it.

bernhardu added a commit to bernhardu/rr that referenced this issue Nov 19, 2020
This aids compatibility with GDB 10.1.
Related issue rr-debugger#2740.
bernhardu added a commit to bernhardu/rr that referenced this issue Nov 20, 2020
Retrieving the interpreter_base fails if done after
interpreter gets invalidated.
Related to rr-debugger#2740
rocallahan pushed a commit that referenced this issue Nov 21, 2020
This aids compatibility with GDB 10.1.
Related issue #2740.
rocallahan pushed a commit that referenced this issue Nov 21, 2020
Retrieving the interpreter_base fails if done after
interpreter gets invalidated.
Related to #2740
@bernhardu
Contributor Author

Hello, thanks for merging.
I just finished the test runs and all tests passed (a few only on the second run) on real x86_64 hardware (Zen) and in an i386 VM on Intel, with faaf9fe on current Debian testing.
So I assume it is OK to close this issue now.

bkin pushed a commit to bkin/rr that referenced this issue May 20, 2021
This aids compatibility with GDB 10.1.
Related issue rr-debugger#2740.
bkin pushed a commit to bkin/rr that referenced this issue May 20, 2021
Retrieving the interpreter_base fails if done after
interpreter gets invalidated.
Related to rr-debugger#2740