gdb: attach to a process when the executable has been deleted

Bug PR gdb/28313 describes attaching to a process when the executable has been deleted. The bug is for S390 and describes how a user sees a message 'PC not saved'. On x86-64 (GNU/Linux) I don't see a 'PC not saved' message, but instead I see this: (gdb) attach 901877 Attaching to process 901877 No executable file now. warning: Could not load vsyscall page because no executable was specified 0x00007fa9d9c121e7 in ?? () (gdb) bt #0 0x00007fa9d9c121e7 in ?? () #1 0x00007fa9d9c1211e in ?? () #2 0x0000000000000007 in ?? () #3 0x000000002dc8b18d in ?? () #4 0x0000000000000000 in ?? () (gdb) Notice that the addresses in the backtrace don't seem right, quickly heading to 0x7 and finally ending at 0x0. What's going on, in both the s390 case and the x86-64 case is that the architecture's prologue scanner is going wrong and causing the stack unwinding to fail. The prologue scanner goes wrong because GDB has no unwind information. And GDB has no unwind information because, of course, the executable has been deleted. Notice in the example session above we get this line in the output: No executable file now. which indicates that GDB failed to find an executable to debug. For GNU/Linux when GDB tries to find an executable for a given pid we end up calling linux_proc_pid_to_exec_file in gdb/nat/linux-procfs.c. Within this function we call `readlink` on /proc/PID/exe to find the path of the actual executable. If the `readlink` call fails then we already fallback on using /proc/PID/exe as the path to the executable to debug. However, when the executable has been deleted the `readlink` call doesn't fail, but the path that is returned points to a non-existent file. I propose that we add an `access` call to linux_proc_pid_to_exec_file to check that the target file exists and can be read. If the target can't be read then we should fall back to /proc/PID/exe (assuming that /proc/PID/exe can be read). Now on x86-64 the output looks like this: (gdb) attach 901877 Attaching to process 901877 Reading symbols from /proc/901877/exe... Reading symbols from /lib64/libc.so.6... (No debugging symbols found in /lib64/libc.so.6) Reading symbols from /lib64/ld-linux-x86-64.so.2... (No debugging symbols found in /lib64/ld-linux-x86-64.so.2) 0x00007fa9d9c121e7 in nanosleep () from /lib64/libc.so.6 (gdb) bt #0 0x00007fa9d9c121e7 in nanosleep () from /lib64/libc.so.6 #1 0x00007fa9d9c1211e in sleep () from /lib64/libc.so.6 #2 0x000000000040117e in spin_forever () at attach-test.c:17 #3 0x0000000000401198 in main () at attach-test.c:24 (gdb) which is much better. I've also tagged the bug PR gdb/29782 which concerns the test gdb.server/connect-with-no-symbol-file.exp. After making this change, when running gdb.server/connect-with-no-symbol-file.exp GDB would now pick up the /proc/PID/exe file as the executable in some cases. As GDB is not restarted for the multiple iterations of this test GDB (or rather BFD) would given a warning/error like: (gdb) PASS: gdb.server/connect-with-no-symbol-file.exp: sysroot=target:: action=permission: setup: disconnect set sysroot target: BFD: reopening /proc/3283001/exe: No such file or directory (gdb) FAIL: gdb.server/connect-with-no-symbol-file.exp: sysroot=target:: action=permission: setup: adjust sysroot What's happening is that an executable found for an earlier iteration of the test is still registered for the inferior when we are setting up for a second iteration of the test. When the sysroot changes, if there's an executable registered GDB tries to reopen it, but in this case the file has disappeared (the previous inferior has exited by this point). I did think about maybe, when the executable is /proc/PID/exe, we should auto-delete the file from the inferior. But in the end I thought this was a bad idea. Not only would this require a lot of special code in GDB just to support this edge case: we'd need to track if the exe file name came from /proc and should be auto-deleted, or we'd need target specific code to check if a path should be auto-deleted..... ... in addition, we'd still want to warn the user when we auto-deleted the file from the inferior, otherwise they might be surprised to find their inferior suddenly has no executable attached, so we wouldn't actually reduce the number of warnings the user sees. So in the end I figured that the best solution is to just update the test to avoid the warning. This is easily done by manually removing the executable from the inferior once each iteration of the test has completed. Now, in bug PR gdb/29782 GDB is clearly managing to pick up an executable from the NFS cache somehow. I guess what's happening is that when the original file is deleted /proc/PID/exe is actually pointing to a file in the NFS cache which is only deleted at some later point, and so when GDB starts up we do manage to associate a file with the inferior, this results in the same message being emitted from BFD as I was seeing. The fix included in this commit should also fix that bug. One final note: On x86-64 GNU/Linux, the gdb.server/connect-with-no-symbol-file.exp test will produce 2 core files. This is due to a bug in gdbserver that is nothing to do with this test. These core files are created before and after this commit. I am working on a fix for the gdbserver issue, but will post that separately. Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=28313 Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=29782 Approved-By: Tom Tromey <[email protected]>
tromey · Feb 2, 2024 · e58beed · e58beed
1 parent 46bd909
commit e58beed
Show file tree

Hide file tree

Showing 4 changed files with 99 additions and 0 deletions.
diff --git a/gdb/nat/linux-procfs.c b/gdb/nat/linux-procfs.c
@@ -352,6 +352,11 @@ linux_proc_pid_to_exec_file (int pid)
   else
     buf[len] = '\0';
 
+  /* Use /proc/PID/exe if the actual file can't be read, but /proc/PID/exe
+     can be.  */
+  if (access (buf, R_OK) != 0 && access (name, R_OK) == 0)
+    strcpy (buf, name);
+
   return buf;
 }
 

diff --git a/gdb/testsuite/gdb.base/attach-deleted-exec.c b/gdb/testsuite/gdb.base/attach-deleted-exec.c
@@ -0,0 +1,29 @@
+/* This testcase is part of GDB, the GNU debugger.
+
+   Copyright 2024 Free Software Foundation, Inc.
+
+   This program is free software; you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3 of the License, or
+   (at your option) any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program.  If not, see <http://www.gnu.org/licenses/>.  */
+
+#include <unistd.h>
+
+int
+main ()
+{
+  alarm (60);
+
+  while (1)
+    usleep (100000);
+
+  return 0;
+}
diff --git a/gdb/testsuite/gdb.base/attach-deleted-exec.exp b/gdb/testsuite/gdb.base/attach-deleted-exec.exp
@@ -0,0 +1,53 @@
+# Copyright (C) 2024 Free Software Foundation, Inc.
+#
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 3 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see <http://www.gnu.org/licenses/>.
+
+# Attach to a process, the executable for which has been deleted.  On
+# GNU/Linux GDB will spot the missing executable and fallback to use
+# /proc/PID/exe instead.
+
+require can_spawn_for_attach
+require {istarget *-linux*}
+
+standard_testfile
+
+if { [build_executable "failed to prepare" $testfile $srcfile] } {
+    return -1
+}
+
+set test_spawn_id [spawn_wait_for_attach $binfile]
+set testpid [spawn_id_get_pid $test_spawn_id]
+
+# Move the executable rather than deleting it.  This just to aid with
+# debugging if someone needs to reproduce this test.
+set binfile_moved ${binfile}_moved
+
+# Don't move BINFILE as the kernel will just assign a new name to the
+# same inode; and the /proc/PID/exe link will continue to point to the
+# renamed inode.
+remote_exec host "cp $binfile $binfile_moved"
+remote_exec host "rm $binfile"
+
+# Don't pass the executable when GDB starts.  Instead rely on GDB
+# finding the executable from the PID we attach too.
+clean_restart
+
+# Attach.  GDB should spot that the executable is gone and fallback to
+# use /proc/PID/exe.
+gdb_test "attach $testpid" \
+    "Attaching to process $decimal\r\nReading symbols from /proc/${testpid}/exe\\.\\.\\..*" \
+    "attach to process with deleted executable"
+
+# Cleanup.
+kill_wait_spawned_process $test_spawn_id
diff --git a/gdb/testsuite/gdb.server/connect-with-no-symbol-file.exp b/gdb/testsuite/gdb.server/connect-with-no-symbol-file.exp
@@ -89,6 +89,18 @@ proc connect_no_symbol_file { sysroot action } {
 	}
     }
     gdb_assert $ok "connection to GDBserver succeeded"
+
+    # GDB will register /proc/PID/exe as the executable for some of
+    # these tests.  Once the test has finished the inferior will still
+    # have /proc/PID/exe registered as its executable even though that
+    # file no longer exists (most likely).  GDB will then complain
+    # about the inferior's executable having disappeared.  Silence
+    # these warnings by removing any registered file from the
+    # executable.
+    gdb_test "with confirm off -- file" \
+	[multi_line \
+	     "No executable file now\\." \
+	     "No symbol file now\\."]
 }
 
 # Make sure we have the original symbol file in a safe place to copy from.