write more tests #1

tromey · 2016-04-14T15:59:31Z

We need more test coverage of the rust additions:

test rust_operator_check; this is a bit tricky, involving a disappearing objfile and a display IIRC
test linespecs
some tests are commented out; if the bugs can be fixed, uncomment these
test lexing of "23.f()"
test some error cases
test convenience functions
build with coverage enabled and make sure we have good coverage from the test suite

The text was updated successfully, but these errors were encountered:

tromey · 2016-04-26T22:56:58Z

Not bothering with operator_check.

Nowadays, GDB can't unwind successfully from epilogue on arm, (gdb) bt #0 0x76ff65a2 in shr1 () from /home/yao/Source/gnu/build/gdb/testsuite/gdb.reverse/shr1.sl #1 0x0000869e in main () at /home/yao/Source/gnu/build/gdb/testsuite/../../../binutils-gdb/gdb/testsuite/gdb.reverse/solib-reverse.c:34 Backtrace stopped: previous frame inner to this frame (corrupt stack?) (gdb) disassemble shr1 Dump of assembler code for function shr1: .... 0x76ff659a <+10>: adds r7, #12 0x76ff659c <+12>: mov sp, r7 0x76ff659e <+14>: ldr.w r7, [sp], #4 0x76ff65a2 <+18>: bx lr End of assembler dump. in this case, prologue unwinder is used. It analyzes the prologue and get the offsets of saved registers to SP. However, in epilogue, the SP has been restored, prologue unwinder gets the registers from the wrong address, and even the frame id is wrong. In reverse debugging, this case (program stops at the last instruction of function) happens quite frequently due to the reverse execution. There are many test fails due to missing epilogue unwinder. This adds epilogue unwinder, but the frame cache is still get by prologue unwinder except that SP is fixed up separately, because SP is restored in epilogue. This patch fixes many fails in solib-precsave.exp, and solib-reverse.exp. gdb: 2016-03-30 Yao Qi <[email protected]> * arm-tdep.c: (arm_make_epilogue_frame_cache): New function. (arm_epilogue_frame_this_id): New function. (arm_epilogue_frame_prev_register): New function. (arm_epilogue_frame_sniffer): New function. (arm_epilogue_frame_unwind): New. (arm_gdbarch_init): Append unwinder arm_epilogue_frame_unwind.

immediate_quit used to be necessary back when prompt_for_continue used blocking fread, but nowadays it uses gdb_readline_wrapper, which is implemented in terms of a nested event loop, which already knows how to react to SIGINT: #0 throw_it (reason=RETURN_QUIT, error=GDB_NO_ERROR, fmt=0x9d6d7e "Quit", ap=0x7fffffffcb88) at .../src/gdb/common/common-exceptions.c:324 #1 0x00000000007bab5d in throw_vquit (fmt=0x9d6d7e "Quit", ap=0x7fffffffcb88) at .../src/gdb/common/common-exceptions.c:366 #2 0x00000000007bac9f in throw_quit (fmt=0x9d6d7e "Quit") at .../src/gdb/common/common-exceptions.c:385 #3 0x0000000000773a2d in quit () at .../src/gdb/utils.c:1039 #4 0x000000000065d81b in async_request_quit (arg=0x0) at .../src/gdb/event-top.c:893 #5 0x000000000065c27b in invoke_async_signal_handlers () at .../src/gdb/event-loop.c:949 #6 0x000000000065aeef in gdb_do_one_event () at .../src/gdb/event-loop.c:280 #7 0x0000000000770838 in gdb_readline_wrapper (prompt=0x7fffffffcd40 "---Type <return> to continue, or q <return> to quit---") at .../src/gdb/top.c:873 The need for the QUIT in stdin_event_handler is then exposed by the gdb.base/double-prompt-target-event-error.exp test, which has: # We're now stopped in a pagination query while handling a # target event (printing where the program stopped). Quitting # the pagination should result in only one prompt being # output. send_gdb "\003p 1\n" Without that change we'd get: Continuing. ---Type <return> to continue, or q <return> to quit---PASS: gdb.base/double-prompt-target-event-error.exp: ctrlc target event: continue: continue to pagination ^CpQuit (gdb) 1 Undefined command: "1". Try "help". (gdb) PASS: gdb.base/double-prompt-target-event-error.exp: ctrlc target event: continue: first prompt ERROR: Undefined command "". UNRESOLVED: gdb.base/double-prompt-target-event-error.exp: ctrlc target event: continue: no double prompt Vs: Continuing. ---Type <return> to continue, or q <return> to quit---PASS: gdb.base/double-prompt-target-event-error.exp: ctrlc target event: continue: continue to pagination ^CQuit (gdb) p 1 $1 = 1 (gdb) PASS: gdb.base/double-prompt-target-event-error.exp: ctrlc target event: continue: first prompt PASS: gdb.base/double-prompt-target-event-error.exp: ctrlc target event: continue: no double prompt gdb/ChangeLog: 2016-04-12 Pedro Alves <[email protected]> * event-top.c (stdin_event_handler): Call QUIT; (prompt_for_continue): Don't run with immediate_quit set.

I see the following test fail in arm-linux with -marm and -fomit-frame-pointer, step callee () at /home/yao/SourceCode/gnu/gdb/git/gdb/testsuite/gdb.reverse/step-reverse.c:27 27 } /* RETURN FROM CALLEE */ (gdb) step main () at /home/yao/SourceCode/gnu/gdb/git/gdb/testsuite/gdb.reverse/step-reverse.c:58 58 callee(); /* STEP INTO THIS CALL */ (gdb) FAIL: gdb.reverse/step-precsave.exp: reverse step into fn call As we can see, the "step" has already stepped into the function callee, but in the last line. The second "step" attempts to step to function body, but it goes out of callee, which isn't expected. The program is compiled with -marm and -fomit-frame-pointer, the function callee is prologue-less, because nothing needs to be saved on stack, (gdb) disassemble callee Dump of assembler code for function callee: 0x00010680 <+0>: movw r3, #2364 ; 0x93c 0x00010684 <+4>: movt r3, #2 0x00010688 <+8>: ldr r3, [r3] 0x0001068c <+12>: add r2, r3, #1 0x00010690 <+16>: movw r3, #2364 ; 0x93c 0x00010694 <+20>: movt r3, #2 0x00010698 <+24>: str r2, [r3] 0x0001069c <+28>: mov r3, #0 0x000106a0 <+32>: mov r0, r3 0x000106a4 <+36>: bx lr program stops at the 0x106a0 (passed the epilogue) after the first "step". When second "step" is executed, the stepping range is [0x10680-0x106a0], which starts from the first instruction of function callee (because it doesn't have prologue). infrun: resume (step=1, signal=GDB_SIGNAL_0), trap_expected=0, current thread [LWP 2461] at 0x1069c^M infrun: prepare_to_wait^M infrun: target_wait (-1.0.0, status) =^M infrun: 2461.2461.0 [LWP 2461],^M infrun: status->kind = stopped, signal = GDB_SIGNAL_TRAP^M infrun: TARGET_WAITKIND_STOPPED^M infrun: stop_pc = 0x10698^M infrun: stepping inside range [0x10680-0x106a0] When program goes out of the range, it stops at the caller of callee, and test fails. IOW, if function callee has prologue, the stepping range won't start from the first instruction of the function, and program stops at the prologue and test passes. IMO, GDB does nothing wrong, but test shouldn't expect the program stops in callee after the second "step". I decide to fix test rather than GDB. In this patch, I change to test to do one "step", and check the program is still in callee, then, do multiple "step" until program goes out of the callee. gdb/testsuite: 2016-04-22 Yao Qi <[email protected]> * gdb.reverse/step-precsave.exp: Do one step and test program stops in "callee" and do multiple steps until program goes out of "callee". * gdb.reverse/step-reverse.exp: Likewise.

Nowadays, read_memory may throw NOT_AVAILABLE_ERROR (it is done by patch http://sourceware.org/ml/gdb-patches/2013-08/msg00625.html) however, read_stack and read_code still throws MEMORY_ERROR only. This causes PR 19947, that is prologue unwinder is unable unwind because code memory isn't available, but MEMORY_ERROR is thrown, while unwinder catches NOT_AVAILABLE_ERROR. #0 memory_error (err=err@entry=TARGET_XFER_E_IO, memaddr=memaddr@entry=140737349781158) at /home/yao/SourceCode/gnu/gdb/git/gdb/corefile.c:217 tromey#1 0x000000000065f5ba in read_code (memaddr=memaddr@entry=140737349781158, myaddr=myaddr@entry=0x7fffffffd7b0 "\340\023<\001", len=len@entry=1) at /home/yao/SourceCode/gnu/gdb/git/gdb/corefile.c:288 tromey#2 0x000000000065f7b5 in read_code_unsigned_integer (memaddr=memaddr@entry=140737349781158, len=len@entry=1, byte_order=byte_order@entry=BFD_ENDIAN_LITTLE) at /home/yao/SourceCode/gnu/gdb/git/gdb/corefile.c:363 tromey#3 0x00000000004717e0 in amd64_analyze_prologue (gdbarch=gdbarch@entry=0x13c13e0, pc=140737349781158, current_pc=140737349781165, cache=cache@entry=0xda0cb0) at /home/yao/SourceCode/gnu/gdb/git/gdb/amd64-tdep.c:2267 tromey#4 0x0000000000471f6d in amd64_frame_cache_1 (cache=0xda0cb0, this_frame=0xda0bf0) at /home/yao/SourceCode/gnu/gdb/git/gdb/amd64-tdep.c:2437 tromey#5 amd64_frame_cache (this_frame=0xda0bf0, this_cache=<optimised out>) at /home/yao/SourceCode/gnu/gdb/git/gdb/amd64-tdep.c:2508 tromey#6 0x000000000047214d in amd64_frame_this_id (this_frame=<optimised out>, this_cache=<optimised out>, this_id=0xda0c50) at /home/yao/SourceCode/gnu/gdb/git/gdb/amd64-tdep.c:2541 tromey#7 0x00000000006b94c4 in compute_frame_id (fi=0xda0bf0) at /home/yao/SourceCode/gnu/gdb/git/gdb/frame.c:481 tromey#8 get_prev_frame_if_no_cycle (this_frame=this_frame@entry=0xda0b20) at /home/yao/SourceCode/gnu/gdb/git/gdb/frame.c:1809 tromey#9 0x00000000006bb6c9 in get_prev_frame_always_1 (this_frame=0xda0b20) at /home/yao/SourceCode/gnu/gdb/git/gdb/frame.c:1983 tromey#10 get_prev_frame_always (this_frame=this_frame@entry=0xda0b20) at /home/yao/SourceCode/gnu/gdb/git/gdb/frame.c:1999 tromey#11 0x00000000006bbe11 in get_prev_frame (this_frame=this_frame@entry=0xda0b20) at /home/yao/SourceCode/gnu/gdb/git/gdb/frame.c:2241 tromey#12 0x00000000006bc13c in unwind_to_current_frame (ui_out=<optimised out>, args=args@entry=0xda0b20) at /home/yao/SourceCode/gnu/gdb/git/gdb/frame.c:1485 The fix is to let read_stack and read_code throw NOT_AVAILABLE_ERROR too, in order to align with read_memory. gdb: 2016-05-04 Yao Qi <[email protected]> PR gdb/19947 * corefile.c (read_memory): Rename it to ... (read_memory_object): ... it. Add parameter object. (read_memory): Call read_memory_object. (read_stack): Likewise. (read_code): Likewise.

Nowadays, GDB can't insert breakpoint on the return address of the exception handler on ARM M-profile, because the address is a magic one 0xfffffff9, (gdb) bt #0 CT32B1_IRQHandler () at ../src/timer.c:67 tromey#1 <signal handler called> tromey#2 main () at ../src/timer.c:127 (gdb) info frame Stack level 0, frame at 0x200ffa8: pc = 0x4ec in CT32B1_IRQHandler (../src/timer.c:67); saved pc = 0xfffffff9 called by frame at 0x200ffc8 source language c. Arglist at 0x200ffa0, args: Locals at 0x200ffa0, Previous frame's sp is 0x200ffa8 Saved registers: r7 at 0x200ffa0, lr at 0x200ffa4 (gdb) x/x 0xfffffff9 0xfffffff9: Cannot access memory at address 0xfffffff9 (gdb) finish Run till exit from #0 CT32B1_IRQHandler () at ../src/timer.c:67 Ed:15: Target error from Set break/watch: Et:96: Pseudo-address (0xFFFFFFxx) for EXC_RETURN is invalid (GDB error?) Warning: Cannot insert hardware breakpoint 0. Could not insert hardware breakpoints: You may have requested too many hardware breakpoints/watchpoints. Command aborted. even some debug probe can't set hardware breakpoint on the magic address too, (gdb) hbreak *0xfffffff9 Hardware assisted breakpoint 2 at 0xfffffff9 (gdb) c Continuing. Ed:15: Target error from Set break/watch: Et:96: Pseudo-address (0xFFFFFFxx) for EXC_RETURN is invalid (GDB error?) Warning: Cannot insert hardware breakpoint 2. Could not insert hardware breakpoints: You may have requested too many hardware breakpoints/watchpoints. Command aborted. The problem described above is quite similar to PR 8841, in which GDB can't set breakpoint on signal trampoline, which is mapped to a read-only page by kernel. The rationale of this patch is to skip "unwritable" frames when looking for caller frames in command "finish", and a new gdbarch method code_of_frame_writable is added. This patch fixes the problem on ARM cortex-m target, but it can be used to fix PR 8841 too. gdb: 2016-05-10 Yao Qi <[email protected]> * arch-utils.c (default_code_of_frame_writable): New function. * arch-utils.h (default_code_of_frame_writable): Declare. * arm-tdep.c (arm_code_of_frame_writable): New function. (arm_gdbarch_init): Install gdbarch method code_of_frame_writable if the target is M-profile. * frame.c (skip_unwritable_frames): New function. * frame.h (skip_unwritable_frames): Declare. * gdbarch.sh (code_of_frame_writable): New. * gdbarch.c, gdbarch.h: Re-generated. * infcmd.c (finish_command): Call skip_unwritable_frames.

As reported in PR 19998, after type ctrl-c, GDB hang there and does not send interrupt. It causes a fail in gdb.base/interrupt.exp. All targets support remote fileio should be affected. When we type ctrc-c, SIGINT is handled by remote_fileio_sig_set, as shown below, #0 remote_fileio_sig_set (sigint_func=0x4495d0 <remote_fileio_ctrl_c_signal_handler(int)>) at /home/yao/SourceCode/gnu/gdb/git/gdb/remote-fileio.c:325 tromey#1 0x00000000004495de in remote_fileio_ctrl_c_signal_handler (signo=<optimised out>) at /home/yao/SourceCode/gnu/gdb/git/gdb/remote-fileio.c:349 tromey#2 <signal handler called> tromey#3 0x00007ffff647ed83 in __select_nocancel () at ../sysdeps/unix/syscall-template.S:81 tromey#4 0x00000000005530ce in interruptible_select (n=10, readfds=readfds@entry=0x7fffffffd730, writefds=writefds@entry=0x0, exceptfds=exceptfds@entry=0x0, timeout=timeout@entry=0x0) at /home/yao/SourceCode/gnu/gdb/git/gdb/event-top.c:1017 tromey#5 0x000000000061ab20 in stdio_file_read (file=<optimised out>, buf=0x12d02e0 "\n\022-\001", length_buf=16383) at /home/yao/SourceCode/gnu/gdb/git/gdb/ui-file.c:577 tromey#6 0x000000000044a4dc in remote_fileio_func_read (buf=0x12c0360 "") at /home/yao/SourceCode/gnu/gdb/git/gdb/remote-fileio.c:583 tromey#7 0x0000000000449598 in do_remote_fileio_request (uiout=<optimised out>, buf_arg=buf_arg@entry=0x12c0340) at /home/yao/SourceCode/gnu/gdb/git/gdb/remote-fileio.c:1179 we don't set quit_serial_event, do { res = gdb_select (n, readfds, writefds, exceptfds, timeout); } while (res == -1 && errno == EINTR); if (res == 1 && FD_ISSET (fd, readfds)) { errno = EINTR; return -1; } return res; we can't go out of the loop above, and that is why GDB can't send interrupt. Recently, we stop throwing exception from SIGINT handler (remote_fileio_ctrl_c_signal_handler) https://sourceware.org/ml/gdb-patches/2016-03/msg00372.html, which is correct, because gdb_select is interruptible. However, in the same patch series, we add interruptible_select later as a wrapper to gdb_select, https://sourceware.org/ml/gdb-patches/2016-03/msg00375.html and it is not interruptible (because of the loop in it) unless select/poll-able file descriptors are marked. This fix in this patch is to call quit_serial_event_set, so that we can go out of the loop above, return -1 and set errno to EINTR. 2016-06-01 Yao Qi <[email protected]> PR remote/19998 * remote-fileio.c (remote_fileio_ctrl_c_signal_handler): Call quit_serial_event_set.

I see the following GDBserver internal error in two cases, gdb/gdbserver/linux-low.c:1922: A problem internal to GDBserver has been detected. unsuspend LWP 17200, suspended=-1 1. step over a breakpoint on fork/vfork syscall instruction, 2. step over a breakpoint on clone syscall instruction and child threads hits a breakpoint, the stack backtrace is #0 internal_error (file=file@entry=0x44c4c0 "gdb/gdbserver/linux-low.c", line=line@entry=1922, fmt=fmt@entry=0x44c7d0 "unsuspend LWP %ld, suspended=%d\n") at gdb/gdbserver/../common/errors.c:51 tromey#1 0x0000000000424014 in lwp_suspended_decr (lwp=<optimised out>, lwp=<optimised out>) at gdb/gdbserver/linux-low.c:1922 tromey#2 0x000000000042403a in unsuspend_one_lwp (entry=<optimised out>, except=0x66e8c0) at gdb/gdbserver/linux-low.c:2885 tromey#3 0x0000000000405f45 in find_inferior (list=<optimised out>, func=func@entry=0x424020 <unsuspend_one_lwp>, arg=arg@entry=0x66e8c0) at gdb/gdbserver/inferiors.c:243 tromey#4 0x00000000004297de in unsuspend_all_lwps (except=0x66e8c0) at gdb/gdbserver/linux-low.c:2895 tromey#5 linux_wait_1 (ptid=..., ourstatus=ourstatus@entry=0x665ec0 <last_status>, target_options=target_options@entry=0) at gdb/gdbserver/linux-low.c:3632 tromey#6 0x000000000042a764 in linux_wait (ptid=..., ourstatus=0x665ec0 <last_status>, target_options=0) at gdb/gdbserver/linux-low.c:3770 tromey#7 0x0000000000411163 in mywait (ptid=..., ourstatus=ourstatus@entry=0x665ec0 <last_status>, options=options@entry=0, connected_wait=connected_wait@entry=1) at gdb/gdbserver/target.c:214 tromey#8 0x000000000040b1f2 in resume (actions=0x66f800, num_actions=1) at gdb/gdbserver/server.c:2757 tromey#9 0x000000000040f660 in handle_v_cont (own_buf=0x66a630 "vCont;c:p45e9.-1") at gdb/gdbserver/server.c:2719 when GDBserver steps over a thread, other threads have been suspended, the "stepping" thread may create new thread, but GDBserver doesn't set it suspend count to 1. When GDBserver unsuspend threads, the child's suspend count goes to -1, and the assert is triggered. In fact, GDBserver has already taken care of suspend count of new thread when GDBserver is suspending all threads except the one GDBserver wants to step over by https://sourceware.org/ml/gdb-patches/2015-07/msg00946.html + /* If we're suspending all threads, leave this one suspended + too. */ + if (stopping_threads == STOPPING_AND_SUSPENDING_THREADS) + { + if (debug_threads) + debug_printf ("HEW: leaving child suspended\n"); + child_lwp->suspended = 1; + } but that is not enough, because new thread is still can be spawned in the thread which is being stepped over. This patch extends the condition that GDBserver set child's suspend count to one if it is suspending threads or stepping over the thread. gdb/gdbserver: 2016-03-03 Yao Qi <[email protected]> PR server/19736 * linux-low.c (handle_extended_wait): Set child suspended if event_lwp->bp_reinsert isn't zero.

Fix this GDB crash: $ gdb -ex "set architecture mips:10000" Segmentation fault (core dumped) Backtrace: Program received signal SIGSEGV, Segmentation fault. 0x0000000000495b1b in mips_gdbarch_init (info=..., arches=0x0) at /home/pedro/gdb/mygit/cxx-convertion/src/gdb/mips-tdep.c:8436 8436 if (bfd_get_flavour (info.abfd) == bfd_target_elf_flavour (top-gdb) bt #0 0x0000000000495b1b in mips_gdbarch_init (info=..., arches=0x0) at .../src/gdb/mips-tdep.c:8436 tromey#1 0x00000000007348a6 in gdbarch_find_by_info (info=...) at .../src/gdb/gdbarch.c:5155 tromey#2 0x000000000073563c in gdbarch_update_p (info=...) at .../src/gdb/arch-utils.c:522 tromey#3 0x0000000000735585 in set_architecture (ignore_args=0x0, from_tty=1, c=0x26bc870) at .../src/gdb/arch-utils.c:496 tromey#4 0x00000000005f29fd in do_sfunc (c=0x26bc870, args=0x0, from_tty=1) at .../src/gdb/cli/cli-decode.c:121 tromey#5 0x00000000005fd3f3 in do_set_command (arg=0x7fffffffdcdd "mips:10000", from_tty=1, c=0x26bc870) at .../src/gdb/cli/cli-setshow.c:455 tromey#6 0x0000000000836157 in execute_command (p=0x7fffffffdcdd "mips:10000", from_tty=1) at .../src/gdb/top.c:460 tromey#7 0x000000000071abfb in catch_command_errors (command=0x835f6b <execute_command>, arg=0x7fffffffdccc "set architecture mips:10000", from_tty=1) at .../src/gdb/main.c:368 tromey#8 0x000000000071bf4f in captured_main (data=0x7fffffffd750) at .../src/gdb/main.c:1132 tromey#9 0x0000000000716737 in catch_errors (func=0x71af44 <captured_main>, func_args=0x7fffffffd750, errstring=0x106b9a1 "", mask=RETURN_MASK_ALL) at .../src/gdb/exceptions.c:240 tromey#10 0x000000000071bfe6 in gdb_main (args=0x7fffffffd750) at .../src/gdb/main.c:1164 tromey#11 0x000000000040a6ad in main (argc=4, argv=0x7fffffffd858) at .../src/gdb/gdb.c:32 (top-gdb) We already check whether info.abfd is NULL before all other bfd_get_flavour calls in the same function. Just this one case was missing. (This was exposed by a WIP test that tries all "set architecture ARCH" values.) gdb/ChangeLog: 2016-03-07 Pedro Alves <[email protected]> * mips-tdep.c (mips_gdbarch_init): Check whether info.abfd is NULL before calling bfd_get_flavour.

This change adds support for specifying a negative repeat count to all the formats of the 'x' command to examine memory backward. A new testcase 'examine-backward' is added to cover this new feature. Here's the example output from the new feature: <format 'i'> (gdb) bt #0 Func1 (n=42, p=0x40432e "hogehoge") at main.cpp:5 #1 0x00000000004041fa in main (argc=1, argv=0x7fffffffdff8) at main.cpp:19 (gdb) x/-4i 0x4041fa 0x4041e5 <main(int, char**)+11>: mov %rsi,-0x10(%rbp) 0x4041e9 <main(int, char**)+15>: lea 0x13e(%rip),%rsi 0x4041f0 <main(int, char**)+22>: mov $0x2a,%edi 0x4041f5 <main(int, char**)+27>: callq 0x404147 <format 'x'> (gdb) x/-4xw 0x404200 0x4041f0 <main(int, char**)+22>: 0x00002abf 0xff4de800 0x76e8ffff 0xb8ffffff (gdb) x/-4 0x4041e0 <main(int, char**)+6>: 0x7d8910ec 0x758948fc 0x358d48f0 0x0000013e gdb/ChangeLog: * NEWS: Mention that GDB now supports a negative repeat count in the 'x' command. * printcmd.c (decode_format): Allow '-' in the parameter "string_ptr" to accept a negative repeat count. (find_instruction_backward): New function. (read_memory_backward): New function. (integer_is_zero): New function. (find_string_backward): New function. (do_examine): Use new functions to examine memory backward. (_initialize_printcmd): Mention that 'x' command supports a negative repeat count. gdb/doc/ChangeLog: * gdb.texinfo (Examining Memory): Document negative repeat count in the 'x' command. gdb/testsuite/ChangeLog: * gdb.base/examine-backward.c: New file. * gdb.base/examine-backward.exp: New file.

… out value With something like: struct A { int bitfield:4; } var; If 'var' ends up wholly-optimized out, printing 'var.bitfield' crashes gdb here: (top-gdb) bt #0 0x000000000058b89f in extract_unsigned_integer (addr=0x2 <error: Cannot access memory at address 0x2>, len=2, byte_order=BFD_ENDIAN_LITTLE) at /home/pedro/gdb/mygit/src/gdb/findvar.c:109 tromey#1 0x00000000005a187a in unpack_bits_as_long (field_type=0x16cff70, valaddr=0x0, bitpos=16, bitsize=12) at /home/pedro/gdb/mygit/src/gdb/value.c:3347 tromey#2 0x00000000005a1b9d in unpack_value_bitfield (dest_val=0x1b5d9d0, bitpos=16, bitsize=12, valaddr=0x0, embedded_offset=0, val=0x1b5d8d0) at /home/pedro/gdb/mygit/src/gdb/value.c:3441 tromey#3 0x00000000005a2a5f in value_fetch_lazy (val=0x1b5d9d0) at /home/pedro/gdb/mygit/src/gdb/value.c:3958 tromey#4 0x00000000005a10a7 in value_primitive_field (arg1=0x1b5d8d0, offset=0, fieldno=0, arg_type=0x16d04c0) at /home/pedro/gdb/mygit/src/gdb/value.c:3161 tromey#5 0x00000000005b01e5 in do_search_struct_field (name=0x1727c60 "bitfield", arg1=0x1b5d8d0, offset=0, type=0x16d04c0, looking_for_baseclass=0, result_ptr=0x7fffffffcaf8, [...] unpack_value_bitfield is already optimized-out/unavailable -aware: (...) VALADDR points to the contents of VAL. If the VAL's contents required to extract the bitfield from are unavailable/optimized out, DEST_VAL is correspondingly marked unavailable/optimized out. however, it is not considering the case of the value having no contents buffer at all, as can happen through allocate_optimized_out_value. gdb/ChangeLog: 2016-08-09 Pedro Alves <[email protected]> * value.c (unpack_value_bitfield): Skip unpacking if the parent has no contents buffer to begin with. gdb/testsuite/ChangeLog: 2016-08-09 Pedro Alves <[email protected]> * gdb.dwarf2/bitfield-parent-optimized-out.exp: New file.

I build GDB with -fsanitize=address, and see the error in tests, (gdb) PASS: gdb.linespec/ls-errs.exp: lang=C++: break 3 foo break -line 3 foo^M =================================================================^M ==4401==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x603000047487 at pc 0x819d8e bp 0x7fff4e4e6bb0 sp 0x7fff4e4e6ba8^M READ of size 1 at 0x603000047487 thread T0^[[1m^[[0m^M #0 0x819d8d in explicit_location_lex_one /home/yao/SourceCode/gnu/gdb/git/gdb/location.c:502^M tromey#1 0x81a185 in string_to_explicit_location(char const**, language_defn const*, int) /home/yao/SourceCode/gnu/gdb/git/gdb/location.c:556^M tromey#2 0x81ac10 in string_to_event_location(char**, language_defn const*) /home/yao/SourceCode/gnu/gdb/git/gdb/location.c:687^ the code in question is: > /* Special case: C++ operator,. */ > if (language->la_language == language_cplus > && strncmp (*inp, "operator", 8) <--- [1] > && (*inp)[9] == ',') > (*inp) += 9; > ++(*inp); The error is caused by the access to (*inp)[9] if 9 is out of its bounds. However [1] looks odd to me, because if strncmp returns true (non-zero), the following check "(*inp)[9] == ','" makes no sense any more. I suspect it was a typo in the code we meant to "strncmp () == 0". Another problem in the code above is that if *inp is "operator,", we first increment *inp by 9, and then increment it by one again, which is wrong to me. We should only increment *inp by 8 to skip "operator", and go back to the loop header to decide where we stop. gdb: 2016-08-15 Yao Qi <[email protected]> * location.c (explicit_location_lex_one): Compare the return value of strncmp with zero. Don't check (*inp)[9]. Increment *inp by 8.

If I build gdb with -fsanitize=address and run tests, I get error, malformed linespec error: unexpected colon^M (gdb) PASS: gdb.linespec/ls-errs.exp: lang=C: break : break :=================================================================^M ==3266==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x602000051451 at pc 0x2b5797a972a8 bp 0x7fffd8e0f3c0 sp 0x7fffd8e0f398^M READ of size 2 at 0x602000051451 thread T0 #0 0x2b5797a972a7 in __interceptor_strlen (/usr/lib/x86_64-linux-gnu/libasan.so.1+0x322a7)^M tromey#1 0x7bd004 in compare_filenames_for_search(char const*, char const*) /home/yao/SourceCode/gnu/gdb/git/gdb/symtab.c:316^M tromey#2 0x7bd310 in iterate_over_some_symtabs(char const*, char const*, int (*)(symtab*, void*), void*, compunit_symtab*, compunit_symtab*) /home/yao/SourceCode/gnu/gdb/git/gdb/symtab.c:411^M tromey#3 0x7bd775 in iterate_over_symtabs(char const*, int (*)(symtab*, void*), void*) /home/yao/SourceCode/gnu/gdb/git/gdb/symtab.c:481^M tromey#4 0x7bda15 in lookup_symtab(char const*) /home/yao/SourceCode/gnu/gdb/git/gdb/symtab.c:527^M tromey#5 0x7d5e2a in make_file_symbol_completion_list_1 /home/yao/SourceCode/gnu/gdb/git/gdb/symtab.c:5635^M tromey#6 0x7d61e1 in make_file_symbol_completion_list(char const*, char const*, char const*) /home/yao/SourceCode/gnu/gdb/git/gdb/symtab.c:5684^M tromey#7 0x88dc06 in linespec_location_completer /home/yao/SourceCode/gnu/gdb/git/gdb/completer.c:288 .... 0x602000051451 is located 0 bytes to the right of 1-byte region [0x602000051450,0x602000051451)^M mallocated by thread T0 here: #0 0x2b5797ab97ef in __interceptor_malloc (/usr/lib/x86_64-linux-gnu/libasan.so.1+0x547ef)^M tromey#1 0xbbfb8d in xmalloc /home/yao/SourceCode/gnu/gdb/git/gdb/common/common-utils.c:43^M tromey#2 0x88dabd in linespec_location_completer /home/yao/SourceCode/gnu/gdb/git/gdb/completer.c:273^M tromey#3 0x88e5ef in location_completer(cmd_list_element*, char const*, char const*) /home/yao/SourceCode/gnu/gdb/git/gdb/completer.c:531^M tromey#4 0x8902e7 in complete_line_internal /home/yao/SourceCode/gnu/gdb/git/gdb/completer.c:964^ The code in question is here file_to_match = (char *) xmalloc (colon - text + 1); strncpy (file_to_match, text, colon - text + 1); it is likely that file_to_match is not null-terminated. The patch is to strncpy 'colon - text' bytes and explicitly set '\0'. gdb: 2016-08-19 Yao Qi <[email protected]> * completer.c (linespec_location_completer): Make file_to_match null-terminated.

This patch fixes a problem that problem triggers if you start an inferior, e.g., with the "start" command, in a UI created with the new-ui command, and then run a foreground execution command in the main UI. Once the program stops for the latter command, typing in the main UI no longer echoes back to the user. The problem revolves around this: - gdb_has_a_terminal computes its result lazily, on first call. that is what saves gdb's initial main UI terminal state (the UI associated with stdin): our_terminal_info.ttystate = serial_get_tty_state (stdin_serial); This is the state that target_terminal_ours() restores. - In this scenario, the gdb_has_a_terminal function happens to be first ever called from within the target_terminal_init call in startup_inferior: (top-gdb) bt #0 gdb_has_a_terminal () at src/gdb/inflow.c:157 tromey#1 0x000000000079db22 in child_terminal_init_with_pgrp () at src/gdb/inflow.c:217 [...] tromey#4 0x000000000065bacb in target_terminal_init () at src/gdb/target.c:456 tromey#5 0x00000000004676d2 in startup_inferior () at src/gdb/fork-child.c:531 [...] tromey#7 0x000000000046b168 in linux_nat_create_inferior () at src/gdb/linux-nat.c:1112 [...] tromey#9 0x00000000005f20c9 in start_command (args=0x0, from_tty=1) at src/gdb/infcmd.c:657 If the command to start the inferior is issued on the main UI, then readline will have deprepped the terminal when we reach the above, and the problem doesn't appear. If however the command is issued on a non-main UI, then when we reach that gdb_has_a_terminal call, the main UI's terminal state is still set to whatever readline has sets it to in rl_prep_terminal, which happens to have echo disabled. Later, when the following synchronous execution command finishes, we'll call target_terminal_ours to restore gdb's the main UI's terminal settings, and that restores the terminal state with echo disabled... Conceptually, the fix is to move the gdb_has_a_terminal call earlier, to someplace during GDB initialization, before readline/ncurses have had a chance to change terminal settings. Turns out that "set_initial_gdb_ttystate" is exactly such a place. I say conceptually, because the fix actually inlines the gdb_has_a_terminal part that saves the terminal state in set_initial_gdb_ttystate and then simplifies gdb_has_a_terminal, since there's no point in making gdb_has_a_terminal do lazy computation. gdb/ChangeLog: 2016-08-23 Pedro Alves <[email protected]> PR gdb/20494 * inflow.c (our_terminal_info, initial_gdb_ttystate): Update comments. (enum gdb_has_a_terminal_flag_enum, gdb_has_a_terminal_flag): Delete. (set_initial_gdb_ttystate): Record our_terminal_info here too, instead of ... (gdb_has_a_terminal): ... here. Reimplement in terms of initial_gdb_ttystate. Make static. * terminal.h (gdb_has_a_terminal): Delete declaration. (set_initial_gdb_ttystate): Add comment. * top.c (show_interactive_mode): Use input_interactive_p instead of gdb_has_a_terminal. gdb/testsuite/ChangeLog: 2016-08-23 Pedro Alves <[email protected]> PR gdb/20494 * gdb.base/new-ui-echo.c: New file. * gdb.base/new-ui-echo.exp: New file.

This test case verifies that GDB will not attempt to invoke a python unwinder recursively. At the moment, the behavior exhibited by GDB looks like this: (gdb) source py-recurse-unwind.py Python script imported (gdb) b ccc Breakpoint 1 at 0x4004bd: file py-recurse-unwind.c, line 23. (gdb) run Starting program: py-recurse-unwind TestUnwinder: Recursion detected - returning early. TestUnwinder: Recursion detected - returning early. TestUnwinder: Recursion detected - returning early. TestUnwinder: Recursion detected - returning early. Breakpoint 1, ccc (arg=<unavailable>) at py-recurse-unwind.c:23 23 } (gdb) bt #-1 ccc (arg=<unavailable>) at py-recurse-unwind.c:23 Backtrace stopped: previous frame identical to this frame (corrupt stack?) [I've shortened pathnames for easier reading.] The desired / expected behavior looks like this: (gdb) source py-recurse-unwind.py Python script imported (gdb) b ccc Breakpoint 1 at 0x4004bd: file py-recurse-unwind.c, line 23. (gdb) run Starting program: py-recurse-unwind Breakpoint 1, ccc (arg=789) at py-recurse-unwind.c:23 23 } (gdb) bt #0 ccc (arg=789) at py-recurse-unwind.c:23 tromey#1 0x00000000004004d5 in bbb (arg=456) at py-recurse-unwind.c:28 tromey#2 0x00000000004004ed in aaa (arg=123) at py-recurse-unwind.c:34 tromey#3 0x00000000004004fe in main () at py-recurse-unwind.c:40 Note that GDB's problems go well beyond the fact that it invokes the unwinder recursively. In the process it messes up some internal state (the frame stash) leading to display of (only) the sentinel frame in the backtrace. gdb/testsuite/ChangeLog: * gdb.python/py-recurse-unwind.c: New file. * gdb.python/py-recurse-unwind.py: New file. * gdb.python/py-recurse-unwind.exp: New file.

…_eval This fixes the problem exercised by Kevin's test at: https://sourceware.org/ml/gdb-patches/2016-08/msg00216.html This was originally exposed by the OpenJDK Python-based unwinder. If an unwinder attempts to call parse_and_eval from within its sniffing method, GDB's unwinding machinery enters infinite recursion. However, parse_and_eval is a pretty reasonable thing to call, because Python/Scheme-based unwinders will often need to read globals out of inferior memory. The recursion happens because: - get_current_frame() is called soon after the target stops. - current_frame is NULL, and so we unwind it from the sentinel frame (which is special and has level == -1). - We reach get_prev_frame_if_no_cycle, which does cycle detection based on frame id, and thus tries to compute the frame id of the new frame. - Frame id computation requires an unwinder, so we go through all unwinder sniffers trying to see if one accepts the new frame (the current frame). - the unwinder's sniffer calls parse_and_eval(). - parse_and_eval depends on the selected frame/block, and if not set yet, the selected frame is set to the current frame. - get_current_frame () is called again. current_frame is still NULL, so ... - recurse forever. In Kevin's test at: https://sourceware.org/ml/gdb-patches/2016-08/msg00216.html gdb doesn't recurse forever simply because the Python unwinder contains code to detect and stop the recursion itself. However, GDB goes downhill from here, e.g., by showing the sentinel frame as current frame (note the -1): Breakpoint 1, ccc (arg=<unavailable>) at py-recurse-unwind.c:23 23 } (gdb) bt #-1 ccc (arg=<unavailable>) at py-recurse-unwind.c:23 Backtrace stopped: previous frame identical to this frame (corrupt stack?) That "-1" frame level comes from this: if (catch_exceptions (current_uiout, unwind_to_current_frame, sentinel_frame, RETURN_MASK_ERROR) != 0) { /* Oops! Fake a current frame? Is this useful? It has a PC of zero, for instance. */ current_frame = sentinel_frame; } which is bogus. It's never correct to set the current frame to the sentinel frame. The only reason this has survived so long is that getting here normally indicates something wrong has already happened before and we fix that. And this case is no exception -- it doesn't really matter how precisely we managed to get to that bogus code (it has to do with the the stash), because anything after recursion happens is going to be invalid. So the fix is to avoid the recursion in the first place. Observations: #1 - The recursion happens because we try to do cycle detection from within get_prev_frame_if_no_cycle. That requires computing the frame id of the frame being unwound, and that itself requires calling into the unwinders. #2 - But, the first time we're unwinding from the sentinel frame, when we reach get_prev_frame_if_no_cycle, there's no frame chain at all yet: - current_frame is NULL. - the frame stash is empty. Thus, there's really no need to do cycle detection the first time we reach get_prev_frame_if_no_cycle, when building the current frame. So we can break the recursion by making get_current_frame call a simplified version of get_prev_frame_if_no_cycle that results in setting the current_frame global _before_ computing the current frame's id. But, we can go a little bit further. As there's really no reason anymore to compute the current frame's frame id immediately, we can defer computing it to when some caller of get_current_frame might need it. This was actually how the frame id was computed for all frames before the stash-based cycle detection was added. So in a way, this patch reintroduces the lazy frame id computation, but unlike before, only for the case of the current frame, which turns out to be special. This lazyness, however, requires adjusting gdb.python/py-unwind-maint.exp, because that assumes unwinders are immediately called as side effect of some commands. I didn't see a need to preserve the behavior expected by that test (all it would take is call get_frame_id inside get_current_frame), so I adjusted the test. gdb/ChangeLog: 2016-09-05 Pedro Alves <[email protected]> PR backtrace/19927 * frame.c (get_frame_id): Compute the frame id if not computed yet. (unwind_to_current_frame): Delete. (get_current_frame): Use get_prev_frame_always_1 to get the current frame and assert that that always succeeds. (get_prev_frame_if_no_cycle): Skip cycle detection if returning the current frame. gdb/testsuite/ChangeLog: 2016-09-05 Pedro Alves <[email protected]> PR backtrace/19927 * gdb.python/py-unwind-maint.exp: Adjust tests to not expect that unwinders are immediately called as side effect of "source" or "disable unwinder" commands. * gdb.python/py-recurse-unwind.exp: Remove setup_kfail calls.

aarch64_reg_parse_32_64 is currently used to parse address registers, among other things. It returns two bits of information about the register: whether it's W rather than X, and whether it's a zero register. SVE adds addressing modes in which the base or offset can be a vector register instead of a scalar, so a choice between W and X is no longer enough. It's more convenient to pass the type of register around as a qualifier instead. As it happens, two callers of aarch64_reg_parse_32_64 already wanted the information in the form of a qualifier, so the change feels pretty natural even without SVE. Also, the function took two parameters to control whether {W}SP and (W|X)ZR should be accepted. We tend to get slightly better error messages by accepting them regardless and getting the caller to do the check, rather than potentially treating "xzr", "sp" etc. as constants. This is easier to do if the function returns the reg_entry rather than just the register number. This does create a corner case where: .equ sp, 1 ldr w0, [x0, sp] was previously an acceptable way of writing "ldr w0, [x0, #1]", but I don't think it's important to continue supporting that. We already rejected things like: .equ sp, 1 add x0, x1, sp To ensure these new error messages "win" when matching against several candidate instruction entries, we need to use the same address-parsing code for all addresses, including ADDR_SIMPLE and SIMD_ADDR_SIMPLE. The next patch also relies on this. Finally, aarcch64_check_reg_type was written in a pretty conservative way. It should always be equivalent to a single bit test. gas/ * config/tc-aarch64.c (REG_TYPE_R_Z, REG_TYPE_R_SP): New register types. (get_reg_expected_msg): Handle them and REG_TYPE_R64_SP. (aarch64_check_reg_type): Simplify. (aarch64_reg_parse_32_64): Return the reg_entry instead of the register number. Return the type as a qualifier rather than an "isreg32" boolean. Remove reject_sp, reject_rz and isregzero parameters. (parse_shifter_operand): Update call to aarch64_parse_32_64_reg. Use get_reg_expected_msg. (parse_address_main): Likewise. Use aarch64_check_reg_type. (po_int_reg_or_fail): Replace reject_sp and reject_rz parameters with a reg_type parameter. Update call to aarch64_parse_32_64_reg. Use aarch64_check_reg_type to test the result. (parse_operands): Update after the above changes. Parse ADDR_SIMPLE addresses normally before enforcing the syntax restrictions. * testsuite/gas/aarch64/diagnostic.s: Add tests for a post-index zero register and for a stack pointer index. * testsuite/gas/aarch64/diagnostic.l: Update accordingly. Also update existing diagnostic messages after the above changes. * testsuite/gas/aarch64/illegal-lse.l: Update the error message for 32-bit register bases.

In the review of the original version of this series, Richard didn't like the use of boolean parameters to parse_address_main. I think we can just get rid of them and leave the callers to check the addressing modes. As it happens, the handling of ADDR_SIMM9{,_2} already did this for relocation operators (i.e. it used parse_address_reloc and then rejected relocations). The callers are already set up to reject invalid register post-indexed addressing, so we can simply remove the accept_reg_post_index parameter without adding any more checks. This again creates a corner case where: .equ x2, 1 ldr w0, [x1], x2 was previously an acceptable way of writing "ldr w0, [x1], #1" but is now rejected. Removing the "reloc" parameter means that two cases need to check explicitly for relocation operators. ADDR_SIMM9_2 appers to be unused. I'll send a separate patch to remove it. This patch makes parse_address temporarily equivalent to parse_address_main, but later patches in the series will need to keep the distinction. gas/ * config/tc-aarch64.c (parse_address_main): Remove reloc and accept_reg_post_index parameters. Parse relocations and register post indexes unconditionally. (parse_address): Remove accept_reg_post_index parameter. Update call to parse_address_main. (parse_address_reloc): Delete. (parse_operands): Call parse_address instead of parse_address_main. Update existing callers of parse_address and make them check inst.reloc.type where appropriate. * testsuite/gas/aarch64/diagnostic.s: Add tests for relocations in ADDR_SIMPLE, SIMD_ADDR_SIMPLE, ADDR_SIMM7 and ADDR_SIMM9 addresses. Also test for invalid uses of post-index register addressing. * testsuite/gas/aarch64/diagnostic.l: Update accordingly.

This patch adds the new SVE integer immediate operands. There are three kinds: - simple signed and unsigned ranges, but with new widths and positions. - 13-bit logical immediates. These have the same form as in base AArch64, but at a different bit position. In the case of the "MOV Zn.<T>, #<limm>" alias of DUPM, the logical immediate <limm> is not allowed to be a valid DUP immediate, since DUP is preferred over DUPM for constants that both instructions can handle. - a new 9-bit arithmetic immediate, of the form "<imm8>{, LSL #8}". In some contexts the operand is signed and in others it's unsigned. As an extension, we allow shifted immediates to be written as a single integer, e.g. "#256" is equivalent to "#1, LSL #8". We also use the shiftless form as the preferred disassembly, except for the special case of "#0, LSL #8" (a redundant encoding of 0). include/ * opcode/aarch64.h (AARCH64_OPND_SIMM5): New aarch64_opnd. (AARCH64_OPND_SVE_AIMM, AARCH64_OPND_SVE_ASIMM) (AARCH64_OPND_SVE_INV_LIMM, AARCH64_OPND_SVE_LIMM) (AARCH64_OPND_SVE_LIMM_MOV, AARCH64_OPND_SVE_SHLIMM_PRED) (AARCH64_OPND_SVE_SHLIMM_UNPRED, AARCH64_OPND_SVE_SHRIMM_PRED) (AARCH64_OPND_SVE_SHRIMM_UNPRED, AARCH64_OPND_SVE_SIMM5) (AARCH64_OPND_SVE_SIMM5B, AARCH64_OPND_SVE_SIMM6) (AARCH64_OPND_SVE_SIMM8, AARCH64_OPND_SVE_UIMM3) (AARCH64_OPND_SVE_UIMM7, AARCH64_OPND_SVE_UIMM8) (AARCH64_OPND_SVE_UIMM8_53): Likewise. (aarch64_sve_dupm_mov_immediate_p): Declare. opcodes/ * aarch64-tbl.h (AARCH64_OPERANDS): Add entries for the new SVE integer immediate operands. * aarch64-opc.h (FLD_SVE_immN, FLD_SVE_imm3, FLD_SVE_imm5) (FLD_SVE_imm5b, FLD_SVE_imm7, FLD_SVE_imm8, FLD_SVE_imm9) (FLD_SVE_immr, FLD_SVE_imms, FLD_SVE_tszh): New aarch64_field_kinds. * aarch64-opc.c (fields): Add corresponding entries. (operand_general_constraint_met_p): Handle the new SVE integer immediate operands. (aarch64_print_operand): Likewise. (aarch64_sve_dupm_mov_immediate_p): New function. * aarch64-opc-2.c: Regenerate. * aarch64-asm.h (ins_inv_limm, ins_sve_aimm, ins_sve_asimm) (ins_sve_limm_mov, ins_sve_shlimm, ins_sve_shrimm): New inserters. * aarch64-asm.c (aarch64_ins_limm_1): New function, split out from... (aarch64_ins_limm): ...here. (aarch64_ins_inv_limm): New function. (aarch64_ins_sve_aimm): Likewise. (aarch64_ins_sve_asimm): Likewise. (aarch64_ins_sve_limm_mov): Likewise. (aarch64_ins_sve_shlimm): Likewise. (aarch64_ins_sve_shrimm): Likewise. * aarch64-asm-2.c: Regenerate. * aarch64-dis.h (ext_inv_limm, ext_sve_aimm, ext_sve_asimm) (ext_sve_limm_mov, ext_sve_shlimm, ext_sve_shrimm): New extractors. * aarch64-dis.c (decode_limm): New function, split out from... (aarch64_ext_limm): ...here. (aarch64_ext_inv_limm): New function. (decode_sve_aimm): Likewise. (aarch64_ext_sve_aimm): Likewise. (aarch64_ext_sve_asimm): Likewise. (aarch64_ext_sve_limm_mov): Likewise. (aarch64_top_bit): Likewise. (aarch64_ext_sve_shlimm): Likewise. (aarch64_ext_sve_shrimm): Likewise. * aarch64-dis-2.c: Regenerate. gas/ * config/tc-aarch64.c (parse_operands): Handle the new SVE integer immediate operands.

If xmalloc fails allocating memory, usually because something tried a huge allocation, like xmalloc(-1) or some such, GDB asks the user what to do: .../src/gdb/utils.c:1079: internal-error: virtual memory exhausted. A problem internal to GDB has been detected, further debugging may prove unreliable. Quit this debugging session? (y or n) If the user says "n", that throws a QUIT exception, which is caught by one of the multiple CATCH(RETURN_MASK_ALL) blocks somewhere up the stack. The default implementations of operator new / operator new[] call malloc directly, and on memory allocation failure throw std::bad_alloc. Currently, if that happens, since nothing catches it, the exception escapes out of main, and GDB aborts from unhandled exception. This patch replaces the default operator new variants with versions that, just like xmalloc: #1 - Raise an internal-error on memory allocation failure. #2 - Throw a QUIT gdb_exception, so that the exact same CATCH blocks continue handling memory allocation problems. A minor complication of #2 is that operator new can _only_ throw std::bad_alloc, or something that extends it: void* operator new (std::size_t size) throw (std::bad_alloc); That means that if we let a gdb QUIT exception escape from within operator new, the C++ runtime aborts due to unexpected exception thrown. So to bridge the gap, this patch adds a new gdb_quit_bad_alloc exception type that inherits both std::bad_alloc and gdb_exception, and throws _that_. If we decide that we should be catching memory allocation errors in fewer places than all the places we currently catch them (everywhere we use RETURN_MASK_ALL currently), then we could change operator new to throw plain std::bad_alloc then. But I'm considering such a change as separate matter from this one -- it'd make sense to do the same to xmalloc at the same time, for instance. Meanwhile, this allows using new/new[] instead of xmalloc/XNEW/etc. without losing the "virtual memory exhausted" internal-error safeguard. Tested on x86_64 Fedora 23. gdb/ChangeLog: 2016-09-23 Pedro Alves <[email protected]> * Makefile.in (SFILES): Add common/new-op.c. (COMMON_OBS): Add common/new-op.o. (new-op.o): New rule. * common/common-exceptions.h: Include <new>. (struct gdb_quit_bad_alloc): New type. * common/new-op.c: New file. gdb/gdbserver/ChangeLog: 2016-09-23 Pedro Alves <[email protected]> * Makefile.in (SFILES): Add common/new-op.c. (OBS): Add common/new-op.o. (new-op.o): New rule.

With this patch, when an inferior, thread or frame is explicitly selected by the user, notifications will appear on all CLI and MI UIs. When a GDB console is integrated in a front-end, this allows the front-end to follow a selection made by the user ont he CLI, and it informs the user about selection changes made behind the scenes by the front-end. This patch addresses PR gdb/20487. In order to communicate frame changes to the front-end, this patch adds a new field to the =thread-selected event for the selected frame. The idea is that since inferior/thread/frame can be seen as a composition, it makes sense to send them together in the same event. The vision would be to eventually send the inferior information as well, if we find that it's needed, although the "=thread-selected" event would be ill-named for that job. Front-ends need to handle this new field if they want to follow the frame selection changes that originate from the console. The format of the frame attribute is the same as what is found in the *stopped events. Here's a detailed example for each command and the events they generate: thread ------ 1. CLI command: thread 1.3 MI event: =thread-selected,id="3",frame={...} 2. MI command: -thread-select 3 CLI event: [Switching to thread 1.3 ...] 3. MI command (CLI-in-MI): thread 1.3 MI event/reply: &"thread 1.3\n" ~"#0 child_sub_function () ... =thread-selected,id="3",frame={level="0",...} ^done frame ----- 1. CLI command: frame 1 MI event: =thread-selected,id="3",frame={level="1",...} 2. MI command: -stack-select-frame 1 CLI event: tromey#1 0x00000000004007f0 in child_function... 3. MI command (CLI-in-MI): frame 1 MI event/reply: &"frame 1\n" ~"tromey#1 0x00000000004007f9 in ..." =thread-selected,id="3",frame={level="1"...} ^done inferior -------- Inferior selection events only go from the console to MI, since there's no way to select the inferior in pure MI. 1. CLI command: inferior 2 MI event: =thread-selected,id="3" Note that if the user selects an inferior that is not started or exited, the MI doesn't receive a notification. Since there is no threads to select, the =thread-selected event does not apply... 2. MI command (CLI-in-MI): inferior 2 MI event/reply: &"inferior 2\n" ~"[Switching to inferior 2 ...]" =thread-selected,id="4",frame={level="0"...} ^done Internal implementation detail: this patch makes it possible to suppress notifications caused by a CLI command, like what is done in mi-interp.c. This means that it's now possible to use the add_com_suppress_notification function to register a command with some event suppressed. It is used to implement the select-frame command in this patch. The function command_notifies_uscc_observer was added to extract the rather complicated logical expression from the if statement. It is also now clearer what that logic does: if the command used by the user already notifies the user_selected_context_changed observer, there is not need to notify it again. It therefore protects again emitting the event twice. No regressions, tested on ubuntu 14.04 x86 with target boards unix and native-extended-gdbserver. gdb/ChangeLog: YYYY-MM-DD Antoine Tremblay <[email protected]> YYYY-MM-DD Simon Marchi <[email protected]> PR gdb/20487 * NEWS: Mention new frame field of =thread-selected event. * cli/cli-decode.c (add_cmd): Initialize c->suppress_notification. (add_com_suppress_notification): New function definition. (cmd_func): Set and restore the suppress_notification flag. * cli/cli-deicode.h (struct cmd_list_element) <suppress_notification>: New field. * cli/cli-interp.c (cli_suppress_notification): New global variable. (cli_on_user_selected_context_changed): New function. (_initialize_cli_interp): Attach to user_selected_context_changed observer. * command.h (struct cli_suppress_notification): New structure. (cli_suppress_notification): New global variable declaration. (add_com_suppress_notification): New function declaration. * defs.h (enum user_selected_what_flag): New enum. (user_selected_what): New enum flag type. * frame.h (print_stack_frame_to_uiout): New function declaration. * gdbthread.h (print_selected_thread_frame): New function declaration. * inferior.c (print_selected_inferior): New function definition. (inferior_command): Remove printing of inferior/thread/frame switch notifications, notify user_selected_context_changed observer. * inferior.h (print_selected_inferior): New function declaration. * mi/mi-cmds.c (struct mi_cmd): Add user_selected_context suppression to stack-select-frame and thread-select commands. * mi/mi-interp.c (struct mi_suppress_notification) <user_selected_context>: Initialize. (mi_user_selected_context_changed): New function definition. (_initialize_mi_interp): Attach to user_selected_context_changed. * mi/mi-main.c (mi_cmd_thread_select): Print thread selection reply. (mi_execute_command): Handle notification suppression. Notify user_selected_context_changed observer on thread change instead of printing event directly. Don't send it if command already sends the notification. (command_notifies_uscc_observer): New function. (mi_cmd_execute): Don't handle notification suppression. * mi/mi-main.h (struct mi_suppress_notification) <user_selected_context>: New field. * stack.c (print_stack_frame_to_uiout): New function definition. (select_frame_command): Notify user_selected_context_changed observer. (frame_command): Call print_selected_thread_frame if there's no frame change or notify user_selected_context_changed observer if there is. (up_command): Notify user_selected_context_changed observer. (down_command): Likewise. (_initialize_stack): Suppress user_selected_context notification for command select-frame. * thread.c (thread_command): Notify user_selected_context_changed if the thread has changed, print thread info directly if it hasn't. (do_captured_thread_select): Do not print thread switch event. (print_selected_thread_frame): New function definition. * tui/tui-interp.c (tui_on_user_selected_context_changed): New function definition. (_initialize_tui_interp): Attach to user_selected_context_changed observer. gdb/doc/ChangeLog: PR gdb/20487 * gdb.texinfo (Context management): Update mention of frame change notifications. (gdb/mi Async Records): Document frame field in =thread-select event. * observer.texi (GDB Observers): New user_selected_context_changed observer. gdb/testsuite/ChangeLog: PR gdb/20487 * gdb.mi/mi-pthreads.exp (check_mi_thread_command_set): Adapt =thread-select-event check.

Even though this was supposedly in the gdb 7.2 timeframe, the testcase in PR11094 crashes current GDB with a segfault: Program received signal SIGSEGV, Segmentation fault. 0x00000000005ee894 in event_location_to_string (location=0x0) at src/gdb/location.c:412 412 if (EL_STRING (location) == NULL) (top-gdb) bt #0 0x00000000005ee894 in event_location_to_string (location=0x0) at src/gdb/location.c:412 tromey#1 0x000000000057411a in print_breakpoint_location (b=0x18288e0, loc=0x0) at src/gdb/breakpoint.c:6201 tromey#2 0x000000000057483f in print_one_breakpoint_location (b=0x18288e0, loc=0x182cf10, loc_number=0, last_loc=0x7fffffffd258, allflag=1) at src/gdb/breakpoint.c:6473 tromey#3 0x00000000005751e1 in print_one_breakpoint (b=0x18288e0, last_loc=0x7fffffffd258, allflag=1) at src/gdb/breakpoint.c:6707 tromey#4 0x000000000057589c in breakpoint_1 (args=0x0, allflag=1, filter=0x0) at src/gdb/breakpoint.c:6947 tromey#5 0x0000000000575aa8 in maintenance_info_breakpoints (args=0x0, from_tty=0) at src/gdb/breakpoint.c:7026 [...] This is GDB trying to print the location spec of the JIT event breakpoint, but that's an internal breakpoint without one. If I add a NULL check, then we see that the JIT breakpoint is now pending (because its location has shlib_disabled set): (gdb) maint info breakpoints Num Type Disp Enb Address What [...] -8 jit events keep y <PENDING> inf 1 [...] But that's incorrect. GDB should have managed to recreate the JIT breakpoint's location for the second run. So the problem is elsewhere. The problem is that if the JIT loads at the same address on the second run, we never recreate the JIT breakpoint, because we hit this early return: static int jit_breakpoint_re_set_internal (struct gdbarch *gdbarch, struct jit_program_space_data *ps_data) { [...] if (ps_data->cached_code_address == addr) return 0; [...] delete_breakpoint (ps_data->jit_breakpoint); [...] ps_data->jit_breakpoint = create_jit_event_breakpoint (gdbarch, addr); Fix this by deleting the breakpoint and discarding the cached code address when the objfile where the previous JIT breakpoint was found is deleted/unloaded in the first place. The test that was originally added for PR11094 doesn't trip on this because: tromey#1 - It doesn't test the case of the JIT descriptor's address _not_ changing between reruns. tromey#2 - And then it doesn't do "maint info breakpoints", or really anything with the JIT at all. tromey#3 - and even then, to trigger the problem the JIT descriptor needs to be in a separate library, while the current test puts it in the main program. The patch extends the test to cover all combinations of these scenarios. gdb/ChangeLog: 2016-10-06 Pedro Alves <[email protected]> * jit.c (free_objfile_data): Delete the JIT breakpoint and clear the cached code address. gdb/testsuite/ChangeLog: 2016-10-06 Pedro Alves <[email protected]> * gdb.base/jit-simple-dl.c: New file. * gdb.base/jit-simple-jit.c: New file, factored out from ... * gdb.base/jit-simple.c: ... this. * gdb.base/jit-simple.exp (jit_run): Delete. (build_jit): New proc. (jit_test_reread): Recompile either the main program or the shared library, depending on what is being tested. Skip changing address if caller wants to. Compare before/after addresses. If testing standalone, explicitly load the binary. Test "maint info breakpoints". (top level): Add "standalone vs shared lib" and "change address" vs "same address" axes.

Nowadays, if we build GDB with -fsanitize=address, we can get the asan error below, (gdb) quit ================================================================= ==9723==ERROR: AddressSanitizer: alloc-dealloc-mismatch (malloc vs operator delete) on 0x60200003bf70 #0 0x7f88f3837527 in operator delete(void*) (/usr/lib/x86_64-linux-gnu/libasan.so.1+0x55527) tromey#1 0xac8e13 in __gnu_cxx::new_allocator<void (*)()>::deallocate(void (**)(), unsigned long) /usr/include/c++/4.9/ext/new_allocator.h:110 tromey#2 0xac8cc2 in __gnu_cxx::__alloc_traits<std::allocator<void (*)()> >::deallocate(std::allocator<void (*)()>&, void (**)(), unsigned long) /usr/include/c++/4.9/ext/alloc_traits.h:185 .... 0x60200003bf70 is located 0 bytes inside of 8-byte region [0x60200003bf70,0x60200003bf78) allocated by thread T0 here: #0 0x7f88f38367ef in __interceptor_malloc (/usr/lib/x86_64-linux-gnu/libasan.so.1+0x547ef) tromey#1 0xbd2762 in operator new(unsigned long) /home/yao/SourceCode/gnu/gdb/git/gdb/common/new-op.c:42 tromey#2 0xac8edc in __gnu_cxx::new_allocator<void (*)()>::allocate(unsigned long, void const*) /usr/include/c++/4.9/ext/new_allocator.h:104 tromey#3 0xac8d81 in __gnu_cxx::__alloc_traits<std::allocator<void (*)()> >::allocate(std::allocator<void (*)()>&, unsigned long) /usr/include/c++/4.9/ext/alloc_traits.h:182 The reason for this is that we override operator new but don't override operator delete. This patch does the override if the code is NOT compiled with asan. gdb: 2016-10-25 Yao Qi <[email protected]> PR gdb/20716 * common/new-op.c (__has_feature): New macro. Don't override operator new if asan is used.

Currently GDB never sends more than one action per vCont packet, when connected in non-stop mode. A follow up patch will change that, and it exposed a gdbserver problem with the vCont handling. For example, this in non-stop mode: => vCont;s:p1.1;c <= OK Should be equivalent to: => vCont;s:p1.1 <= OK => vCont;c <= OK But gdbserver currently doesn't handle this. In the latter case, "vCont;c" makes gdbserver clobber the previous step request. This patch fixes that. Note the server side must ignore resume actions for the thread that has a pending %Stopped notification (and any other threads with events pending), until GDB acks the notification with vStopped. Otherwise, e.g., the following case is mishandled: tromey#1 => g (or any other packet) tromey#2 <= [registers] tromey#3 <= %Stopped T05 thread:p1.2 tromey#4 => vCont s:p1.1;c tromey#5 <= OK Above, the server must not resume thread p1.2 when it processes the vCont. GDB can't know that p1.2 stopped until it acks the %Stopped notification. (Otherwise it wouldn't send a default "c" action.) (The vCont documentation already specifies this.) Finally, special care must also be given to handling fork/vfork events. A (v)fork event actually tells us that two processes stopped -- the parent and the child. Until we follow the fork, we must not resume the child. Therefore, if we have a pending fork follow, we must not send a global wildcard resume action (vCont;c). We can still send process-wide wildcards though. (The comments above will be added as code comments to gdb in a follow up patch.) gdb/gdbserver/ChangeLog: 2016-10-26 Pedro Alves <[email protected]> * linux-low.c (handle_extended_wait): Link parent/child fork threads. (linux_wait_1): Unlink them. (linux_set_resume_request): Ignore resume requests for already-resumed and unhandled fork child threads. * linux-low.h (struct lwp_info) <fork_relative>: New field. * server.c (in_queued_stop_replies_ptid, in_queued_stop_replies): New functions. (handle_v_requests) <vCont>: Don't call require_running. * server.h (in_queued_stop_replies): New declaration.

Most of the time, the trace should be in one piece. This case is handled fine by GDB. In some cases, however, there may be gaps in the trace. They result from trace decode errors or from overflows. A gap in the trace means we lost an unknown amount of trace. Gaps can be very small, such as a few instructions in the same function, or they can be rather big. We may, for example, lose a few function calls or returns. The trace may continue in a different function and we likely don't know how we got there. Even though we can't say how the program executed across a gap, higher levels may not be impacted too much by it. Let's assume we have functions a-e and a trace that looks roughly like this: a \ b b \ / c <gap> c / d d \ / e Even though we can't say for sure, it is likely that b and c are the same function instance before and after the gap. This patch is trying to connect the c and b function segments across the gap. This will add a to the back trace of b on the right hand side. The changes are reflected in GDB's internal representation of the trace and will improve: - the output of "record function-call-history /c" - the output of "backtrace" in replay mode - source stepping in replay mode will be improved indirectly via the improved back trace I don't have an automated test for this patch; decode errors will be fixed and overflows occur sporadically and are quite rare. I tested it by hacking GDB to provoke a decode error and on the expected gap in the gdb.btrace/dlopen.exp test. The issue is that we can't predict where we will be able to re-sync in case of errors. For the expected decode error in gdb.btrace/dlopen.exp, for example, we may be able to re-sync somewhere in dlclose, in test, in main, or not at all. Here's one example run of gdb.btrace/dlopen.exp with and without this patch. (gdb) info record Active record target: record-btrace Recording format: Intel Processor Trace. Buffer size: 16kB. warning: Non-contiguous trace at instruction 66608 (offset = 0xa83, pc = 0xb7fdcc31). warning: Non-contiguous trace at instruction 66652 (offset = 0xa9b, pc = 0xb7fdcc31). warning: Non-contiguous trace at instruction 66770 (offset = 0xacb, pc = 0xb7fdcc31). warning: Non-contiguous trace at instruction 66966 (offset = 0xb60, pc = 0xb7ff5ee4). warning: Non-contiguous trace at instruction 66994 (offset = 0xb74, pc = 0xb7ff5f24). warning: Non-contiguous trace at instruction 67334 (offset = 0xbac, pc = 0xb7ff5e6d). warning: Non-contiguous trace at instruction 69022 (offset = 0xc04, pc = 0xb7ff60b3). warning: Non-contiguous trace at instruction 69116 (offset = 0xc1c, pc = 0xb7ff60b3). warning: Non-contiguous trace at instruction 69504 (offset = 0xc74, pc = 0xb7ff605d). warning: Non-contiguous trace at instruction 83648 (offset = 0xecc, pc = 0xb7ff6134). warning: Decode error (-13) at instruction 83876 (offset = 0xf48, pc = 0xb7fd6380): no memory mapped at this address. warning: Non-contiguous trace at instruction 83876 (offset = 0x11b7, pc = 0xb7ff1c70). Recorded 83948 instructions in 912 functions (12 gaps) for thread 1 (process 12996). (gdb) record instruction-history 83876, +2 83876 => 0xb7fec46f <call_init.part.0+95>: call *%eax [decode error (-13): no memory mapped at this address] [disabled] 83877 0xb7ff1c70 <_dl_close_worker.part.0+1584>: nop Without the patch, the trace is disconnected and the backtrace is short: (gdb) record goto 83876 #0 0xb7fec46f in call_init.part () from /lib/ld-linux.so.2 (gdb) backtrace #0 0xb7fec46f in call_init.part () from /lib/ld-linux.so.2 #1 0xb7fec5d0 in _dl_init () from /lib/ld-linux.so.2 #2 0xb7ff0fe3 in dl_open_worker () from /lib/ld-linux.so.2 Backtrace stopped: not enough registers or memory available to unwind further (gdb) record goto 83877 #0 0xb7ff1c70 in _dl_close_worker.part.0 () from /lib/ld-linux.so.2 (gdb) backtrace #0 0xb7ff1c70 in _dl_close_worker.part.0 () from /lib/ld-linux.so.2 #1 0xb7ff287a in _dl_close () from /lib/ld-linux.so.2 #2 0xb7fc3d5d in dlclose_doit () from /lib/libdl.so.2 #3 0xb7fec354 in _dl_catch_error () from /lib/ld-linux.so.2 #4 0xb7fc43dd in _dlerror_run () from /lib/libdl.so.2 #5 0xb7fc3d98 in dlclose () from /lib/libdl.so.2 #6 0x0804860a in test () #7 0x08048628 in main () With the patch, GDB is able to connect the trace pieces and we get a full backtrace. (gdb) record goto 83876 #0 0xb7fec46f in call_init.part () from /lib/ld-linux.so.2 (gdb) backtrace #0 0xb7fec46f in call_init.part () from /lib/ld-linux.so.2 #1 0xb7fec5d0 in _dl_init () from /lib/ld-linux.so.2 #2 0xb7ff0fe3 in dl_open_worker () from /lib/ld-linux.so.2 #3 0xb7fec354 in _dl_catch_error () from /lib/ld-linux.so.2 #4 0xb7ff02e2 in _dl_open () from /lib/ld-linux.so.2 #5 0xb7fc3c65 in dlopen_doit () from /lib/libdl.so.2 #6 0xb7fec354 in _dl_catch_error () from /lib/ld-linux.so.2 #7 0xb7fc43dd in _dlerror_run () from /lib/libdl.so.2 #8 0xb7fc3d0e in dlopen@@GLIBC_2.1 () from /lib/libdl.so.2 #9 0xb7ff28ee in _dl_runtime_resolve () from /lib/ld-linux.so.2 #10 0x0804841c in ?? () #11 0x08048470 in dlopen@plt () #12 0x080485a3 in test () #13 0x08048628 in main () (gdb) record goto 83877 #0 0xb7ff1c70 in _dl_close_worker.part.0 () from /lib/ld-linux.so.2 (gdb) backtrace #0 0xb7ff1c70 in _dl_close_worker.part.0 () from /lib/ld-linux.so.2 #1 0xb7ff287a in _dl_close () from /lib/ld-linux.so.2 #2 0xb7fc3d5d in dlclose_doit () from /lib/libdl.so.2 #3 0xb7fec354 in _dl_catch_error () from /lib/ld-linux.so.2 #4 0xb7fc43dd in _dlerror_run () from /lib/libdl.so.2 #5 0xb7fc3d98 in dlclose () from /lib/libdl.so.2 #6 0x0804860a in test () #7 0x08048628 in main () It worked nicely in this case but it may, of course, also lead to weird connections; it is a heuristic, after all. It works best when the gap is small and the trace pieces are long. gdb/ * btrace.c (bfun_s): New typedef. (ftrace_update_caller): Print caller in debug dump. (ftrace_get_caller, ftrace_match_backtrace, ftrace_fixup_level) (ftrace_compute_global_level_offset, ftrace_connect_bfun) (ftrace_connect_backtrace, ftrace_bridge_gap, btrace_bridge_gaps): New. (btrace_compute_ftrace_bts): Pass vector of gaps. Collect gaps. (btrace_compute_ftrace_pt): Likewise. (btrace_compute_ftrace): Split into this, ... (btrace_compute_ftrace_1): ... this, and ... (btrace_finalize_ftrace): ... this. Call btrace_bridge_gaps.

… frame This patch ensures that the frame id for the current frame is stashed before that of the previous frame (to the current frame). First, it should be noted that the frame id for the current frame is not stashed by get_current_frame(). The current frame's frame id is lazily computed and stashed via calls to get_frame_id(). However, it's possible for get_prev_frame() to be called without first stashing the current frame. The frame stash is used not only to speed up frame lookups, but also to detect cycles. When attempting to compute the frame id for a "previous" frame (in get_prev_frame_if_no_cycle), a cycle is detected if the computed frame id is already in the stash. If it should happen that a previous frame id is stashed which should represent a cycle for the current frame, then an assertion failure will trigger should get_frame_id() be later called to determine the frame id for the current frame. As of late 2016, with the "Tweak meaning of VALUE_FRAME_ID" patch in place, this actually occurs when running the gdb.dwarf2/dw2-dup-frame.exp test. While attempting to generate a backtrace, the python frame filter code is invoked, leading to frame_info_to_frame_object() (in python/py-frame.c) being called. That function will potentially call get_prev_frame() before get_frame_id() is called. The call to get_prev_frame() can eventually end up in get_prev_frame_if_no_cycle() which, in turn, calls compute_frame_id(), after which the frame id is stashed for the previous frame. If the frame id for the current frame is stashed, the cycle detection code (which relies on the frame stash) in get_prev_frame_if_no_cycle() will be triggered for a cycle starting with the current frame. If the current frame's id is not stashed, the cycle detecting code can't operate as designed. Instead, when get_frame_id() is called on the current frame at some later point, the current frame's id will found to be already in the stash, triggering an assertion failure. Below is an in depth examination of the failure which lead to this change. I've shortened pathnames for brevity and readability. Here's the portion of the log file showing the failure/internal error: (gdb) break stop_frame Breakpoint 1 at 0x40059a: file dw2-dup-frame.c, line 22. (gdb) run Starting program: testsuite/outputs/gdb.dwarf2/dw2-dup-frame/dw2-dup-frame Breakpoint 1, stop_frame () at dw2-dup-frame.c:22 22 } (gdb) bt gdb/frame.c:544: internal-error: frame_id get_frame_id(frame_info*): Assertion `stashed' failed. A problem internal to GDB has been detected, further debugging may prove unreliable. Quit this debugging session? (y or n) FAIL: gdb.dwarf2/dw2-dup-frame.exp: backtrace from stop_frame (GDB internal error) Here's a partial backtrace from the internal error, showing the frames which I think are relevant, plus several extra to provide context: #0 internal_error ( file=0x932b98 "gdb/frame.c", line=544, fmt=0x932b20 "%s: Assertion `%s' failed.") at gdb/common/errors.c:54 #1 0x000000000072207e in get_frame_id (fi=0xe5a760) at gdb/frame.c:544 #2 0x00000000004eb50d in frame_info_to_frame_object (frame=0xe5a760) at gdb/python/py-frame.c:390 #3 0x00000000004ef5be in bootstrap_python_frame_filters (frame=0xe5a760, frame_low=0, frame_high=-1) at gdb/python/py-framefilter.c:1453 #4 0x00000000004ef7a9 in gdbpy_apply_frame_filter ( extlang=0x8857e0 <extension_language_python>, frame=0xe5a760, flags=7, args_type=CLI_SCALAR_VALUES, out=0xf6def0, frame_low=0, frame_high=-1) at gdb/python/py-framefilter.c:1548 #5 0x00000000005f2c5a in apply_ext_lang_frame_filter (frame=0xe5a760, flags=7, args_type=CLI_SCALAR_VALUES, out=0xf6def0, frame_low=0, frame_high=-1) at gdb/extension.c:572 #6 0x00000000005ea896 in backtrace_command_1 (count_exp=0x0, show_locals=0, no_filters=0, from_tty=1) at gdb/stack.c:1834 Examination of the code in frame_info_to_frame_object(), which is in python/py-frame.c, is key to understanding this problem: if (get_prev_frame (frame) == NULL && get_frame_unwind_stop_reason (frame) != UNWIND_NO_REASON && get_next_frame (frame) != NULL) { frame_obj->frame_id = get_frame_id (get_next_frame (frame)); frame_obj->frame_id_is_next = 1; } else { frame_obj->frame_id = get_frame_id (frame); frame_obj->frame_id_is_next = 0; } I will first note that the frame id for frame has not been computed yet. (This was verified by placing a breakpoint on compute_frame_id().) The call to get_prev_frame() causes the the frame id to (eventually) be computed for the previous frame. Here's a backtrace showing how we get there: #0 compute_frame_id (fi=0x10e2810) at gdb/frame.c:496 #1 0x0000000000724a67 in get_prev_frame_if_no_cycle (this_frame=0xe5a760) at gdb/frame.c:1871 #2 0x0000000000725136 in get_prev_frame_always_1 (this_frame=0xe5a760) at gdb/frame.c:2045 #3 0x000000000072516b in get_prev_frame_always (this_frame=0xe5a760) at gdb/frame.c:2061 #4 0x000000000072570f in get_prev_frame (this_frame=0xe5a760) at gdb/frame.c:2303 #5 0x00000000004eb471 in frame_info_to_frame_object (frame=0xe5a760) at gdb/python/py-frame.c:381 For this particular case, we end up in the else clause of the code above which calls get_frame_id (frame). It's at this point that the frame id for frame is computed. Again, here's a backtrace: #0 compute_frame_id (fi=0xe5a760) at gdb/frame.c:496 #1 0x000000000072203d in get_frame_id (fi=0xe5a760) at gdb/frame.c:539 #2 0x00000000004eb50d in frame_info_to_frame_object (frame=0xe5a760) at gdb/python/py-frame.c:390 The test in question, dw2-dup-frame.exp, deliberately creates a broken (cyclic) stack. So, in this instance, the frame id for the prev `frame' will be the same as that for `frame'. But that particular frame id ended up in the stash during the previous frame operation. When, just a few lines later, we compute the frame id for `frame', the id in question is already in the stash, thus triggering the assertion failure. I considered two other solutions to solving this problem: We could prevent get_prev_frame() from being called before get_frame_id() in frame_info_to_frame_object(). (See above for the snippet of code where this happens.) A call to get_frame_id (frame) could be placed ahead of that code snippet above. I have tested this approach and, while it does work, I can't be certain that get_prev_frame() isn't called ahead of stashing the current frame somewhere else in GDB, but in a less obvious way. Another approach is to stash the current frame's id by calling get_frame_id() in get_current_frame(). This approach is conceptually simpler, but when importing a python unwinder, has the unwelcome side effect of causing the unwinder to be called during import. A cleaner looking fix would be to place this code after code corresponding to the "Don't compute the frame id of the current frame yet..." comment in get_prev_frame_if_no_cycle(). Sadly, this does not work though; by the time we get to this point, the frame state for the prev frame has been modified just enough to cause an internal error to occur when attempting to compute the (current) frame id for inline frames. (The unexpected failure count increases by roughly 130 failures.) Therefore, I decided to place it as early as possible in get_prev_frame(). gdb/ChangeLog: * frame.c (get_prev_frame): Stash frame id for current frame prior to computing frame id for previous frame.

This patch fixes a few problems with GDB's time handling. #1 - It avoids problems with gnulib's C++ namespace support On MinGW, the struct timeval that should be passed to gnulib's gettimeofday replacement is incompatible with libiberty's timeval_sub/timeval_add. That's because gnulib also replaces "struct timeval" with its own definition, while libiberty expects the system's. E.g., in code like this: gettimeofday (&prompt_ended, NULL); timeval_sub (&prompt_delta, &prompt_ended, &prompt_started); timeval_add (&prompt_for_continue_wait_time, &prompt_for_continue_wait_time, &prompt_delta); That's currently handled in gdb by not using gnulib's gettimeofday at all (see common/gdb_sys_time.h), but that #undef hack won't work with if/when we enable gnulib's C++ namespace support, because that mode adds compile time warnings for uses of ::gettimeofday, which are hard errors with -Werror. #2 - But there's an elephant in the room: gettimeofday is not monotonic... We're using it to: a) check how long functions take, for performance analysis b) compute when in the future to fire events in the event-loop c) print debug timestamps But that's exactly what gettimeofday is NOT meant for. Straight from the man page: ~~~ The time returned by gettimeofday() is affected by discontinuous jumps in the system time (e.g., if the system administrator manually changes the system time). If you need a monotonically increasing clock, see clock_gettime(2). ~~~ std::chrono (part of the C++11 standard library) has a monotonic clock exactly for such purposes (std::chrono::steady_clock). This commit switches to use that instead of gettimeofday, fixing all the issues mentioned above. gdb/ChangeLog: 2016-11-23 Pedro Alves <[email protected]> * Makefile.in (SFILES): Add common/run-time-clock.c. (HFILES_NO_SRCDIR): Add common/run-time-clock.h. (COMMON_OBS): Add run-time-clock.o. * common/run-time-clock.c, common/run-time-clock.h: New files. * defs.h (struct timeval, print_transfer_performance): Delete declarations. * event-loop.c (struct gdb_timer) <when>: Now a std::chrono::steady_clock::time_point. (create_timer): use std::chrono::steady_clock instead of gettimeofday. Use new instead of malloc. (delete_timer): Use delete instead of xfree. (duration_cast_timeval): New. (update_wait_timeout): Use std::chrono::steady_clock instead of gettimeofday. * maint.c: Include <chrono> instead of "gdb_sys_time.h", <time.h> and "timeval-utils.h". (scoped_command_stats::~scoped_command_stats) (scoped_command_stats::scoped_command_stats): Use std::chrono::steady_clock instead of gettimeofday. Use user_cpu_time_clock instead of get_run_time. * maint.h: Include "run-time-clock.h" and <chrono>. (scoped_command_stats): <m_start_cpu_time>: Now a user_cpu_time_clock::time_point. <m_start_wall_time>: Now a std::chrono::steady_clock::time_point. * mi/mi-main.c: Include "run-time-clock.h" and <chrono> instead of "gdb_sys_time.h" and <sys/resource.h>. (rusage): Delete. (mi_execute_command): Use new instead of XNEW. (mi_load_progress): Use std::chrono::steady_clock instead of gettimeofday. (timestamp): Rewrite in terms of std::chrono::steady_clock, user_cpu_time_clock and system_cpu_time_clock. (timeval_diff): Delete. (print_diff): Adjust to use std::chrono::steady_clock, user_cpu_time_clock and system_cpu_time_clock. * mi/mi-parse.h: Include "run-time-clock.h" and <chrono> instead of "gdb_sys_time.h". (struct mi_timestamp): Change fields types to std::chrono::steady_clock::time_point, user_cpu_time_clock::time and system_cpu_time_clock::time_point, instead of struct timeval. * symfile.c: Include <chrono> instead of <time.h> and "gdb_sys_time.h". (struct time_range): New. (generic_load): Use std::chrono::steady_clock instead of gettimeofday. (print_transfer_performance): Replace timeval parameters with a std::chrono::steady_clock::duration parameter. Adjust. * utils.c: Include <chrono> instead of "timeval-utils.h", "gdb_sys_time.h", and <time.h>. (prompt_for_continue_wait_time): Now a std::chrono::steady_clock::duration. (defaulted_query, prompt_for_continue): Use std::chrono::steady_clock instead of gettimeofday/timeval_sub/timeval_add. (reset_prompt_for_continue_wait_time): Use std::chrono::steady_clock::duration instead of struct timeval. (get_prompt_for_continue_wait_time): Return a std::chrono::steady_clock::duration instead of struct timeval. (vfprintf_unfiltered): Use std::chrono::steady_clock instead of gettimeofday. Use std::string. Use '.' instead of ':'. * utils.h: Include <chrono>. (get_prompt_for_continue_wait_time): Return a std::chrono::steady_clock::duration instead of struct timeval. gdb/gdbserver/ChangeLog: 2016-11-23 Pedro Alves <[email protected]> * debug.c: Include <chrono> instead of "gdb_sys_time.h". (debug_vprintf): Use std::chrono::steady_clock instead of gettimeofday. Use '.' instead of ':'. * tracepoint.c: Include <chrono> instead of "gdb_sys_time.h". (get_timestamp): Use std::chrono::steady_clock instead of gettimeofday.

On ppc64le-linux, I run into: ... (gdb) bt^M #0 0x00000000100006dc in foobar (J=2)^M #1 0x000000001000070c in prog ()^M (gdb) FAIL: gdb.dwarf2/dw2-entry-points.exp: bt foo ... The test-case attemps to emulate additional entry points of a function, with function bar having entry points foo and foobar: ... (gdb) p bar $1 = {void (int, int)} 0x1000064c <bar> (gdb) p foo $2 = {void (int, int)} 0x10000698 <foo> (gdb) p foobar $3 = {void (int)} 0x100006d0 <foobar> ... However, when setting a breakpoint on the entry point foo: ... (gdb) b foo Breakpoint 1 at 0x100006dc ... it ends up in foobar instead of in foo, due to prologue skipping, and consequently the backtrace show foobar instead foo. The problem is that the test-case does not emulate an actual prologue at each entry point. Fix this by disabling the prologue skipping when setting a breakpoint, using "break *foo". Tested on ppc64le-linux and x86_64-linux. Tested-By: Guinevere Larsen <[email protected]> Approved-By: Ulrich Weigand <[email protected]> PR testsuite/31232 Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=31232

The testsuite for SCFI contains target-specific tests. When a test is executed with --scfi=experimental command line option, the CFI annotations in the test .s files are skipped altogether by the GAS for processing. The CFI directives in the input assembly files are, however, validated by running the assembler one more time without --scfi=experimental. Some testcases are used to highlight those asm constructs that the SCFI machinery in GAS currently does not support: - Only System V AMD64 ABI is supported for now. Using either --32 or --x32 with SCFI results in hard error. See scfi-unsupported-1.s. - Untraceable stack-pointer manipulation in function epilougue and prologue. See scfi-unsupported-2.s. - Using Dynamically Realigned Arguement Pointer (DRAP) register to realign the stack. For SCFI, the CFA must be only REG_SP or REG_FP based. See scfi-unsupported-drap-1.s Some testcases are used to highlight some diagnostics that the SCFI machinery in GAS currently issues, with an intent to help user correct inadvertent errors in their hand-written asm. An error is issued when GAS finds that input asm is not amenable to correct CFI synthesis. - (#1) "Warning: SCFI: Asymetrical register restore" - (#2) "Error: SCFI: usage of REG_FP as scratch not supported" - (#3) "Error: SCFI: unsupported stack manipulation pattern" In case of (#2) and (#3), SCFI generation is skipped for the respective function. Above is a subset of the warnings/errors implemented in the code. gas/testsuite/: * gas/scfi/README: New test. * gas/scfi/x86_64/ginsn-add-1.l: New test. * gas/scfi/x86_64/ginsn-add-1.s: New test. * gas/scfi/x86_64/ginsn-dw2-regnum-1.l: New test. * gas/scfi/x86_64/ginsn-dw2-regnum-1.s: New test. * gas/scfi/x86_64/ginsn-pop-1.l: New test. * gas/scfi/x86_64/ginsn-pop-1.s: New test. * gas/scfi/x86_64/ginsn-push-1.l: New test. * gas/scfi/x86_64/ginsn-push-1.s: New test. * gas/scfi/x86_64/scfi-add-1.d: New test. * gas/scfi/x86_64/scfi-add-1.l: New test. * gas/scfi/x86_64/scfi-add-1.s: New test. * gas/scfi/x86_64/scfi-add-2.d: New test. * gas/scfi/x86_64/scfi-add-2.l: New test. * gas/scfi/x86_64/scfi-add-2.s: New test. * gas/scfi/x86_64/scfi-asm-marker-1.d: New test. * gas/scfi/x86_64/scfi-asm-marker-1.l: New test. * gas/scfi/x86_64/scfi-asm-marker-1.s: New test. * gas/scfi/x86_64/scfi-asm-marker-2.d: New test. * gas/scfi/x86_64/scfi-asm-marker-2.l: New test. * gas/scfi/x86_64/scfi-asm-marker-2.s: New test. * gas/scfi/x86_64/scfi-asm-marker-3.d: New test. * gas/scfi/x86_64/scfi-asm-marker-3.l: New test. * gas/scfi/x86_64/scfi-asm-marker-3.s: New test. * gas/scfi/x86_64/scfi-bp-sp-1.d: New test. * gas/scfi/x86_64/scfi-bp-sp-1.l: New test. * gas/scfi/x86_64/scfi-bp-sp-1.s: New test. * gas/scfi/x86_64/scfi-bp-sp-2.d: New test. * gas/scfi/x86_64/scfi-bp-sp-2.l: New test. * gas/scfi/x86_64/scfi-bp-sp-2.s: New test. * gas/scfi/x86_64/scfi-callee-saved-1.d: New test. * gas/scfi/x86_64/scfi-callee-saved-1.l: New test. * gas/scfi/x86_64/scfi-callee-saved-1.s: New test. * gas/scfi/x86_64/scfi-callee-saved-2.d: New test. * gas/scfi/x86_64/scfi-callee-saved-2.l: New test. * gas/scfi/x86_64/scfi-callee-saved-2.s: New test. * gas/scfi/x86_64/scfi-callee-saved-3.d: New test. * gas/scfi/x86_64/scfi-callee-saved-3.l: New test. * gas/scfi/x86_64/scfi-callee-saved-3.s: New test. * gas/scfi/x86_64/scfi-callee-saved-4.d: New test. * gas/scfi/x86_64/scfi-callee-saved-4.l: New test. * gas/scfi/x86_64/scfi-callee-saved-4.s: New test. * gas/scfi/x86_64/scfi-cfg-1.d: New test. * gas/scfi/x86_64/scfi-cfg-1.l: New test. * gas/scfi/x86_64/scfi-cfg-1.s: New test. * gas/scfi/x86_64/scfi-cfg-2.d: New test. * gas/scfi/x86_64/scfi-cfg-2.l: New test. * gas/scfi/x86_64/scfi-cfg-2.s: New test. * gas/scfi/x86_64/scfi-cfi-label-1.d: New test. * gas/scfi/x86_64/scfi-cfi-label-1.l: New test. * gas/scfi/x86_64/scfi-cfi-label-1.s: New test. * gas/scfi/x86_64/scfi-cfi-sections-1.d: New test. * gas/scfi/x86_64/scfi-cfi-sections-1.l: New test. * gas/scfi/x86_64/scfi-cfi-sections-1.s: New test. * gas/scfi/x86_64/scfi-cofi-1.d: New test. * gas/scfi/x86_64/scfi-cofi-1.l: New test. * gas/scfi/x86_64/scfi-cofi-1.s: New test. * gas/scfi/x86_64/scfi-diag-1.l: New test. * gas/scfi/x86_64/scfi-diag-1.s: New test. * gas/scfi/x86_64/scfi-diag-2.l: New test. * gas/scfi/x86_64/scfi-diag-2.s: New test. * gas/scfi/x86_64/scfi-dyn-stack-1.d: New test. * gas/scfi/x86_64/scfi-dyn-stack-1.l: New test. * gas/scfi/x86_64/scfi-dyn-stack-1.s: New test. * gas/scfi/x86_64/scfi-enter-1.d: New test. * gas/scfi/x86_64/scfi-enter-1.l: New test. * gas/scfi/x86_64/scfi-enter-1.s: New test. * gas/scfi/x86_64/scfi-fp-diag-2.l: New test. * gas/scfi/x86_64/scfi-fp-diag-2.s: New test. * gas/scfi/x86_64/scfi-indirect-mov-1.d: New test. * gas/scfi/x86_64/scfi-indirect-mov-1.l: New test. * gas/scfi/x86_64/scfi-indirect-mov-1.s: New test. * gas/scfi/x86_64/scfi-indirect-mov-2.d: New test. * gas/scfi/x86_64/scfi-indirect-mov-2.l: New test. * gas/scfi/x86_64/scfi-indirect-mov-2.s: New test. * gas/scfi/x86_64/scfi-indirect-mov-3.d: New test. * gas/scfi/x86_64/scfi-indirect-mov-3.l: New test. * gas/scfi/x86_64/scfi-indirect-mov-3.s: New test. * gas/scfi/x86_64/scfi-indirect-mov-4.d: New test. * gas/scfi/x86_64/scfi-indirect-mov-4.l: New test. * gas/scfi/x86_64/scfi-indirect-mov-4.s: New test. * gas/scfi/x86_64/scfi-indirect-mov-5.s: New test. * gas/scfi/x86_64/scfi-lea-1.d: New test. * gas/scfi/x86_64/scfi-lea-1.l: New test. * gas/scfi/x86_64/scfi-lea-1.s: New test. * gas/scfi/x86_64/scfi-leave-1.d: New test. * gas/scfi/x86_64/scfi-leave-1.l: New test. * gas/scfi/x86_64/scfi-leave-1.s: New test. * gas/scfi/x86_64/scfi-pushq-1.d: New test. * gas/scfi/x86_64/scfi-pushq-1.l: New test. * gas/scfi/x86_64/scfi-pushq-1.s: New test. * gas/scfi/x86_64/scfi-pushsection-1.d: New test. * gas/scfi/x86_64/scfi-pushsection-1.l: New test. * gas/scfi/x86_64/scfi-pushsection-1.s: New test. * gas/scfi/x86_64/scfi-pushsection-2.d: New test. * gas/scfi/x86_64/scfi-pushsection-2.l: New test. * gas/scfi/x86_64/scfi-pushsection-2.s: New test. * gas/scfi/x86_64/scfi-selfalign-func-1.d: New test. * gas/scfi/x86_64/scfi-selfalign-func-1.l: New test. * gas/scfi/x86_64/scfi-selfalign-func-1.s: New test. * gas/scfi/x86_64/scfi-simple-1.d: New test. * gas/scfi/x86_64/scfi-simple-1.l: New test. * gas/scfi/x86_64/scfi-simple-1.s: New test. * gas/scfi/x86_64/scfi-simple-2.d: New test. * gas/scfi/x86_64/scfi-simple-2.l: New test. * gas/scfi/x86_64/scfi-simple-2.s: New test. * gas/scfi/x86_64/scfi-sub-1.d: New test. * gas/scfi/x86_64/scfi-sub-1.l: New test. * gas/scfi/x86_64/scfi-sub-1.s: New test. * gas/scfi/x86_64/scfi-sub-2.d: New test. * gas/scfi/x86_64/scfi-sub-2.l: New test. * gas/scfi/x86_64/scfi-sub-2.s: New test. * gas/scfi/x86_64/scfi-unsupported-1.l: New test. * gas/scfi/x86_64/scfi-unsupported-1.s: New test. * gas/scfi/x86_64/scfi-unsupported-2.l: New test. * gas/scfi/x86_64/scfi-unsupported-2.s: New test. * gas/scfi/x86_64/scfi-unsupported-3.l: New test. * gas/scfi/x86_64/scfi-unsupported-3.s: New test. * gas/scfi/x86_64/scfi-unsupported-4.l: New test. * gas/scfi/x86_64/scfi-unsupported-4.s: New test. * gas/scfi/x86_64/scfi-unsupported-cfg-1.l: New test. * gas/scfi/x86_64/scfi-unsupported-cfg-1.s: New test. * gas/scfi/x86_64/scfi-unsupported-cfg-2.l: New test. * gas/scfi/x86_64/scfi-unsupported-cfg-2.s: New test. * gas/scfi/x86_64/scfi-unsupported-drap-1.l: New test. * gas/scfi/x86_64/scfi-unsupported-drap-1.s: New test. * gas/scfi/x86_64/scfi-unsupported-insn-1.l: New test. * gas/scfi/x86_64/scfi-unsupported-insn-1.s: New test. * gas/scfi/x86_64/scfi-x86-64.exp: New file.

A review comment on the SCFI V4 series was to handle ginsn creation for certain lea opcodes more precisely. Specifically, we should preferably handle the following two cases of lea opcodes similarly: - #1 lea with "index register and scale factor of 1, but no base register", - #2 lea with "no index register, but base register present". Currently, a ginsn of type GINSN_TYPE_OTHER is generated for the case of #1 above. For #2, however, the lea insn is translated to either a GINSN_TYPE_ADD or GINSN_TYPE_MOV depending on whether the immediate for displacement is non-zero or not respectively. Change the handling in x86_ginsn_lea so that both of the above lea manifestations are handled similarly. While at it, remove the code paths creating GINSN_TYPE_OTHER altogether from the function. It makes sense to piggy back on the x86_ginsn_unhandled code path to create GINSN_TYPE_OTHER if the destination register is interesting. This was also suggested in one of the previous review rounds; the other functions already follow that model, so this keeps functions symmetrical looking. gas/ * gas/config/tc-i386.c (x86_ginsn_lea): Handle select lea ops with no base register similar to the case of no index register. Remove creation of GINSN_TYPE_OTHER from the function. gas/testsuite/ * gas/scfi/x86_64/ginsn-lea-1.l: New test. * gas/scfi/x86_64/ginsn-lea-1.s: Likewise. * gas/scfi/x86_64/scfi-x86-64.exp: Add new test.

Bug PR gdb/28313 describes attaching to a process when the executable has been deleted. The bug is for S390 and describes how a user sees a message 'PC not saved'. On x86-64 (GNU/Linux) I don't see a 'PC not saved' message, but instead I see this: (gdb) attach 901877 Attaching to process 901877 No executable file now. warning: Could not load vsyscall page because no executable was specified 0x00007fa9d9c121e7 in ?? () (gdb) bt #0 0x00007fa9d9c121e7 in ?? () #1 0x00007fa9d9c1211e in ?? () #2 0x0000000000000007 in ?? () #3 0x000000002dc8b18d in ?? () #4 0x0000000000000000 in ?? () (gdb) Notice that the addresses in the backtrace don't seem right, quickly heading to 0x7 and finally ending at 0x0. What's going on, in both the s390 case and the x86-64 case is that the architecture's prologue scanner is going wrong and causing the stack unwinding to fail. The prologue scanner goes wrong because GDB has no unwind information. And GDB has no unwind information because, of course, the executable has been deleted. Notice in the example session above we get this line in the output: No executable file now. which indicates that GDB failed to find an executable to debug. For GNU/Linux when GDB tries to find an executable for a given pid we end up calling linux_proc_pid_to_exec_file in gdb/nat/linux-procfs.c. Within this function we call `readlink` on /proc/PID/exe to find the path of the actual executable. If the `readlink` call fails then we already fallback on using /proc/PID/exe as the path to the executable to debug. However, when the executable has been deleted the `readlink` call doesn't fail, but the path that is returned points to a non-existent file. I propose that we add an `access` call to linux_proc_pid_to_exec_file to check that the target file exists and can be read. If the target can't be read then we should fall back to /proc/PID/exe (assuming that /proc/PID/exe can be read). Now on x86-64 the output looks like this: (gdb) attach 901877 Attaching to process 901877 Reading symbols from /proc/901877/exe... Reading symbols from /lib64/libc.so.6... (No debugging symbols found in /lib64/libc.so.6) Reading symbols from /lib64/ld-linux-x86-64.so.2... (No debugging symbols found in /lib64/ld-linux-x86-64.so.2) 0x00007fa9d9c121e7 in nanosleep () from /lib64/libc.so.6 (gdb) bt #0 0x00007fa9d9c121e7 in nanosleep () from /lib64/libc.so.6 #1 0x00007fa9d9c1211e in sleep () from /lib64/libc.so.6 #2 0x000000000040117e in spin_forever () at attach-test.c:17 #3 0x0000000000401198 in main () at attach-test.c:24 (gdb) which is much better. I've also tagged the bug PR gdb/29782 which concerns the test gdb.server/connect-with-no-symbol-file.exp. After making this change, when running gdb.server/connect-with-no-symbol-file.exp GDB would now pick up the /proc/PID/exe file as the executable in some cases. As GDB is not restarted for the multiple iterations of this test GDB (or rather BFD) would given a warning/error like: (gdb) PASS: gdb.server/connect-with-no-symbol-file.exp: sysroot=target:: action=permission: setup: disconnect set sysroot target: BFD: reopening /proc/3283001/exe: No such file or directory (gdb) FAIL: gdb.server/connect-with-no-symbol-file.exp: sysroot=target:: action=permission: setup: adjust sysroot What's happening is that an executable found for an earlier iteration of the test is still registered for the inferior when we are setting up for a second iteration of the test. When the sysroot changes, if there's an executable registered GDB tries to reopen it, but in this case the file has disappeared (the previous inferior has exited by this point). I did think about maybe, when the executable is /proc/PID/exe, we should auto-delete the file from the inferior. But in the end I thought this was a bad idea. Not only would this require a lot of special code in GDB just to support this edge case: we'd need to track if the exe file name came from /proc and should be auto-deleted, or we'd need target specific code to check if a path should be auto-deleted..... ... in addition, we'd still want to warn the user when we auto-deleted the file from the inferior, otherwise they might be surprised to find their inferior suddenly has no executable attached, so we wouldn't actually reduce the number of warnings the user sees. So in the end I figured that the best solution is to just update the test to avoid the warning. This is easily done by manually removing the executable from the inferior once each iteration of the test has completed. Now, in bug PR gdb/29782 GDB is clearly managing to pick up an executable from the NFS cache somehow. I guess what's happening is that when the original file is deleted /proc/PID/exe is actually pointing to a file in the NFS cache which is only deleted at some later point, and so when GDB starts up we do manage to associate a file with the inferior, this results in the same message being emitted from BFD as I was seeing. The fix included in this commit should also fix that bug. One final note: On x86-64 GNU/Linux, the gdb.server/connect-with-no-symbol-file.exp test will produce 2 core files. This is due to a bug in gdbserver that is nothing to do with this test. These core files are created before and after this commit. I am working on a fix for the gdbserver issue, but will post that separately. Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=28313 Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=29782 Approved-By: Tom Tromey <[email protected]>

Currently, if frame-filters are active, raw-values is used instead of raw-frame-arguments to decide if a pretty-printer should be invoked for frame arguments in a backtrace. In this example, "super struct" is the output of the pretty-printer: (gdb) disable frame-filter global BasicFrameFilter (gdb) bt #0 foo (x=42, ss=super struct = {...}) at C:/src/repos/gdb-testsuite/gdb/testsuite/gdb.python/py-frame-args.c:47 #1 0x004016aa in main () at C:/src/repos/gdb-testsuite/gdb/testsuite/gdb.python/py-frame-args.c:57 If no frame-filter is active, then the raw-values print option does not affect the backtrace output: (gdb) set print raw-values on (gdb) bt #0 foo (x=42, ss=super struct = {...}) at C:/src/repos/gdb-testsuite/gdb/testsuite/gdb.python/py-frame-args.c:47 #1 0x004016aa in main () at C:/src/repos/gdb-testsuite/gdb/testsuite/gdb.python/py-frame-args.c:57 (gdb) set print raw-values off Instead, the raw-frame-arguments option disables the pretty-printer in the backtrace: (gdb) bt -raw-frame-arguments on #0 foo (x=42, ss=...) at C:/src/repos/gdb-testsuite/gdb/testsuite/gdb.python/py-frame-args.c:47 #1 0x004016aa in main () at C:/src/repos/gdb-testsuite/gdb/testsuite/gdb.python/py-frame-args.c:57 But if a frame-filter is active, the same rules don't apply. The option raw-frame-arguments is ignored, but raw-values decides if the pretty-printer is used: (gdb) enable frame-filter global BasicFrameFilter (gdb) bt #0 foo (x=42, ss=super struct = {...}) at C:/src/repos/gdb-testsuite/gdb/testsuite/gdb.python/py-frame-args.c:47 #1 0x004016aa in main () at C:/src/repos/gdb-testsuite/gdb/testsuite/gdb.python/py-frame-args.c:57 (gdb) set print raw-values on (gdb) bt #0 foo (x=42, ss=...) at C:/src/repos/gdb-testsuite/gdb/testsuite/gdb.python/py-frame-args.c:47 #1 0x004016aa in main () at C:/src/repos/gdb-testsuite/gdb/testsuite/gdb.python/py-frame-args.c:57 (gdb) set print raw-values off (gdb) bt -raw-frame-arguments on #0 foo (x=42, ss=super struct = {...}) at C:/src/repos/gdb-testsuite/gdb/testsuite/gdb.python/py-frame-args.c:47 #1 0x004016aa in main () at C:/src/repos/gdb-testsuite/gdb/testsuite/gdb.python/py-frame-args.c:57 So this adds the PRINT_RAW_FRAME_ARGUMENTS flag to frame_filter_flag, which is then used in the frame-filter to override the raw flag in enumerate_args. Then the output is the same if a frame-filter is active, the pretty-printer for backtraces is only disabled with the raw-frame-arguments option: (gdb) enable frame-filter global BasicFrameFilter (gdb) bt #0 foo (x=42, ss=super struct = {...}) at C:/src/repos/gdb-testsuite/gdb/testsuite/gdb.python/py-frame-args.c:47 #1 0x004016aa in main () at C:/src/repos/gdb-testsuite/gdb/testsuite/gdb.python/py-frame-args.c:57 (gdb) set print raw-values on (gdb) bt #0 foo (x=42, ss=super struct = {...}) at C:/src/repos/gdb-testsuite/gdb/testsuite/gdb.python/py-frame-args.c:47 #1 0x004016aa in main () at C:/src/repos/gdb-testsuite/gdb/testsuite/gdb.python/py-frame-args.c:57 (gdb) set print raw-values off (gdb) bt -raw-frame-arguments on #0 foo (x=42, ss=...) at C:/src/repos/gdb-testsuite/gdb/testsuite/gdb.python/py-frame-args.c:47 #1 0x004016aa in main () at C:/src/repos/gdb-testsuite/gdb/testsuite/gdb.python/py-frame-args.c:57 Co-Authored-By: Andrew Burgess <[email protected]> Approved-By: Tom Tromey <[email protected]>

When running test-case gdb.dap/eof.exp, it occasionally coredumps. The thread triggering the coredump is: ... #0 0x0000ffff42bb2280 in __pthread_kill_implementation () from /lib64/libc.so.6 #1 0x0000ffff42b65800 [PAC] in raise () from /lib64/libc.so.6 #2 0x00000000007b03e8 [PAC] in handle_fatal_signal (sig=11) at gdb/event-top.c:926 #3 0x00000000007b0470 in handle_sigsegv (sig=11) at gdb/event-top.c:976 #4 <signal handler called> #5 0x0000000000606080 in cli_ui_out::do_message (this=0xffff2f7ed728, style=..., format=0xffff0c002af1 "%s", args=...) at gdb/cli-out.c:232 #6 0x0000000000ce6358 in ui_out::call_do_message (this=0xffff2f7ed728, style=..., format=0xffff0c002af1 "%s") at gdb/ui-out.c:584 #7 0x0000000000ce6610 in ui_out::vmessage (this=0xffff2f7ed728, in_style=..., format=0x16f93ea "", args=...) at gdb/ui-out.c:621 #8 0x0000000000ce3a9c in ui_file::vprintf (this=0xfffffbea1b18, ...) at gdb/ui-file.c:74 #9 0x0000000000d2b148 in gdb_vprintf (stream=0xfffffbea1b18, format=0x16f93e8 "%s", args=...) at gdb/utils.c:1898 #10 0x0000000000d2b23c in gdb_printf (stream=0xfffffbea1b18, format=0x16f93e8 "%s") at gdb/utils.c:1913 #11 0x0000000000ab5208 in gdbpy_write (self=0x33fe35d0, args=0x342ec280, kw=0x345c08b0) at gdb/python/python.c:1464 #12 0x0000ffff434acedc in cfunction_call () from /lib64/libpython3.12.so.1.0 #13 0x0000ffff4347c500 [PAC] in _PyObject_MakeTpCall () from /lib64/libpython3.12.so.1.0 #14 0x0000ffff43488b64 [PAC] in _PyEval_EvalFrameDefault () from /lib64/libpython3.12.so.1.0 #15 0x0000ffff434d8cd0 [PAC] in method_vectorcall () from /lib64/libpython3.12.so.1.0 #16 0x0000ffff434b9824 [PAC] in PyObject_CallOneArg () from /lib64/libpython3.12.so.1.0 #17 0x0000ffff43557674 [PAC] in PyFile_WriteObject () from /lib64/libpython3.12.so.1.0 #18 0x0000ffff435577a0 [PAC] in PyFile_WriteString () from /lib64/libpython3.12.so.1.0 #19 0x0000ffff43465354 [PAC] in thread_excepthook () from /lib64/libpython3.12.so.1.0 #20 0x0000ffff434ac6e0 [PAC] in cfunction_vectorcall_O () from /lib64/libpython3.12.so.1.0 #21 0x0000ffff434a32d8 [PAC] in PyObject_Vectorcall () from /lib64/libpython3.12.so.1.0 #22 0x0000ffff43488b64 [PAC] in _PyEval_EvalFrameDefault () from /lib64/libpython3.12.so.1.0 #23 0x0000ffff434d8d88 [PAC] in method_vectorcall () from /lib64/libpython3.12.so.1.0 #24 0x0000ffff435e0ef4 [PAC] in thread_run () from /lib64/libpython3.12.so.1.0 #25 0x0000ffff43591ec0 [PAC] in pythread_wrapper () from /lib64/libpython3.12.so.1.0 #26 0x0000ffff42bb0584 [PAC] in start_thread () from /lib64/libc.so.6 #27 0x0000ffff42c1fd4c [PAC] in thread_start () from /lib64/libc.so.6 ... The direct cause for the coredump seems to be that cli_ui_out::do_message is trying to write to a stream variable which does not look sound: ... (gdb) p *stream $8 = {_vptr.ui_file = 0x0, m_applied_style = {m_foreground = {m_simple = true, { m_value = 0, {m_red = 0 '\000', m_green = 0 '\000', m_blue = 0 '\000'}}}, m_background = {m_simple = 32, {m_value = 65535, {m_red = 255 '\377', m_green = 255 '\377', m_blue = 0 '\000'}}}, m_intensity = (unknown: 0x438fe710), m_reverse = 255}} ... The string that is being printed is: ... (gdb) p str $9 = "Exception in thread " ... so AFAICT this is a DAP thread running into an exception and trying to print it. If we look at the state of gdb's main thread, we have: ... #0 0x0000ffff42bac914 in __futex_abstimed_wait_cancelable64 () from /lib64/libc.so.6 #1 0x0000ffff42bafb44 [PAC] in pthread_cond_timedwait@@GLIBC_2.17 () from /lib64/libc.so.6 #2 0x0000ffff43466e9c [PAC] in take_gil () from /lib64/libpython3.12.so.1.0 #3 0x0000ffff43484fe0 [PAC] in PyEval_RestoreThread () from /lib64/libpython3.12.so.1.0 #4 0x0000000000ab8698 [PAC] in gdbpy_allow_threads::~gdbpy_allow_threads ( this=0xfffffbea1cf8, __in_chrg=<optimized out>) at gdb/python/python-internal.h:769 #5 0x0000000000ab2fec in execute_gdb_command (self=0x33fe35d0, args=0x34297b60, kw=0x34553d20) at gdb/python/python.c:681 #6 0x0000ffff434acedc in cfunction_call () from /lib64/libpython3.12.so.1.0 #7 0x0000ffff4347c500 [PAC] in _PyObject_MakeTpCall () from /lib64/libpython3.12.so.1.0 #8 0x0000ffff43488b64 [PAC] in _PyEval_EvalFrameDefault () from /lib64/libpython3.12.so.1.0 #9 0x0000ffff4353bce8 [PAC] in _PyObject_VectorcallTstate.lto_priv.3 () from /lib64/libpython3.12.so.1.0 #10 0x0000000000ab87fc [PAC] in gdbpy_event::operator() (this=0xffff14005900) at gdb/python/python.c:1061 #11 0x0000000000ab93e8 in std::__invoke_impl<void, gdbpy_event&> (__f=...) at /usr/include/c++/13/bits/invoke.h:61 #12 0x0000000000ab9204 in std::__invoke_r<void, gdbpy_event&> (__fn=...) at /usr/include/c++/13/bits/invoke.h:111 #13 0x0000000000ab8e90 in std::_Function_handler<..>::_M_invoke(...) (...) at /usr/include/c++/13/bits/std_function.h:290 #14 0x000000000062e0d0 in std::function<void ()>::operator()() const ( this=0xffff14005830) at /usr/include/c++/13/bits/std_function.h:591 #15 0x0000000000b67f14 in run_events (error=0, client_data=0x0) at gdb/run-on-main-thread.c:76 #16 0x000000000157e290 in handle_file_event (file_ptr=0x33dae3a0, ready_mask=1) at gdbsupport/event-loop.cc:573 #17 0x000000000157e760 in gdb_wait_for_event (block=1) at gdbsupport/event-loop.cc:694 #18 0x000000000157d464 in gdb_do_one_event (mstimeout=-1) at gdbsupport/event-loop.cc:264 #19 0x0000000000943a84 in start_event_loop () at gdb/main.c:401 #20 0x0000000000943bfc in captured_command_loop () at gdb/main.c:465 #21 0x000000000094567c in captured_main (data=0xfffffbea23e8) at gdb/main.c:1335 #22 0x0000000000945700 in gdb_main (args=0xfffffbea23e8) at gdb/main.c:1354 #23 0x0000000000423ab4 in main (argc=14, argv=0xfffffbea2578) at gdb/gdb.c:39 ... AFAIU, there's a race between the two threads on gdb_stderr: - the DAP thread samples the gdb_stderr value, and uses it a bit later to print to - the gdb main thread changes the gdb_stderr value forth and back, using a temporary value for string capture purposes The non-sound stream value is caused by gdb_stderr being sampled while pointing to a str_file object, and used once the str_file object is already destroyed. The error here is that the DAP thread attempts to print to gdb_stderr. Fix this by adding a thread_wrapper that: - catches all exceptions and logs them to dap.log, and - while we're at it, logs when exiting and using the thread_wrapper for each DAP thread. Tested on aarch64-linux. Approved-By: Tom Tromey <[email protected]>

When running test-case gdb.dap/eof.exp, we're likely to get a coredump due to a segfault in new_threadstate. At the point of the core dump, the gdb main thread looks like: ... (gdb) bt #0 0x0000fffee30d2280 in __pthread_kill_implementation () from /lib64/libc.so.6 #1 0x0000fffee3085800 [PAC] in raise () from /lib64/libc.so.6 #2 0x00000000007b03e8 [PAC] in handle_fatal_signal (sig=11) at gdb/event-top.c:926 #3 0x00000000007b0470 in handle_sigsegv (sig=11) at gdb/event-top.c:976 #4 <signal handler called> #5 0x0000fffee3a4db14 in new_threadstate () from /lib64/libpython3.12.so.1.0 #6 0x0000fffee3ab0548 [PAC] in PyGILState_Ensure () from /lib64/libpython3.12.so.1.0 #7 0x0000000000a6d034 [PAC] in gdbpy_gil::gdbpy_gil (this=0xffffcb279738) at gdb/python/python-internal.h:787 #8 0x0000000000ab87ac in gdbpy_event::~gdbpy_event (this=0xfffea8001ee0, __in_chrg=<optimized out>) at gdb/python/python.c:1051 #9 0x0000000000ab9460 in std::_Function_base::_Base_manager<...>::_M_destroy (__victim=...) at /usr/include/c++/13/bits/std_function.h:175 #10 0x0000000000ab92dc in std::_Function_base::_Base_manager<...>::_M_manager (__dest=..., __source=..., __op=std::__destroy_functor) at /usr/include/c++/13/bits/std_function.h:203 #11 0x0000000000ab8f14 in std::_Function_handler<...>::_M_manager(...) (...) at /usr/include/c++/13/bits/std_function.h:282 #12 0x000000000042dd9c in std::_Function_base::~_Function_base (this=0xfffea8001c10, __in_chrg=<optimized out>) at /usr/include/c++/13/bits/std_function.h:244 #13 0x000000000042e654 in std::function<void ()>::~function() (this=0xfffea8001c10, __in_chrg=<optimized out>) at /usr/include/c++/13/bits/std_function.h:334 #14 0x0000000000b68e60 in std::_Destroy<std::function<void ()> >(...) (...) at /usr/include/c++/13/bits/stl_construct.h:151 #15 0x0000000000b68cd0 in std::_Destroy_aux<false>::__destroy<...>(...) (...) at /usr/include/c++/13/bits/stl_construct.h:163 #16 0x0000000000b689d8 in std::_Destroy<...>(...) (...) at /usr/include/c++/13/bits/stl_construct.h:196 #17 0x0000000000b68414 in std::_Destroy<...>(...) (...) at /usr/include/c++/13/bits/alloc_traits.h:948 #18 std::vector<...>::~vector() (this=0x2a183c8 <runnables>) at /usr/include/c++/13/bits/stl_vector.h:732 #19 0x0000fffee3088370 in __run_exit_handlers () from /lib64/libc.so.6 #20 0x0000fffee3088450 [PAC] in exit () from /lib64/libc.so.6 #21 0x0000000000c95600 [PAC] in quit_force (exit_arg=0x0, from_tty=0) at gdb/top.c:1822 #22 0x0000000000609140 in quit_command (args=0x0, from_tty=0) at gdb/cli/cli-cmds.c:508 #23 0x0000000000c926a4 in quit_cover () at gdb/top.c:300 #24 0x00000000007b09d4 in async_disconnect (arg=0x0) at gdb/event-top.c:1230 #25 0x0000000000548acc in invoke_async_signal_handlers () at gdb/async-event.c:234 #26 0x000000000157d2d4 in gdb_do_one_event (mstimeout=-1) at gdbsupport/event-loop.cc:199 #27 0x0000000000943a84 in start_event_loop () at gdb/main.c:401 #28 0x0000000000943bfc in captured_command_loop () at gdb/main.c:465 #29 0x000000000094567c in captured_main (data=0xffffcb279d08) at gdb/main.c:1335 #30 0x0000000000945700 in gdb_main (args=0xffffcb279d08) at gdb/main.c:1354 #31 0x0000000000423ab4 in main (argc=14, argv=0xffffcb279e98) at gdb/gdb.c:39 ... The direct cause of the segfault is calling PyGILState_Ensure after calling Py_Finalize. AFAICT the problem is a race between the gdb main thread and DAP's JSON writer thread. On one side, we have the following events: - DAP's JSON reader thread reads an EOF, and lets DAP's main thread known by writing None into read_queue - DAP's main thread lets DAP's JSON writer thread known by writing None into write_queue - DAP's JSON writer thread sees the None in its queue, and calls send_gdb("quit") - a corresponding gdbpy_event is deposited in the runnables vector, to be run by the gdb main thread On the other side, we have the following events: - the gdb main thread receives a SIGHUP - the corresponding handler calls quit_force, which calls do_final_cleanups - one of the final cleanups is finalize_python, which calls Py_Finalize - quit_force calls exit, which triggers the exit handlers - one of the exit handlers is the destructor of the runnables vector - destruction of the vector triggers destruction of the remaining element - the remaining element is a gdbpy_event, and the destructor (indirectly) calls PyGILState_Ensure It's good to note that both events (EOF and SIGHUP) are caused by this line in the test-case: ... catch "close -i $gdb_spawn_id" ... where "expect close" closes the stdin and stdout file descriptors, which causes the SIGHUP to be send. So, for the system I'm running this on, the send_gdb("quit") is actually not needed. I'm not sure if we support any systems where it's actually needed. Fix this by removing the send_gdb("quit"). Tested on aarch64-linux. PR dap/31306 Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=31306

When building gdb with -O0 -fsanitize=address, and running test-case gdb.ada/uninitialized_vars.exp, I run into: ... (gdb) info locals a = 0 z = (a => 1, b => false, c => 2.0) ================================================================= ==66372==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x602000097f58 at pc 0xffff52c0da1c bp 0xffffc90a1d40 sp 0xffffc90a1d80 READ of size 4 at 0x602000097f58 thread T0 #0 0xffff52c0da18 in memmove (/lib64/libasan.so.8+0x6da18) #1 0xbcab24 in unsigned char* std::__copy_move_backward<false, true, std::random_access_iterator_tag>::__copy_move_b<unsigned char const, unsigned char>(unsigned char const*, unsigned char const*, unsigned char*) /usr/include/c++/13/bits/stl_algobase.h:748 #2 0xbc9bf4 in unsigned char* std::__copy_move_backward_a2<false, unsigned char const*, unsigned char*>(unsigned char const*, unsigned char const*, unsigned char*) /usr/include/c++/13/bits/stl_algobase.h:769 #3 0xbc898c in unsigned char* std::__copy_move_backward_a1<false, unsigned char const*, unsigned char*>(unsigned char const*, unsigned char const*, unsigned char*) /usr/include/c++/13/bits/stl_algobase.h:778 #4 0xbc715c in unsigned char* std::__copy_move_backward_a<false, unsigned char const*, unsigned char*>(unsigned char const*, unsigned char const*, unsigned char*) /usr/include/c++/13/bits/stl_algobase.h:807 #5 0xbc4e6c in unsigned char* std::copy_backward<unsigned char const*, unsigned char*>(unsigned char const*, unsigned char const*, unsigned char*) /usr/include/c++/13/bits/stl_algobase.h:867 #6 0xbc2934 in void gdb::copy<unsigned char const, unsigned char>(gdb::array_view<unsigned char const>, gdb::array_view<unsigned char>) gdb/../gdbsupport/array-view.h:223 #7 0x20e0100 in value::contents_copy_raw(value*, long, long, long) gdb/value.c:1239 #8 0x20e9830 in value::primitive_field(long, int, type*) gdb/value.c:3078 #9 0x20e98f8 in value_field(value*, int) gdb/value.c:3095 #10 0xcafd64 in print_field_values gdb/ada-valprint.c:658 #11 0xcb0fa0 in ada_val_print_struct_union gdb/ada-valprint.c:857 #12 0xcb1bb4 in ada_value_print_inner(value*, ui_file*, int, value_print_options const*) gdb/ada-valprint.c:1042 #13 0xc66e04 in ada_language::value_print_inner(value*, ui_file*, int, value_print_options const*) const (/home/vries/gdb/build/gdb/gdb+0xc66e04) #14 0x20ca1e8 in common_val_print(value*, ui_file*, int, value_print_options const*, language_defn const*) gdb/valprint.c:1092 #15 0x20caabc in common_val_print_checked(value*, ui_file*, int, value_print_options const*, language_defn const*) gdb/valprint.c:1184 #16 0x196c524 in print_variable_and_value(char const*, symbol*, frame_info_ptr, ui_file*, int) gdb/printcmd.c:2355 #17 0x1d99ca0 in print_variable_and_value_data::operator()(char const*, symbol*) gdb/stack.c:2308 #18 0x1dabca0 in gdb::function_view<void (char const*, symbol*)>::bind<print_variable_and_value_data>(print_variable_and_value_data&)::{lambda(gdb::fv_detail::erased_callable, char const*, symbol*)#1}::operator()(gdb::fv_detail::erased_callable, char const*, symbol*) const gdb/../gdbsupport/function-view.h:305 #19 0x1dabd14 in gdb::function_view<void (char const*, symbol*)>::bind<print_variable_and_value_data>(print_variable_and_value_data&)::{lambda(gdb::fv_detail::erased_callable, char const*, symbol*)#1}::_FUN(gdb::fv_detail::erased_callable, char const*, symbol*) gdb/../gdbsupport/function-view.h:299 #20 0x1dab34c in gdb::function_view<void (char const*, symbol*)>::operator()(char const*, symbol*) const gdb/../gdbsupport/function-view.h:289 #21 0x1d9963c in iterate_over_block_locals gdb/stack.c:2240 #22 0x1d99790 in iterate_over_block_local_vars(block const*, gdb::function_view<void (char const*, symbol*)>) gdb/stack.c:2259 #23 0x1d9a598 in print_frame_local_vars gdb/stack.c:2380 #24 0x1d9afac in info_locals_command(char const*, int) gdb/stack.c:2458 #25 0xfd7b30 in do_simple_func gdb/cli/cli-decode.c:95 #26 0xfe5a2c in cmd_func(cmd_list_element*, char const*, int) gdb/cli/cli-decode.c:2735 #27 0x1f03790 in execute_command(char const*, int) gdb/top.c:575 #28 0x1384080 in command_handler(char const*) gdb/event-top.c:566 #29 0x1384e2c in command_line_handler(std::unique_ptr<char, gdb::xfree_deleter<char> >&&) gdb/event-top.c:802 #30 0x1f731e4 in tui_command_line_handler gdb/tui/tui-interp.c:104 #31 0x1382a58 in gdb_rl_callback_handler gdb/event-top.c:259 #32 0x21dbb80 in rl_callback_read_char readline/readline/callback.c:290 #33 0x1382510 in gdb_rl_callback_read_char_wrapper_noexcept gdb/event-top.c:195 #34 0x138277c in gdb_rl_callback_read_char_wrapper gdb/event-top.c:234 #35 0x1fe9b40 in stdin_event_handler gdb/ui.c:155 #36 0x35ff1bc in handle_file_event gdbsupport/event-loop.cc:573 #37 0x35ff9d8 in gdb_wait_for_event gdbsupport/event-loop.cc:694 #38 0x35fd284 in gdb_do_one_event(int) gdbsupport/event-loop.cc:264 #39 0x1768080 in start_event_loop gdb/main.c:408 #40 0x17684c4 in captured_command_loop gdb/main.c:472 #41 0x176cfc8 in captured_main gdb/main.c:1342 #42 0x176d088 in gdb_main(captured_main_args*) gdb/main.c:1361 #43 0xb73edc in main gdb/gdb.c:39 #44 0xffff519b09d8 in __libc_start_call_main (/lib64/libc.so.6+0x309d8) #45 0xffff519b0aac in __libc_start_main@@GLIBC_2.34 (/lib64/libc.so.6+0x30aac) #46 0xb73c2c in _start (/home/vries/gdb/build/gdb/gdb+0xb73c2c) 0x602000097f58 is located 0 bytes after 8-byte region [0x602000097f50,0x602000097f58) allocated by thread T0 here: #0 0xffff52c65218 in calloc (/lib64/libasan.so.8+0xc5218) #1 0xcbc278 in xcalloc gdb/alloc.c:97 #2 0x35f21e8 in xzalloc(unsigned long) gdbsupport/common-utils.cc:29 #3 0x20de270 in value::allocate_contents(bool) gdb/value.c:937 #4 0x20edc08 in value::fetch_lazy() gdb/value.c:4033 #5 0x20dadc0 in value::entirely_covered_by_range_vector(std::vector<range, std::allocator<range> > const&) gdb/value.c:229 #6 0xcb2298 in value::entirely_optimized_out() gdb/value.h:560 #7 0x20ca6fc in value_check_printable gdb/valprint.c:1133 #8 0x20caa8c in common_val_print_checked(value*, ui_file*, int, value_print_options const*, language_defn const*) gdb/valprint.c:1182 #9 0x196c524 in print_variable_and_value(char const*, symbol*, frame_info_ptr, ui_file*, int) gdb/printcmd.c:2355 #10 0x1d99ca0 in print_variable_and_value_data::operator()(char const*, symbol*) gdb/stack.c:2308 #11 0x1dabca0 in gdb::function_view<void (char const*, symbol*)>::bind<print_variable_and_value_data>(print_variable_and_value_data&)::{lambda(gdb::fv_detail::erased_callable, char const*, symbol*)#1}::operator()(gdb::fv_detail::erased_callable, char const*, symbol*) const gdb/../gdbsupport/function-view.h:305 #12 0x1dabd14 in gdb::function_view<void (char const*, symbol*)>::bind<print_variable_and_value_data>(print_variable_and_value_data&)::{lambda(gdb::fv_detail::erased_callable, char const*, symbol*)#1}::_FUN(gdb::fv_detail::erased_callable, char const*, symbol*) gdb/../gdbsupport/function-view.h:299 #13 0x1dab34c in gdb::function_view<void (char const*, symbol*)>::operator()(char const*, symbol*) const gdb/../gdbsupport/function-view.h:289 #14 0x1d9963c in iterate_over_block_locals gdb/stack.c:2240 #15 0x1d99790 in iterate_over_block_local_vars(block const*, gdb::function_view<void (char const*, symbol*)>) gdb/stack.c:2259 #16 0x1d9a598 in print_frame_local_vars gdb/stack.c:2380 #17 0x1d9afac in info_locals_command(char const*, int) gdb/stack.c:2458 #18 0xfd7b30 in do_simple_func gdb/cli/cli-decode.c:95 #19 0xfe5a2c in cmd_func(cmd_list_element*, char const*, int) gdb/cli/cli-decode.c:2735 #20 0x1f03790 in execute_command(char const*, int) gdb/top.c:575 #21 0x1384080 in command_handler(char const*) gdb/event-top.c:566 #22 0x1384e2c in command_line_handler(std::unique_ptr<char, gdb::xfree_deleter<char> >&&) gdb/event-top.c:802 #23 0x1f731e4 in tui_command_line_handler gdb/tui/tui-interp.c:104 #24 0x1382a58 in gdb_rl_callback_handler gdb/event-top.c:259 #25 0x21dbb80 in rl_callback_read_char readline/readline/callback.c:290 #26 0x1382510 in gdb_rl_callback_read_char_wrapper_noexcept gdb/event-top.c:195 #27 0x138277c in gdb_rl_callback_read_char_wrapper gdb/event-top.c:234 #28 0x1fe9b40 in stdin_event_handler gdb/ui.c:155 #29 0x35ff1bc in handle_file_event gdbsupport/event-loop.cc:573 SUMMARY: AddressSanitizer: heap-buffer-overflow (/lib64/libasan.so.8+0x6da18) in memmove ... The error happens when trying to print either variable y or y2: ... type Variable_Record (A : Boolean := True) is record case A is when True => B : Integer; when False => C : Float; D : Integer; end case; end record; Y : Variable_Record := (A => True, B => 1); Y2 : Variable_Record := (A => False, C => 1.0, D => 2); ... when the variables are uninitialized. The error happens only when printing the entire variable: ... (gdb) p y.a $2 = 216 (gdb) p y.b There is no member named b. (gdb) p y.c $3 = 9.18340949e-41 (gdb) p y.d $4 = 1 (gdb) p y <AddressSanitizer: heap-buffer-overflow> ... The error happens as follows: - field a functions as discriminant, choosing either the b, or c+d variant. - when y.a happens to be set to 216, as above, gdb interprets this as the variable having the c+d variant (which is why trying to print y.b fails). - when printing y, gdb allocates a value, copies the bytes into it from the target, and then prints the value. - gdb allocates the value using the type size, which is 8. It's 8 because that's what the DW_AT_byte_size indicates. Note that for valid values of a, it gives correct results: if a is 0 (c+d variant), size is 12, if a is 1 (b variant), size is 8. - gdb tries to print field d, which is at an 8 byte offset, and that results in a out-of-bounds access for the allocated 8-byte value. Fix this by handling this case in value::contents_copy_raw, such that we have: ... (gdb) p y $1 = (a => 24, c => 9.18340949e-41, d => <error reading variable: access outside bounds of object>) ... An alternative (additional) fix could be this: in compute_variant_fields_inner gdb reads the discriminant y.a to decide which variant is active. It would be nice to detect that the value (y.a == 24) is not a valid Boolean, and give up on choosing a variant altoghether. However, the situation regarding the internal type CODE_TYPE_BOOL is currently ambiguous (see PR31282) and it's not possible to reliably decide what valid values are. The test-case source file gdb.ada/uninitialized-variable-record/parse.adb is a reduced version of gdb.ada/uninitialized_vars/parse.adb, so it copies the copyright years. Note that the test-case needs gcc-12 or newer, it's unsupported for older gcc versions. [ So, it would be nice to rewrite it into a dwarf assembly test-case. ] The test-case loops over all languages. This is inherited from an earlier attempt to fix this, which had language-specific fixes (in print_field_values, cp_print_value_fields, pascal_object_print_value_fields and f_language::value_print_inner). I've left this in, but I suppose it's not strictly necessary anymore. Tested on x86_64-linux. PR exp/31258 Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=31258

From the Python API, we can execute GDB commands via gdb.execute. If the command gives an exception, however, we need to recover the GDB prompt and enable stdin, because the exception does not reach top-level GDB or normal_stop. This was done in commit commit 1ba1ac8 Author: Andrew Burgess <[email protected]> Date: Tue Nov 19 11:17:20 2019 +0000 gdb: Enable stdin on exception in execute_gdb_command with the following code: catch (const gdb_exception &except) { /* If an exception occurred then we won't hit normal_stop (), or have an exception reach the top level of the event loop, which are the two usual places in which stdin would be re-enabled. So, before we convert the exception and continue back in Python, we should re-enable stdin here. */ async_enable_stdin (); GDB_PY_HANDLE_EXCEPTION (except); } In this patch, we explain what happens when we run a GDB command in the context of a synchronous command, e.g. via Python observer notifications. As an example, suppose we have the following objfile event listener, specified in a file named file.py: ~~~ import gdb class MyListener: def __init__(self): gdb.events.new_objfile.connect(self.handle_new_objfile_event) self.processed_objfile = False def handle_new_objfile_event(self, event): if self.processed_objfile: return print("loading " + event.new_objfile.filename) self.processed_objfile = True gdb.execute('add-inferior -no-connection') gdb.execute('inferior 2') gdb.execute('target remote | gdbserver - /tmp/a.out') gdb.execute('inferior 1') the_listener = MyListener() ~~~ Using this Python file, we see the behavior below: $ gdb -q -ex "source file.py" -ex "run" --args a.out Reading symbols from a.out... Starting program: /tmp/a.out loading /lib64/ld-linux-x86-64.so.2 [New inferior 2] Added inferior 2 [Switching to inferior 2 [<null>] (<noexec>)] stdin/stdout redirected Process /tmp/a.out created; pid = 3075406 Remote debugging using stdio Reading /tmp/a.out from remote target... ... [Switching to inferior 1 [process 3075400] (/tmp/a.out)] [Switching to thread 1.1 (process 3075400)] #0 0x00007ffff7fe3290 in ?? () from /lib64/ld-linux-x86-64.so.2 (gdb) [Thread debugging using libthread_db enabled] Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1". [Inferior 1 (process 3075400) exited normally] Note how the GDB prompt comes in-between the debugger output. We have this obscure behavior, because the executed command, "target remote", triggers an invocation of `normal_stop` that enables stdin. After that, however, the Python notification context completes and GDB continues with its normal flow of executing the 'run' command. This can be seen in the call stack below: (top-gdb) bt #0 async_enable_stdin () at src/gdb/event-top.c:523 #1 0x00005555561c3acd in normal_stop () at src/gdb/infrun.c:9432 #2 0x00005555561b328e in start_remote (from_tty=0) at src/gdb/infrun.c:3801 #3 0x0000555556441224 in remote_target::start_remote_1 (this=0x5555587882e0, from_tty=0, extended_p=0) at src/gdb/remote.c:5225 #4 0x000055555644166c in remote_target::start_remote (this=0x5555587882e0, from_tty=0, extended_p=0) at src/gdb/remote.c:5316 #5 0x00005555564430cf in remote_target::open_1 (name=0x55555878525e "| gdbserver - /tmp/a.out", from_tty=0, extended_p=0) at src/gdb/remote.c:6175 #6 0x0000555556441707 in remote_target::open (name=0x55555878525e "| gdbserver - /tmp/a.out", from_tty=0) at src/gdb/remote.c:5338 #7 0x00005555565ea63f in open_target (args=0x55555878525e "| gdbserver - /tmp/a.out", from_tty=0, command=0x555558589280) at src/gdb/target.c:824 #8 0x0000555555f0d89a in cmd_func (cmd=0x555558589280, args=0x55555878525e "| gdbserver - /tmp/a.out", from_tty=0) at src/gdb/cli/cli-decode.c:2735 #9 0x000055555661fb42 in execute_command (p=0x55555878529e "t", from_tty=0) at src/gdb/top.c:575 #10 0x0000555555f1a506 in execute_control_command_1 (cmd=0x555558756f00, from_tty=0) at src/gdb/cli/cli-script.c:529 #11 0x0000555555f1abea in execute_control_command (cmd=0x555558756f00, from_tty=0) at src/gdb/cli/cli-script.c:701 #12 0x0000555555f19fc7 in execute_control_commands (cmdlines=0x555558756f00, from_tty=0) at src/gdb/cli/cli-script.c:411 #13 0x0000555556400d91 in execute_gdb_command (self=0x7ffff43b5d00, args=0x7ffff440ab60, kw=0x0) at src/gdb/python/python.c:700 #14 0x00007ffff7a96023 in ?? () from /lib/x86_64-linux-gnu/libpython3.10.so.1.0 #15 0x00007ffff7a4dadc in _PyObject_MakeTpCall () from /lib/x86_64-linux-gnu/libpython3.10.so.1.0 #16 0x00007ffff79e9a1c in _PyEval_EvalFrameDefault () from /lib/x86_64-linux-gnu/libpython3.10.so.1.0 #17 0x00007ffff7b303af in ?? () from /lib/x86_64-linux-gnu/libpython3.10.so.1.0 #18 0x00007ffff7a50358 in ?? () from /lib/x86_64-linux-gnu/libpython3.10.so.1.0 #19 0x00007ffff7a4f3f4 in ?? () from /lib/x86_64-linux-gnu/libpython3.10.so.1.0 #20 0x00007ffff7a4f883 in PyObject_CallFunctionObjArgs () from /lib/x86_64-linux-gnu/libpython3.10.so.1.0 #21 0x00005555563a9758 in evpy_emit_event (event=0x7ffff42b5430, registry=0x7ffff42b4690) at src/gdb/python/py-event.c:104 #22 0x00005555563cb874 in emit_new_objfile_event (objfile=0x555558761700) at src/gdb/python/py-newobjfileevent.c:52 #23 0x00005555563b53bc in python_new_objfile (objfile=0x555558761700) at src/gdb/python/py-inferior.c:195 #24 0x0000555555d6dff0 in std::__invoke_impl<void, void (*&)(objfile*), objfile*> (__f=@0x5555585b5860: 0x5555563b5360 <python_new_objfile(objfile*)>) at /usr/include/c++/11/bits/invoke.h:61 #25 0x0000555555d6be18 in std::__invoke_r<void, void (*&)(objfile*), objfile*> (__fn=@0x5555585b5860: 0x5555563b5360 <python_new_objfile(objfile*)>) at /usr/include/c++/11/bits/invoke.h:111 #26 0x0000555555d69661 in std::_Function_handler<void (objfile*), void (*)(objfile*)>::_M_invoke(std::_Any_data const&, objfile*&&) (__functor=..., __args#0=@0x7fffffffd080: 0x555558761700) at /usr/include/c++/11/bits/std_function.h:290 #27 0x0000555556314caf in std::function<void (objfile*)>::operator()(objfile*) const (this=0x5555585b5860, __args#0=0x555558761700) at /usr/include/c++/11/bits/std_function.h:590 #28 0x000055555631444e in gdb::observers::observable<objfile*>::notify (this=0x55555836eea0 <gdb::observers::new_objfile>, args#0=0x555558761700) at src/gdb/../gdbsupport/observable.h:166 #29 0x0000555556599b3f in symbol_file_add_with_addrs (abfd=..., name=0x55555875d310 "/lib64/ld-linux-x86-64.so.2", add_flags=..., addrs=0x7fffffffd2f0, flags=..., parent=0x0) at src/gdb/symfile.c:1125 #30 0x0000555556599ca4 in symbol_file_add_from_bfd (abfd=..., name=0x55555875d310 "/lib64/ld-linux-x86-64.so.2", add_flags=..., addrs=0x7fffffffd2f0, flags=..., parent=0x0) at src/gdb/symfile.c:1160 #31 0x0000555556546371 in solib_read_symbols (so=..., flags=...) at src/gdb/solib.c:692 #32 0x0000555556546f0f in solib_add (pattern=0x0, from_tty=0, readsyms=1) at src/gdb/solib.c:1015 #33 0x0000555556539891 in enable_break (info=0x55555874e180, from_tty=0) at src/gdb/solib-svr4.c:2416 #34 0x000055555653b305 in svr4_solib_create_inferior_hook (from_tty=0) at src/gdb/solib-svr4.c:3058 #35 0x0000555556547cee in solib_create_inferior_hook (from_tty=0) at src/gdb/solib.c:1217 #36 0x0000555556196f6a in post_create_inferior (from_tty=0) at src/gdb/infcmd.c:275 #37 0x0000555556197670 in run_command_1 (args=0x0, from_tty=1, run_how=RUN_NORMAL) at src/gdb/infcmd.c:486 #38 0x000055555619783f in run_command (args=0x0, from_tty=1) at src/gdb/infcmd.c:512 #39 0x0000555555f0798d in do_simple_func (args=0x0, from_tty=1, c=0x555558567510) at src/gdb/cli/cli-decode.c:95 #40 0x0000555555f0d89a in cmd_func (cmd=0x555558567510, args=0x0, from_tty=1) at src/gdb/cli/cli-decode.c:2735 #41 0x000055555661fb42 in execute_command (p=0x7fffffffe2c4 "", from_tty=1) at src/gdb/top.c:575 #42 0x000055555626303b in catch_command_errors (command=0x55555661f4ab <execute_command(char const*, int)>, arg=0x7fffffffe2c1 "run", from_tty=1, do_bp_actions=true) at src/gdb/main.c:513 #43 0x000055555626328a in execute_cmdargs (cmdarg_vec=0x7fffffffdaf0, file_type=CMDARG_FILE, cmd_type=CMDARG_COMMAND, ret=0x7fffffffda3c) at src/gdb/main.c:612 #44 0x0000555556264849 in captured_main_1 (context=0x7fffffffdd40) at src/gdb/main.c:1293 #45 0x0000555556264a7f in captured_main (data=0x7fffffffdd40) at src/gdb/main.c:1314 #46 0x0000555556264b2e in gdb_main (args=0x7fffffffdd40) at src/gdb/main.c:1343 #47 0x0000555555ceccab in main (argc=9, argv=0x7fffffffde78) at src/gdb/gdb.c:39 (top-gdb) The use of the "target remote" command here is just an example. In principle, we would reproduce the problem with any command that triggers an invocation of `normal_stop`. To omit enabling the stdin in `normal_stop`, we would have to check the context we are in. Since we cannot do that, we add a new field to `struct ui` to track whether the prompt was already blocked, and set the tracker flag in the Python context before executing a GDB command. After applying this patch, the output becomes ... Reading symbols from a.out... Starting program: /tmp/a.out loading /lib64/ld-linux-x86-64.so.2 [New inferior 2] Added inferior 2 [Switching to inferior 2 [<null>] (<noexec>)] stdin/stdout redirected Process /tmp/a.out created; pid = 3032261 Remote debugging using stdio Reading /tmp/a.out from remote target... ... [Switching to inferior 1 [process 3032255] (/tmp/a.out)] [Switching to thread 1.1 (process 3032255)] #0 0x00007ffff7fe3290 in ?? () from /lib64/ld-linux-x86-64.so.2 [Thread debugging using libthread_db enabled] Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1". [Inferior 1 (process 3032255) exited normally] (gdb) Let's now consider a secondary scenario, where the command executed from the Python raises an error. As an example, suppose we have the Python file below: def handle_new_objfile_event(self, event): ... print("loading " + event.new_objfile.filename) self.processed_objfile = True gdb.execute('print a') The executed command, "print a", gives an error because "a" is not defined. Without this patch, we see the behavior below, where the prompt is again placed incorrectly: ... Reading symbols from /tmp/a.out... Starting program: /tmp/a.out loading /lib64/ld-linux-x86-64.so.2 Python Exception <class 'gdb.error'>: No symbol "a" in current context. (gdb) [Inferior 1 (process 3980401) exited normally] This time, `async_enable_stdin` is called from the 'catch' block in `execute_gdb_command`: (top-gdb) bt #0 async_enable_stdin () at src/gdb/event-top.c:523 #1 0x0000555556400f0a in execute_gdb_command (self=0x7ffff43b5d00, args=0x7ffff440ab60, kw=0x0) at src/gdb/python/python.c:713 #2 0x00007ffff7a96023 in ?? () from /lib/x86_64-linux-gnu/libpython3.10.so.1.0 #3 0x00007ffff7a4dadc in _PyObject_MakeTpCall () from /lib/x86_64-linux-gnu/libpython3.10.so.1.0 #4 0x00007ffff79e9a1c in _PyEval_EvalFrameDefault () from /lib/x86_64-linux-gnu/libpython3.10.so.1.0 #5 0x00007ffff7b303af in ?? () from /lib/x86_64-linux-gnu/libpython3.10.so.1.0 #6 0x00007ffff7a50358 in ?? () from /lib/x86_64-linux-gnu/libpython3.10.so.1.0 #7 0x00007ffff7a4f3f4 in ?? () from /lib/x86_64-linux-gnu/libpython3.10.so.1.0 #8 0x00007ffff7a4f883 in PyObject_CallFunctionObjArgs () from /lib/x86_64-linux-gnu/libpython3.10.so.1.0 #9 0x00005555563a9758 in evpy_emit_event (event=0x7ffff42b5430, registry=0x7ffff42b4690) at src/gdb/python/py-event.c:104 #10 0x00005555563cb874 in emit_new_objfile_event (objfile=0x555558761410) at src/gdb/python/py-newobjfileevent.c:52 #11 0x00005555563b53bc in python_new_objfile (objfile=0x555558761410) at src/gdb/python/py-inferior.c:195 #12 0x0000555555d6dff0 in std::__invoke_impl<void, void (*&)(objfile*), objfile*> (__f=@0x5555585b5860: 0x5555563b5360 <python_new_objfile(objfile*)>) at /usr/include/c++/11/bits/invoke.h:61 #13 0x0000555555d6be18 in std::__invoke_r<void, void (*&)(objfile*), objfile*> (__fn=@0x5555585b5860: 0x5555563b5360 <python_new_objfile(objfile*)>) at /usr/include/c++/11/bits/invoke.h:111 #14 0x0000555555d69661 in std::_Function_handler<void (objfile*), void (*)(objfile*)>::_M_invoke(std::_Any_data const&, objfile*&&) (__functor=..., __args#0=@0x7fffffffd080: 0x555558761410) at /usr/include/c++/11/bits/std_function.h:290 #15 0x0000555556314caf in std::function<void (objfile*)>::operator()(objfile*) const (this=0x5555585b5860, __args#0=0x555558761410) at /usr/include/c++/11/bits/std_function.h:590 #16 0x000055555631444e in gdb::observers::observable<objfile*>::notify (this=0x55555836eea0 <gdb::observers::new_objfile>, args#0=0x555558761410) at src/gdb/../gdbsupport/observable.h:166 #17 0x0000555556599b3f in symbol_file_add_with_addrs (abfd=..., name=0x55555875d020 "/lib64/ld-linux-x86-64.so.2", add_flags=..., addrs=0x7fffffffd2f0, flags=..., parent=0x0) at src/gdb/symfile.c:1125 #18 0x0000555556599ca4 in symbol_file_add_from_bfd (abfd=..., name=0x55555875d020 "/lib64/ld-linux-x86-64.so.2", add_flags=..., addrs=0x7fffffffd2f0, flags=..., parent=0x0) at src/gdb/symfile.c:1160 #19 0x0000555556546371 in solib_read_symbols (so=..., flags=...) at src/gdb/solib.c:692 #20 0x0000555556546f0f in solib_add (pattern=0x0, from_tty=0, readsyms=1) at src/gdb/solib.c:1015 #21 0x0000555556539891 in enable_break (info=0x55555874a670, from_tty=0) at src/gdb/solib-svr4.c:2416 #22 0x000055555653b305 in svr4_solib_create_inferior_hook (from_tty=0) at src/gdb/solib-svr4.c:3058 #23 0x0000555556547cee in solib_create_inferior_hook (from_tty=0) at src/gdb/solib.c:1217 #24 0x0000555556196f6a in post_create_inferior (from_tty=0) at src/gdb/infcmd.c:275 #25 0x0000555556197670 in run_command_1 (args=0x0, from_tty=1, run_how=RUN_NORMAL) at src/gdb/infcmd.c:486 #26 0x000055555619783f in run_command (args=0x0, from_tty=1) at src/gdb/infcmd.c:512 #27 0x0000555555f0798d in do_simple_func (args=0x0, from_tty=1, c=0x555558567510) at src/gdb/cli/cli-decode.c:95 #28 0x0000555555f0d89a in cmd_func (cmd=0x555558567510, args=0x0, from_tty=1) at src/gdb/cli/cli-decode.c:2735 #29 0x000055555661fb42 in execute_command (p=0x7fffffffe2c4 "", from_tty=1) at src/gdb/top.c:575 #30 0x000055555626303b in catch_command_errors (command=0x55555661f4ab <execute_command(char const*, int)>, arg=0x7fffffffe2c1 "run", from_tty=1, do_bp_actions=true) at src/gdb/main.c:513 #31 0x000055555626328a in execute_cmdargs (cmdarg_vec=0x7fffffffdaf0, file_type=CMDARG_FILE, cmd_type=CMDARG_COMMAND, ret=0x7fffffffda3c) at src/gdb/main.c:612 #32 0x0000555556264849 in captured_main_1 (context=0x7fffffffdd40) at src/gdb/main.c:1293 #33 0x0000555556264a7f in captured_main (data=0x7fffffffdd40) at src/gdb/main.c:1314 #34 0x0000555556264b2e in gdb_main (args=0x7fffffffdd40) at src/gdb/main.c:1343 #35 0x0000555555ceccab in main (argc=9, argv=0x7fffffffde78) at src/gdb/gdb.c:39 (top-gdb) Again, after we enable stdin, GDB continues with its normal flow of the 'run' command and receives the inferior's exit event, where it would have enabled stdin, if we had not done it prematurely. (top-gdb) bt #0 async_enable_stdin () at src/gdb/event-top.c:523 #1 0x00005555561c3acd in normal_stop () at src/gdb/infrun.c:9432 #2 0x00005555561b5bf1 in fetch_inferior_event () at src/gdb/infrun.c:4700 #3 0x000055555618d6a7 in inferior_event_handler (event_type=INF_REG_EVENT) at src/gdb/inf-loop.c:42 #4 0x000055555620ecdb in handle_target_event (error=0, client_data=0x0) at src/gdb/linux-nat.c:4316 #5 0x0000555556f33035 in handle_file_event (file_ptr=0x5555587024e0, ready_mask=1) at src/gdbsupport/event-loop.cc:573 #6 0x0000555556f3362f in gdb_wait_for_event (block=0) at src/gdbsupport/event-loop.cc:694 #7 0x0000555556f322cd in gdb_do_one_event (mstimeout=-1) at src/gdbsupport/event-loop.cc:217 #8 0x0000555556262df8 in start_event_loop () at src/gdb/main.c:407 #9 0x0000555556262f85 in captured_command_loop () at src/gdb/main.c:471 #10 0x0000555556264a84 in captured_main (data=0x7fffffffdd40) at src/gdb/main.c:1324 #11 0x0000555556264b2e in gdb_main (args=0x7fffffffdd40) at src/gdb/main.c:1343 #12 0x0000555555ceccab in main (argc=9, argv=0x7fffffffde78) at src/gdb/gdb.c:39 (top-gdb) The solution implemented by this patch addresses the problem. After applying the patch, the output becomes $ gdb -q -ex "source file.py" -ex "run" --args a.out Reading symbols from /tmp/a.out... Starting program: /tmp/a.out loading /lib64/ld-linux-x86-64.so.2 Python Exception <class 'gdb.error'>: No symbol "a" in current context. [Inferior 1 (process 3984511) exited normally] (gdb) Regression-tested on X86_64 Linux using the default board file (i.e. unix). Co-Authored-By: Oguzhan Karakaya <[email protected]> Reviewed-By: Guinevere Larsen <[email protected]> Approved-By: Tom Tromey <[email protected]>

This started with a Red Hat bug report which can be seen here: https://bugzilla.redhat.com/show_bug.cgi?id=1850710 The problem reported here was using GDB on GNU/Linux for S390, the user stepped into JIT generated code. As they enter the JIT code GDB would report 'PC not saved', and this same message would be reported after each step/stepi. Additionally, the user had 'set disassemble-next-line on', and once they entered the JIT code this output was not displayed, nor were any 'display' directives displayed. The user is not making use of the JIT plugin API to provide debug information. But that's OK, they aren't expecting any source level debug here, they are happy to use 'stepi', but the missing 'display' directives are a problem, as is the constant 'PC not saved' (error) message. What is happening here is that as GDB is failing to find any debug information for the JIT generated code, it is falling back on to the S390 prologue unwinder to try and unwind frame #0. Unfortunately, without being able to identify the function boundaries, the S390 prologue scanner can't help much, in fact, it doesn't even suggest an arbitrary previous $pc value (some targets that use a link-register will, by default, assume the link-register contains the previous $pc), instead the S390 will just say, "sorry, I have no previous $pc value". The result of this is that when GDB tries to find frame #1 we end throwing an error from frame_unwind_pc (the 'PC not saved' error). This error is not caught anywhere except at the top-level interpreter loop, and so we end up skipping all the 'display' directive handling. While thinking about this, I wondered, could I trigger the same error using the Python Unwinder API? What happens if a Python unwinder claims a frame, but then fails to provide a previous $pc value? Turns out that exactly the same thing happens, which is great, as that means we now have a way to reproduce this bug on any target. And so the test included with this patch does just this. I have a Python unwinder that claims a frame, but doesn't provide any previous register values. I then do two tests, first I stop in the claimed frame (i.e. frame #0 is the frame that can't be unwound), I perform a few steps, and check the backtrace. And second, I stop in a child of the problem frame (i.e. frame #1 is the frame that can't be unwound), and from here I check the backtrace. While all this is going on I have a 'display' directive in place, and each time GDB stops I check that the display directive triggers. Additionally, when checking the backtrace, I am checking that the backtrace finishes with the message 'Backtrace stopped: frame did not save the PC'. As for the fix I chose to add a call to frame_unwind_pc directly to get_prev_frame_always_1. Calling frame_unwind_pc will cache the unwound $pc value, so this doesn't add much additional work as immediately after the new frame_unwind_pc call, we call get_prev_frame_maybe_check_cycle, which actually generates the previous frame, which will always (I think) require a call to frame_unwind_pc anyway. The reason for adding the frame_unwind_pc call into get_prev_frame_always_1, is that if the frame_unwind_pc call fails we want to set the frames 'stop_reason', and get_prev_frame_always_1 seems to be the place where this is done, so I wanted to keep the new stop_reason setting code next to all the existing stop_reason setting code. Additionally, once we enter get_prev_frame_maybe_check_cycle we actually create the previous frame, then, if it turns out that the previous frame can't be created we need to remove the frame .. this seemed more complex than just making the check in get_prev_frame_always_1. With this fix in place the original S390 bug is fixed, and also the test added in this commit, that uses the Python API, is also fixed. Reviewed-By: Kevin Buettner <[email protected]>

This commit fixes bug PR 28942, that is, creating a conditional breakpoint in a multi-threaded inferior, where the breakpoint condition includes an inferior function call. Currently, when a user tries to create such a breakpoint, then GDB will fail with: (gdb) break infcall-from-bp-cond-single.c:61 if (return_true ()) Breakpoint 2 at 0x4011fa: file /tmp/build/gdb/testsuite/../../../src/gdb/testsuite/gdb.threads/infcall-from-bp-cond-single.c, line 61. (gdb) continue Continuing. [New Thread 0x7ffff7c5d700 (LWP 2460150)] [New Thread 0x7ffff745c700 (LWP 2460151)] [New Thread 0x7ffff6c5b700 (LWP 2460152)] [New Thread 0x7ffff645a700 (LWP 2460153)] [New Thread 0x7ffff5c59700 (LWP 2460154)] Error in testing breakpoint condition: Couldn't get registers: No such process. An error occurred while in a function called from GDB. Evaluation of the expression containing the function (return_true) will be abandoned. When the function is done executing, GDB will silently stop. Selected thread is running. (gdb) Or, in some cases, like this: (gdb) break infcall-from-bp-cond-simple.c:56 if (is_matching_tid (arg, 1)) Breakpoint 2 at 0x401194: file /tmp/build/gdb/testsuite/../../../src/gdb/testsuite/gdb.threads/infcall-from-bp-cond-simple.c, line 56. (gdb) continue Continuing. [New Thread 0x7ffff7c5d700 (LWP 2461106)] [New Thread 0x7ffff745c700 (LWP 2461107)] ../../src.release/gdb/nat/x86-linux-dregs.c:146: internal-error: x86_linux_update_debug_registers: Assertion `lwp_is_stopped (lwp)' failed. A problem internal to GDB has been detected, further debugging may prove unreliable. The precise error depends on the exact thread state; so there's race conditions depending on which threads have fully started, and which have not. But the underlying problem is always the same; when GDB tries to execute the inferior function call from within the breakpoint condition, GDB will, incorrectly, try to resume threads that are already running - GDB doesn't realise that some threads might already be running. The solution proposed in this patch requires an additional member variable thread_info::in_cond_eval. This flag is set to true (in breakpoint.c) when GDB is evaluating a breakpoint condition. In user_visible_resume_ptid (infrun.c), when the in_cond_eval flag is true, then GDB will only try to resume the current thread, that is, the thread for which the breakpoint condition is being evaluated. This solves the problem of GDB trying to resume threads that are already running. The next problem is that inferior function calls are assumed to be synchronous, that is, GDB doesn't expect to start an inferior function call in thread #1, then receive a stop from thread #2 for some other, unrelated reason. To prevent GDB responding to an event from another thread, we update fetch_inferior_event and do_target_wait in infrun.c, so that, when an inferior function call (on behalf of a breakpoint condition) is in progress, we only wait for events from the current thread (the one evaluating the condition). In do_target_wait I had to change the inferior_matches lambda function, which is used to select which inferior to wait on. Previously the logic was this: auto inferior_matches = [&wait_ptid] (inferior *inf) { return (inf->process_target () != nullptr && ptid_t (inf->pid).matches (wait_ptid)); }; This compares the pid of the inferior against the complete ptid we want to wait on. Before this commit wait_ptid was only ever minus_one_ptid (which is special, and means any process), and so every inferior would match. After this commit though wait_ptid might represent a specific thread in a specific inferior. If we compare the pid of the inferior to a specific ptid then these will not match. The fix is to compare against the pid extracted from the wait_ptid, not against the complete wait_ptid itself. In fetch_inferior_event, after receiving the event, we only want to stop all the other threads, and call inferior_event_handler with INF_EXEC_COMPLETE, if we are not evaluating a conditional breakpoint. If we are, then all the other threads should be left doing whatever they were before. The inferior_event_handler call will be performed once the breakpoint condition has finished being evaluated, and GDB decides to stop or not. The final problem that needs solving relates to GDB's commit-resume mechanism, which allows GDB to collect resume requests into a single packet in order to reduce traffic to a remote target. The problem is that the commit-resume mechanism will not send any resume requests for an inferior if there are already events pending on the GDB side. Imagine an inferior with two threads. Both threads hit a breakpoint, maybe the same conditional breakpoint. At this point there are two pending events, one for each thread. GDB selects one of the events and spots that this is a conditional breakpoint, GDB evaluates the condition. The condition includes an inferior function call, so GDB sets up for the call and resumes the one thread, the resume request is added to the commit-resume queue. When the commit-resume queue is committed GDB sees that there is a pending event from another thread, and so doesn't send any resume requests to the actual target, GDB is assuming that when we wait we will select the event from the other thread. However, as this is an inferior function call for a condition evaluation, we will not select the event from the other thread, we only care about events from the thread that is evaluating the condition - and the resume for this thread was never sent to the target. And so, GDB hangs, waiting for an event from a thread that was never fully resumed. To fix this issue I have added the concept of "forcing" the commit-resume queue. When enabling commit resume, if the force flag is true, then any resumes will be committed to the target, even if there are other threads with pending events. A note on authorship: this patch was based on some work done by Natalia Saiapova and Tankut Baris Aktemur from Intel[1]. I have made some changes to their work in this version. Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=28942 [1] https://sourceware.org/pipermail/gdb-patches/2020-October/172454.html Co-authored-by: Natalia Saiapova <[email protected]> Co-authored-by: Tankut Baris Aktemur <[email protected]> Reviewed-By: Tankut Baris Aktemur <[email protected]> Tested-By: Luis Machado <[email protected]> Tested-By: Keith Seitz <[email protected]>

…ro linux When running test-case gdb.threads/attach-stopped.exp on aarch64-linux, using the manjaro linux distro, I get: ... (gdb) thread apply all bt^M ^M Thread 2 (Thread 0xffff8d8af120 (LWP 278116) "attach-stopped"):^M #0 0x0000ffff8d964864 in clock_nanosleep () from /usr/lib/libc.so.6^M #1 0x0000ffff8d969cac in nanosleep () from /usr/lib/libc.so.6^M #2 0x0000ffff8d969b68 in sleep () from /usr/lib/libc.so.6^M #3 0x0000aaaade370828 in func (arg=0x0) at attach-stopped.c:29^M #4 0x0000ffff8d930aec in ?? () from /usr/lib/libc.so.6^M #5 0x0000ffff8d99a5dc in ?? () from /usr/lib/libc.so.6^M ^M Thread 1 (Thread 0xffff8db62020 (LWP 278111) "attach-stopped"):^M #0 0x0000ffff8d92d2d8 in ?? () from /usr/lib/libc.so.6^M #1 0x0000ffff8d9324b8 in ?? () from /usr/lib/libc.so.6^M #2 0x0000aaaade37086c in main () at attach-stopped.c:45^M (gdb) FAIL: gdb.threads/attach-stopped.exp: threaded: attach2 to stopped bt ... The problem is that the test-case expects to see start_thread: ... gdb_test "thread apply all bt" ".*sleep.*start_thread.*" \ "$threadtype: attach2 to stopped bt" ... but lack of symbols makes that impossible. Fix this by allowing " in ?? () from " as well. Tested on aarch64-linux. PR testsuite/31451 Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=31451

When -fsanitize=address,undefined is used to build, the mmap configure check failed with ================================================================= ==231796==ERROR: LeakSanitizer: detected memory leaks Direct leak of 4096 byte(s) in 1 object(s) allocated from: #0 0x7cdd3d0defdf in __interceptor_malloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:69 #1 0x5750c7f6d72b in main /home/alan/build/gas-san/all/bfd/conftest.c:239 Direct leak of 4096 byte(s) in 1 object(s) allocated from: #0 0x7cdd3d0defdf in __interceptor_malloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:69 #1 0x5750c7f6d2e1 in main /home/alan/build/gas-san/all/bfd/conftest.c:190 SUMMARY: AddressSanitizer: 8192 byte(s) leaked in 2 allocation(s). Define GCC_AC_FUNC_MMAP with export ASAN_OPTIONS=detect_leaks=0 to avoid the sanitizer configure check failure. config/ * mmap.m4 (GCC_AC_FUNC_MMAP): New. * no-executables.m4 (AC_FUNC_MMAP): Renamed to GCC_AC_FUNC_MMAP. Change AC_FUNC_MMAP to GCC_AC_FUNC_MMAP. libiberty/ * Makefile.in (aclocal_deps): Add $(srcdir)/../config/mmap.m4. * acinclude.m4: Change AC_FUNC_MMAP to GCC_AC_FUNC_MMAP. * aclocal.m4: Regenerated. * configure: Likewise. zlib/ * acinclude.m4: Include ../config/mmap.m4. * Makefile.in: Regenerated. * configure: Likewise.

When -fsanitize=address,undefined is used to build, the mmap configure check failed with ================================================================= ==231796==ERROR: LeakSanitizer: detected memory leaks Direct leak of 4096 byte(s) in 1 object(s) allocated from: #0 0x7cdd3d0defdf in __interceptor_malloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:69 #1 0x5750c7f6d72b in main /home/alan/build/gas-san/all/bfd/conftest.c:239 Direct leak of 4096 byte(s) in 1 object(s) allocated from: #0 0x7cdd3d0defdf in __interceptor_malloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:69 #1 0x5750c7f6d2e1 in main /home/alan/build/gas-san/all/bfd/conftest.c:190 SUMMARY: AddressSanitizer: 8192 byte(s) leaked in 2 allocation(s). Replace AC_FUNC_MMAP with GCC_AC_FUNC_MMAP to avoid the sanitizer configure check failure. bfd/ * configure.ac: Replace AC_FUNC_MMAP with GCC_AC_FUNC_MMAP. * Makefile.in: Regenerated. * aclocal.m4: Likewise. * configure: Likewise. binutils/ * configure.ac: Replace AC_FUNC_MMAP with GCC_AC_FUNC_MMAP. * Makefile.in: Regenerated. * aclocal.m4: Likewise. * configure: Likewise. ld/ * configure.ac: Replace AC_FUNC_MMAP with GCC_AC_FUNC_MMAP. * Makefile.in: Regenerated. * aclocal.m4: Likewise. * configure: Likewise. libctf/ * configure.ac: Replace AC_FUNC_MMAP with GCC_AC_FUNC_MMAP. * Makefile.in: Regenerated. * aclocal.m4: Likewise. * configure: Likewise. libsframe/ * configure.ac: Replace AC_FUNC_MMAP with GCC_AC_FUNC_MMAP. * Makefile.in: Regenerated. * aclocal.m4: Likewise. * configure: Likewise.

After installing glibc debuginfo, I ran into: ... FAIL: gdb.threads/threadcrash.exp: test_live_inferior: \ $thread_count == [llength $test_list] ... This happens because the clause: ... -re "^\r\n${hs}main$hs$eol" { ... which is intended to match only: ... #1 <hex> in main () at threadcrash.c:423^M ... also matches "remaining" in: ... #1 <hex> in __GI___nanosleep (requested_time=<hex>, remaining=<hex>) at \ nanosleep.c:27^M ... Fix this by checking for "in main" instead. Tested on x86_64-linux.

When running test-case gdb.server/connect-with-no-symbol-file.exp on aarch64-linux (specifically, an opensuse leap 15.5 container on a fedora asahi 39 system), I run into: ... (gdb) detach^M Detaching from program: target:connect-with-no-symbol-file, process 185104^M Ending remote debugging.^M terminate called after throwing an instance of 'gdb_exception_error'^M ... The detailed backtrace of the corefile is: ... (gdb) bt #0 0x0000ffff75504f54 in raise () from /lib64/libpthread.so.0 #1 0x00000000007a86b4 in handle_fatal_signal (sig=6) at gdb/event-top.c:926 #2 <signal handler called> #3 0x0000ffff74b977b4 in raise () from /lib64/libc.so.6 #4 0x0000ffff74b98c18 in abort () from /lib64/libc.so.6 #5 0x0000ffff74ea26f4 in __gnu_cxx::__verbose_terminate_handler() () from /usr/lib64/libstdc++.so.6 #6 0x0000ffff74ea011c in ?? () from /usr/lib64/libstdc++.so.6 #7 0x0000ffff74ea0180 in std::terminate() () from /usr/lib64/libstdc++.so.6 #8 0x0000ffff74ea0464 in __cxa_throw () from /usr/lib64/libstdc++.so.6 #9 0x0000000001548870 in throw_it (reason=RETURN_ERROR, error=TARGET_CLOSE_ERROR, fmt=0x16c7810 "Remote connection closed", ap=...) at gdbsupport/common-exceptions.cc:203 #10 0x0000000001548920 in throw_verror (error=TARGET_CLOSE_ERROR, fmt=0x16c7810 "Remote connection closed", ap=...) at gdbsupport/common-exceptions.cc:211 #11 0x0000000001548a00 in throw_error (error=TARGET_CLOSE_ERROR, fmt=0x16c7810 "Remote connection closed") at gdbsupport/common-exceptions.cc:226 #12 0x0000000000ac8f2c in remote_target::readchar (this=0x233d3d90, timeout=2) at gdb/remote.c:9856 #13 0x0000000000ac9f04 in remote_target::getpkt (this=0x233d3d90, buf=0x233d40a8, forever=false, is_notif=0x0) at gdb/remote.c:10326 #14 0x0000000000acf3d0 in remote_target::remote_hostio_send_command (this=0x233d3d90, command_bytes=13, which_packet=17, remote_errno=0xfffff1a3cf38, attachment=0xfffff1a3ce88, attachment_len=0xfffff1a3ce90) at gdb/remote.c:12567 #15 0x0000000000ad03bc in remote_target::fileio_fstat (this=0x233d3d90, fd=3, st=0xfffff1a3d020, remote_errno=0xfffff1a3cf38) at gdb/remote.c:12979 #16 0x0000000000c39878 in target_fileio_fstat (fd=0, sb=0xfffff1a3d020, target_errno=0xfffff1a3cf38) at gdb/target.c:3315 #17 0x00000000007eee5c in target_fileio_stream::stat (this=0x233d4400, abfd=0x2323fc40, sb=0xfffff1a3d020) at gdb/gdb_bfd.c:467 #18 0x00000000007f012c in <lambda(bfd*, void*, stat*)>::operator()(bfd *, void *, stat *) const (__closure=0x0, abfd=0x2323fc40, stream=0x233d4400, sb=0xfffff1a3d020) at gdb/gdb_bfd.c:955 #19 0x00000000007f015c in <lambda(bfd*, void*, stat*)>::_FUN(bfd *, void *, stat *) () at gdb/gdb_bfd.c:956 #20 0x0000000000f9b838 in opncls_bstat (abfd=0x2323fc40, sb=0xfffff1a3d020) at bfd/opncls.c:665 #21 0x0000000000f90adc in bfd_stat (abfd=0x2323fc40, statbuf=0xfffff1a3d020) at bfd/bfdio.c:431 #22 0x000000000065fe20 in reopen_exec_file () at gdb/corefile.c:52 #23 0x0000000000c3a3e8 in generic_mourn_inferior () at gdb/target.c:3642 #24 0x0000000000abf3f0 in remote_unpush_target (target=0x233d3d90) at gdb/remote.c:6067 #25 0x0000000000aca8b0 in remote_target::mourn_inferior (this=0x233d3d90) at gdb/remote.c:10587 #26 0x0000000000c387cc in target_mourn_inferior ( ptid=<error reading variable: Cannot access memory at address 0x2d310>) at gdb/target.c:2738 #27 0x0000000000abfff0 in remote_target::remote_detach_1 (this=0x233d3d90, inf=0x22fce540, from_tty=1) at gdb/remote.c:6421 #28 0x0000000000ac0094 in remote_target::detach (this=0x233d3d90, inf=0x22fce540, from_tty=1) at gdb/remote.c:6436 #29 0x0000000000c37c3c in target_detach (inf=0x22fce540, from_tty=1) at gdb/target.c:2526 #30 0x0000000000860424 in detach_command (args=0x0, from_tty=1) at gdb/infcmd.c:2817 #31 0x000000000060b594 in do_simple_func (args=0x0, from_tty=1, c=0x231431a0) at gdb/cli/cli-decode.c:94 #32 0x00000000006108c8 in cmd_func (cmd=0x231431a0, args=0x0, from_tty=1) at gdb/cli/cli-decode.c:2741 #33 0x0000000000c65a94 in execute_command (p=0x232e52f6 "", from_tty=1) at gdb/top.c:570 #34 0x00000000007a7d2c in command_handler (command=0x232e52f0 "") at gdb/event-top.c:566 #35 0x00000000007a8290 in command_line_handler (rl=...) at gdb/event-top.c:802 #36 0x0000000000c9092c in tui_command_line_handler (rl=...) at gdb/tui/tui-interp.c:103 #37 0x00000000007a750c in gdb_rl_callback_handler (rl=0x23385330 "detach") at gdb/event-top.c:258 #38 0x0000000000d910f4 in rl_callback_read_char () at readline/readline/callback.c:290 #39 0x00000000007a7338 in gdb_rl_callback_read_char_wrapper_noexcept () at gdb/event-top.c:194 #40 0x00000000007a73f0 in gdb_rl_callback_read_char_wrapper (client_data=0x22fbf640) at gdb/event-top.c:233 #41 0x0000000000cbee1c in stdin_event_handler (error=0, client_data=0x22fbf640) at gdb/ui.c:154 #42 0x000000000154ed60 in handle_file_event (file_ptr=0x232be730, ready_mask=1) at gdbsupport/event-loop.cc:572 #43 0x000000000154f21c in gdb_wait_for_event (block=1) at gdbsupport/event-loop.cc:693 #44 0x000000000154dec4 in gdb_do_one_event (mstimeout=-1) at gdbsupport/event-loop.cc:263 #45 0x0000000000910f98 in start_event_loop () at gdb/main.c:400 #46 0x0000000000911130 in captured_command_loop () at gdb/main.c:464 #47 0x0000000000912b5c in captured_main (data=0xfffff1a3db58) at gdb/main.c:1338 #48 0x0000000000912bf4 in gdb_main (args=0xfffff1a3db58) at gdb/main.c:1357 #49 0x00000000004170f4 in main (argc=10, argv=0xfffff1a3dcc8) at gdb/gdb.c:38 (gdb) ... The abort happens because a c++ exception escapes to c code, specifically opncls_bstat in bfd/opncls.c. Compiling with -fexceptions works around this. Fix this by catching the exception just before it escapes, in stat_trampoline and likewise in few similar spot. Add a new template catch_exceptions to do so in a consistent way. Tested on aarch64-linux. Approved-by: Pedro Alves <[email protected]> PR remote/31577 Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=31577

If threads are disabled, either by --disable-threading explicitely, or by missing std::thread support, you get the following ASAN error when loading symbols: ==7310==ERROR: AddressSanitizer: heap-use-after-free on address 0x614000002128 at pc 0x00000098794a bp 0x7ffe37e6af70 sp 0x7ffe37e6af68 READ of size 1 at 0x614000002128 thread T0 #0 0x987949 in index_cache_store_context::store() const ../../gdb/dwarf2/index-cache.c:163 #1 0x943467 in cooked_index_worker::write_to_cache(cooked_index const*, deferred_warnings*) const ../../gdb/dwarf2/cooked-index.c:601 #2 0x1705e39 in std::function<void ()>::operator()() const /gcc/9/include/c++/9.2.0/bits/std_function.h:690 #3 0x1705e39 in gdb::task_group::impl::~impl() ../../gdbsupport/task-group.cc:38 0x614000002128 is located 232 bytes inside of 408-byte region [0x614000002040,0x6140000021d8) freed by thread T0 here: #0 0x7fd75ccf8ea5 in operator delete(void*, unsigned long) ../../.././libsanitizer/asan/asan_new_delete.cc:177 #1 0x9462e5 in cooked_index::index_for_writing() ../../gdb/dwarf2/cooked-index.h:689 #2 0x9462e5 in operator() ../../gdb/dwarf2/cooked-index.c:657 #3 0x9462e5 in _M_invoke /gcc/9/include/c++/9.2.0/bits/std_function.h:300 It's happening because cooked_index_worker::wait always returns true in this case, which tells cooked_index::wait it can delete the m_state cooked_index_worker member, but cooked_index_worker::write_to_cache tries to access it immediately afterwards. Fixed by making cooked_index_worker::wait only return true if desired_state is CACHE_DONE, same as if threading was enabled, so m_state will not be prematurely deleted. Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=31694 Approved-By: Tom Tromey <[email protected]>

…ames In case a DIE contains a linkage name which cannot be demangled and a source language name (DW_AT_NAME) exists then we want to display this name instead of the non-demangeable linkage name. dwarf2_physname returns the linkage name in case the linkage name cannot be demangled. Before this patch we always set the returned physname as demangled name. This patch changes this by comparing the value of physname with the linkage name. Now after this change in case it is equals to the linkage name and if DW_AT_NAME exists then this is set as the demangled name otherwise like before still linkage name is used. For the reproducer, using the test source file added in this change: "gdb/testsuite/gdb.dwarf2/dw2-wrong-mangled-name.c" Here is an example of the DWARF where wrong linkage name is emitted by the compiler for the "func_demangled_test" function: subprogram { {MACRO_AT_range {func_demangled_test}} {linkage_name "_FUNC_WRONG_MANGLED__"} {name "func_demangled_test"} {external 1 flag} } subprogram { {MACRO_AT_range {main}} {external 1 flag} {name main} {main_subprogram 1 flag} } Before this change for a function having both DIEs DW_AT_name and DW_AT_LINKAGENAME but with the wrong linkage name info, the backtrace command shows following: (gdb) b func_demangled_test (gdb) r Breakpoint 1, 0x0000555555555131 in _FUNC_WRONG_MANGLED__ () (gdb) backtrace \#0 0x0000555555555131 in _FUNC_WRONG_MANGLED__ () \#1 0x000055555555514a in main () After the change now GDB shows the name emitted by DW_AT_NAME: (gdb) b func_demangled_test (gdb) r Breakpoint 1, 0x0000555555555131 in func_demangled_test () (gdb) backtrace \#0 0x0000555555555131 in func_demangled_test () \#1 0x000055555555514a in main () A new test is added to verify this change. Approved-By: Tom Tromey <[email protected]>

Similar to the x86_64 testcases, some .s files contain the corresponding CFI directives. This helps in validating the synthesized CFI by running those tests with and without the --scfi=experimental command line option. GAS issues some diagnostics, enabled by default, with --scfi=experimental. The diagnostics have been added with an intent to help user correct inadvertent errors in their hand-written asm. An error is issued when GAS finds that input asm is not amenable to accurate CFI synthesis. The existing scfi-diag-*.s tests in the gas/testsuite/gas/scfi/x86_64 directory test some SCFI diagnostics already: - (#1) "Warning: SCFI: Asymetrical register restore" - (#2) "Error: SCFI: usage of REG_FP as scratch not supported" - (#3) "Error: SCFI: unsupported stack manipulation pattern" - (#4) "Error: untraceable control flow for func 'XXX'" In the newly added aarch64 testsuite, further tests for additional diagnostics have been added: - scfi-diag-1.s in this patch highlights an aarch64-specific diagnostic: (#5) "Warning: SCFI: ignored probable save/restore op with reg offset" Additionally, some testcases are added to showcase the (currently) unsupported patterns, e.g., scfi-unsupported-1.s mov x16, 4384 sub sp, sp, x16 gas/testsuite/: * gas/scfi/README: Update comment to include aarch64. * gas/scfi/aarch64/scfi-aarch64.exp: New file. * gas/scfi/aarch64/ginsn-arith-1.l: New test. * gas/scfi/aarch64/ginsn-arith-1.s: New test. * gas/scfi/aarch64/ginsn-cofi-1.l: New test. * gas/scfi/aarch64/ginsn-cofi-1.s: New test. * gas/scfi/aarch64/ginsn-ldst-1.l: New test. * gas/scfi/aarch64/ginsn-ldst-1.s: New test. * gas/scfi/aarch64/scfi-callee-saved-fp-1.d: New test. * gas/scfi/aarch64/scfi-callee-saved-fp-1.l: New test. * gas/scfi/aarch64/scfi-callee-saved-fp-1.s: New test. * gas/scfi/aarch64/scfi-callee-saved-fp-2.d: New test. * gas/scfi/aarch64/scfi-callee-saved-fp-2.l: New test. * gas/scfi/aarch64/scfi-callee-saved-fp-2.s: New test. * gas/scfi/aarch64/scfi-cb-1.d: New test. * gas/scfi/aarch64/scfi-cb-1.l: New test. * gas/scfi/aarch64/scfi-cb-1.s: New test. * gas/scfi/aarch64/scfi-cfg-1.d: New test. * gas/scfi/aarch64/scfi-cfg-1.l: New test. * gas/scfi/aarch64/scfi-cfg-1.s: New test. * gas/scfi/aarch64/scfi-cfg-2.d: New test. * gas/scfi/aarch64/scfi-cfg-2.l: New test. * gas/scfi/aarch64/scfi-cfg-2.s: New test. * gas/scfi/aarch64/scfi-cfg-3.d: New test. * gas/scfi/aarch64/scfi-cfg-3.l: New test. * gas/scfi/aarch64/scfi-cfg-3.s: New test. * gas/scfi/aarch64/scfi-cfg-4.l: New test. * gas/scfi/aarch64/scfi-cfg-4.s: New test. * gas/scfi/aarch64/scfi-cond-br-1.d: New test. * gas/scfi/aarch64/scfi-cond-br-1.l: New test. * gas/scfi/aarch64/scfi-cond-br-1.s: New test. * gas/scfi/aarch64/scfi-diag-1.l: New test. * gas/scfi/aarch64/scfi-diag-1.s: New test. * gas/scfi/aarch64/scfi-diag-2.l: New test. * gas/scfi/aarch64/scfi-diag-2.s: New test. * gas/scfi/aarch64/scfi-diag-3.l: New test. * gas/scfi/aarch64/scfi-diag-3.s: New test. * gas/scfi/aarch64/scfi-ldrp-1.d: New test. * gas/scfi/aarch64/scfi-ldrp-1.l: New test. * gas/scfi/aarch64/scfi-ldrp-1.s: New test. * gas/scfi/aarch64/scfi-ldrp-2.d: New test. * gas/scfi/aarch64/scfi-ldrp-2.l: New test. * gas/scfi/aarch64/scfi-ldrp-2.s: New test. * gas/scfi/aarch64/scfi-ldstnap-1.d: New test. * gas/scfi/aarch64/scfi-ldstnap-1.l: New test. * gas/scfi/aarch64/scfi-ldstnap-1.s: New test. * gas/scfi/aarch64/scfi-strp-1.d: New test. * gas/scfi/aarch64/scfi-strp-1.l: New test. * gas/scfi/aarch64/scfi-strp-1.s: New test. * gas/scfi/aarch64/scfi-strp-2.d: New test. * gas/scfi/aarch64/scfi-strp-2.l: New test. * gas/scfi/aarch64/scfi-strp-2.s: New test. * gas/scfi/aarch64/scfi-unsupported-1.l: New test. * gas/scfi/aarch64/scfi-unsupported-1.s: New test. * gas/scfi/aarch64/scfi-unsupported-2.l: New test. * gas/scfi/aarch64/scfi-unsupported-2.s: New test.

On arm-linux, I run into: ... PASS: gdb.ada/mi_task_arg.exp: mi runto task_switch.break_me Expecting: ^(-stack-list-arguments 1[^M ]+)?(\^done,stack-args=\[frame={level="0",args=\[\]},frame={level="1",args=\[{name="<_task>",value="0x[0-9A-Fa-f]+"}(,{name="<_taskL>",value="[0-9]+"})?\]},frame={level="2",args=\[({name="self_id",value="(0x[0-9A-Fa-f]+|<optimized out>)"})?\]},.*[^M ]+[(]gdb[)] ^M [ ]*) -stack-list-arguments 1^M ^done,stack-args=[frame={level="0",args=[]},frame={level="1",args=[{name="<_task>",value="0x40bc48"}]},frame={level="2",args=[]}]^M (gdb) ^M FAIL: gdb.ada/mi_task_arg.exp: -stack-list-arguments 1 (unexpected output) ... The problem is that the test-case expects a level 3 frame, but there is none. This can be reproduced using cli bt: ... $ gdb -q -batch outputs/gdb.ada/mi_task_arg/task_switch \ -ex "b task_switch.break_me" \ -ex run \ -ex bt Breakpoint 1 at 0x34b4: file task_switch.adb, line 57. Thread 3 "my_caller" hit Breakpoint 1, task_switch.break_me () \ at task_switch.adb:57 57 null; #0 task_switch.break_me () at task_switch.adb:57 #1 0x00403424 in task_switch.caller (<_task>=0x40bc48) at task_switch.adb:51 #2 0xf7f95a08 in ?? () from /lib/arm-linux-gnueabihf/libgnarl-12.so Backtrace stopped: previous frame identical to this frame (corrupt stack?) ... The purpose of the test-case is printing the frame at level 1, so I don't think we should bother about the presence of the frame at level 3. Fix this by allowing the backtrace to stop at level 2. Tested on arm-linux. Approved-By: Luis Machado <[email protected]> Approved-By: Andrew Burgess <[email protected]>

Since commit b1da98a ("gdb: remove use of alloca in new_macro_definition"), if cached_argv is empty, we call macro_bcache with a nullptr data. This ends up caught by UBSan deep down in the bcache code: $ ./gdb -nx -q --data-directory=data-directory /home/smarchi/build/binutils-gdb/gdb/testsuite/outputs/gdb.base/macscp/macscp -readnow Reading symbols from /home/smarchi/build/binutils-gdb/gdb/testsuite/outputs/gdb.base/macscp/macscp... Expanding full symbols from /home/smarchi/build/binutils-gdb/gdb/testsuite/outputs/gdb.base/macscp/macscp... /home/smarchi/src/binutils-gdb/gdb/bcache.c:195:12: runtime error: null pointer passed as argument 2, which is declared to never be null The backtrace: #1 0x00007ffff619a05d in __ubsan::__ubsan_handle_nonnull_arg_abort (Data=<optimized out>) at ../../../../src/libsanitizer/ubsan/ubsan_handlers.cpp:750 #2 0x000055556337fba2 in gdb::bcache::insert (this=0x62d0000c8458, addr=0x0, length=0, added=0x0) at /home/smarchi/src/binutils-gdb/gdb/bcache.c:195 #3 0x0000555564b49222 in gdb::bcache::insert<char const*, void> (this=0x62d0000c8458, addr=0x0, length=0, added=0x0) at /home/smarchi/src/binutils-gdb/gdb/bcache.h:158 #4 0x0000555564b481fa in macro_bcache<char const*> (t=0x62100007ae70, addr=0x0, len=0) at /home/smarchi/src/binutils-gdb/gdb/macrotab.c:117 #5 0x0000555564b42b4a in new_macro_definition (t=0x62100007ae70, kind=macro_function_like, special_kind=macro_ordinary, argv=std::__debug::vector of length 0, capacity 0, replacement=0x62a00003af3a "__builtin_va_arg_pack ()") at /home/smarchi/src/binutils-gdb/gdb/macrotab.c:573 #6 0x0000555564b44674 in macro_define_internal (source=0x6210000ab9e0, line=469, name=0x7fffffffa710 "__va_arg_pack", kind=macro_function_like, special_kind=macro_ordinary, argv=std::__debug::vector of length 0, capacity 0, replacement=0x62a00003af3a "__builtin_va_arg_pack ()") at /home/smarchi/src/binutils-gdb/gdb/macrotab.c:777 #7 0x0000555564b44ae2 in macro_define_function (source=0x6210000ab9e0, line=469, name=0x7fffffffa710 "__va_arg_pack", argv=std::__debug::vector of length 0, capacity 0, replacement=0x62a00003af3a "__builtin_va_arg_pack ()") at /home/smarchi/src/binutils-gdb/gdb/macrotab.c:816 #8 0x0000555563f62fc8 in parse_macro_definition (file=0x6210000ab9e0, line=469, body=0x62a00003af2a "__va_arg_pack() __builtin_va_arg_pack ()") at /home/smarchi/src/binutils-gdb/gdb/dwarf2/macro.c:203 This can be reproduced by running gdb.base/macscp.exp. Avoid calling macro_bcache if the macro doesn't have any arguments. Change-Id: I33b5a7c3b3a93d5adba98983fcaae9c8522c383d

Some flavors of indirect call and jmp instructions were not being handled earlier, leading to a GAS error (#1): (#1) "Error: SCFI: unhandled op 0xff may cause incorrect CFI" Not handling jmp/call (direct or indirect) ops is an error (as shown above) because SCFI needs an accurate CFG to synthesize CFI correctly. Recall that the presence of indirect jmp/call, however, does make the CFG ineligible for SCFI. In other words, generating the ginsns for them now, will eventually cause SCFI to bail out later with an error (#2) anyway: (#2) "Error: untraceable control flow for func 'XXX'" The first error (#1) gives the impression of missing functionality in GAS. So, it seems cleaner to synthesize a GINSN_TYPE_JUMP / GINSN_TYPE_CALL now in the backend, and let SCFI machinery complain with the error as expected. The handling for these indirect jmp/call instructions is similar, so reuse the code by carving out a function for the same. Adjust the testcase to include the now handled jmp/call instructions as well. gas/ * config/tc-i386-ginsn.c (x86_ginsn_indirect_branch): New function. (x86_ginsn_new): Refactor out functionality to above. gas/testsuite/ * gas/scfi/x86_64/ginsn-cofi-1.l: Adjust the output. * gas/scfi/x86_64/ginsn-cofi-1.s: Add further varieties of jmp/call opcodes.

With test-case gdb.dwarf2/dw2-lines.exp on arm-linux, I run into: ... (gdb) break bar_label^M Breakpoint 2 at 0x4004f6: file dw2-lines.c, line 29.^M (gdb) continue^M Continuing.^M ^M Breakpoint 2, bar () at dw2-lines.c:29^M 29 foo (2);^M (gdb) PASS: $exp: cv=2: cdw=32: lv=2: ldw=32: continue to breakpoint: foo $1$ ... The pass is incorrect because the continue lands at line 29 with "foo (2)" instead of line line 27 with "foo (1)". A minimal version is: ... $ gdb -q -batch dw2-lines.cv-2-cdw-32-lv-2-ldw-32 -ex "b bar_label" Breakpoint 1 at 0x4f6: file dw2-lines.c, line 29. ... where: ... 000004ec <bar>: 4ec: b580 push {r7, lr} 4ee: af00 add r7, sp, #0 000004f0 <bar_label>: 4f0: 2001 movs r0, #1 4f2: f7ff fff1 bl 4d8 <foo> 000004f6 <bar_label_2>: 4f6: 2002 movs r0, #2 4f8: f7ff ffee bl 4d8 <foo> ... So, how does this happen? In short: - skip_prologue_sal calls arm_skip_prologue with pc == 0x4ec, - thumb_analyze_prologue returns 0x4f2 (overshooting by 1 insn, PR tdep/31981), and - skip_prologue_sal decides that we're mid-line, and updates to 0x4f6. However, this is a test-case about .debug_line info, so why didn't arm_skip_prologue use the line info to skip the prologue? The answer is that the line info starts at bar_label, not at bar. Fixing that allows us to work around PR tdep/31981. Likewise in gdb.dwarf2/dw2-line-number-zero.exp. Instead, add a new test-case gdb.arch/skip-prologue.exp that is dedicated to checking quality of architecture-specific prologue analysis, without being written in an architecture-specific way. If fails on arm-linux for both marm and mthumb: ... FAIL: gdb.arch/skip-prologue.exp: f2: $bp_addr == $prologue_end_addr (skipped too much) FAIL: gdb.arch/skip-prologue.exp: f4: $bp_addr == $prologue_end_addr (skipped too much) ... and passes for: - x86_64-linux for {m64,m32}x{-fno-PIE/-no-pie,-fPIE/-pie} - aarch64-linux. Tested on arm-linux.

The commit: commit c6b4867 Date: Thu Mar 30 19:21:22 2023 +0100 gdb: parse pending breakpoint thread/task immediately Introduce a use bug where the value of a temporary variable was being used after it had gone out of scope. This was picked up by the address sanitizer and would result in this error: (gdb) maintenance selftest create_breakpoint_parse_arg_string Running selftest create_breakpoint_parse_arg_string. ================================================================= ==2265825==ERROR: AddressSanitizer: stack-use-after-scope on address 0x7fbb08046511 at pc 0x000001632230 bp 0x7fff7c2fb770 sp 0x7fff7c2fb768 READ of size 1 at 0x7fbb08046511 thread T0 #0 0x163222f in create_breakpoint_parse_arg_string(char const*, std::unique_ptr<char, gdb::xfree_deleter<char> >*, int*, int*, int*, std::unique_ptr<char, gdb::xfree_deleter<char> >*, bool*) ../../src/gdb/break-cond-parse.c:496 #1 0x1633026 in test ../../src/gdb/break-cond-parse.c:582 #2 0x163391b in create_breakpoint_parse_arg_string_tests ../../src/gdb/break-cond-parse.c:649 #3 0x12cfebc in void std::__invoke_impl<void, void (*&)()>(std::__invoke_other, void (*&)()) /usr/include/c++/13/bits/invoke.h:61 #4 0x12cc8ee in std::enable_if<is_invocable_r_v<void, void (*&)()>, void>::type std::__invoke_r<void, void (*&)()>(void (*&)()) /usr/include/c++/13/bits/invoke.h:111 #5 0x12c81e5 in std::_Function_handler<void (), void (*)()>::_M_invoke(std::_Any_data const&) /usr/include/c++/13/bits/std_function.h:290 #6 0x18bb51d in std::function<void ()>::operator()() const /usr/include/c++/13/bits/std_function.h:591 #7 0x4193ef9 in selftests::run_tests(gdb::array_view<char const* const>, bool) ../../src/gdbsupport/selftest.cc:100 #8 0x21c2206 in maintenance_selftest ../../src/gdb/maint.c:1172 ... etc ... The problem was caused by three lines like this one: thread_info *thr = parse_thread_id (std::string (t.get_value ()).c_str (), &tmptok); After parsing the thread-id TMPTOK would be left pointing into the temporary string which had been created on this line. When on the next line we did this: gdb_assert (*tmptok == '\0'); The value of *TMPTOK is undefined. Fix this by creating the std::string earlier in the scope. Now the contents of the string will remain valid when we check *TMPTOK. The address sanitizer issue is now resolved.

The binary provided with bug 32165 [1] has 36139 ELF sections. GDB crashes on it with (note that my GDB is build with -D_GLIBCXX_DEBUG=1: $ ./gdb -nx -q --data-directory=data-directory ./vmlinux Reading symbols from ./vmlinux... (No debugging symbols found in ./vmlinux) (gdb) info func /usr/include/c++/14.2.1/debug/vector:508: In function: std::debug::vector<_Tp, _Allocator>::reference std::debug::vector<_Tp, _Allocator>::operator[](size_type) [with _Tp = long unsigned int; _Allocator = std::allocator<long unsigned int>; reference = long unsigned int&; size_type = long unsigned int] Error: attempt to subscript container with out-of-bounds index -29445, but container only holds 36110 elements. Objects involved in the operation: sequence "this" @ 0x514000007340 { type = std::debug::vector<unsigned long, std::allocator<unsigned long> >; } The crash occurs here: #3 0x00007ffff5e334c3 in __GI_abort () at abort.c:79 #4 0x00007ffff689afc4 in __gnu_debug::_Error_formatter::_M_error (this=<optimized out>) at /usr/src/debug/gcc/gcc/libstdc++-v3/src/c++11/debug.cc:1320 #5 0x0000555561119a16 in std::__debug::vector<unsigned long, std::allocator<unsigned long> >::operator[] (this=0x514000007340, __n=18446744073709522171) at /usr/include/c++/14.2.1/debug/vector:508 #6 0x0000555562e288e8 in minimal_symbol::value_address (this=0x5190000bb698, objfile=0x514000007240) at /home/smarchi/src/binutils-gdb/gdb/symtab.c:517 #7 0x0000555562e5a131 in global_symbol_searcher::expand_symtabs (this=0x7ffff0f5c340, objfile=0x514000007240, preg=std::optional [no contained value]) at /home/smarchi/src/binutils-gdb/gdb/symtab.c:4983 #8 0x0000555562e5d2ed in global_symbol_searcher::search (this=0x7ffff0f5c340) at /home/smarchi/src/binutils-gdb/gdb/symtab.c:5189 #9 0x0000555562e5ffa4 in symtab_symbol_info (quiet=false, exclude_minsyms=false, regexp=0x0, kind=FUNCTION_DOMAIN, t_regexp=0x0, from_tty=1) at /home/smarchi/src/binutils-gdb/gdb/symtab.c:5361 #10 0x0000555562e6131b in info_functions_command (args=0x0, from_tty=1) at /home/smarchi/src/binutils-gdb/gdb/symtab.c:5525 That is, at this line of `minimal_symbol::value_address`, where `objfile->section_offsets` is an `std::vector`: return (CORE_ADDR (this->unrelocated_address ()) + objfile->section_offsets[this->section_index ()]); A section index of -29445 is suspicious. The minimal_symbol at play here is: (top-gdb) p m_name $1 = 0x521001de10af "_sinittext" So I restarted debugging, breaking on: (top-gdb) b general_symbol_info::set_section_index if $_streq("_sinittext", m_name) And I see that weird -29445 value: (top-gdb) frame #0 general_symbol_info::set_section_index (this=0x525000082390, idx=-29445) at /home/smarchi/src/binutils-gdb/gdb/symtab.h:611 611 { m_section = idx; } But going up one frame, the section index is 36091: (top-gdb) frame #1 0x0000555562426526 in minimal_symbol_reader::record_full (this=0x7ffff0ead560, name="_sinittext", copy_name=false, address=-2111475712, ms_type=mst_text, section=36091) at /home/smarchi/src/binutils-gdb/gdb/minsyms.c:1228 1228 msymbol->set_section_index (section); It seems like the problem is just that the type used for the section index (short) is not big enough. Change from short to int. If somebody insists, we could even go long long / int64_t, but I doubt it's necessary. With that fixed, I get: (gdb) info func All defined functions: Non-debugging symbols: 0xffffffff81000000 _stext 0xffffffff82257000 _sinittext 0xffffffff822b4ebb _einittext [1] https://sourceware.org/bugzilla/show_bug.cgi?id=32165 Change-Id: Icb1c3de9474ff5adef7e0bbbf5e0b67b279dee04 Reviewed-By: Tom de Vries <[email protected]> Reviewed-by: Keith Seitz <[email protected]>

When building gdb with gcc 12 and -fsanitize=threads while renabling background dwarf reading by setting dwarf_synchronous to false, I run into: ... (gdb) file amd64-watchpoint-downgrade Reading symbols from amd64-watchpoint-downgrade... (gdb) watch global_var ================== WARNING: ThreadSanitizer: data race (pid=20124) Read of size 8 at 0x7b80000500d8 by main thread: #0 cooked_index_entry::full_name(obstack*, bool) const cooked-index.c:220 #1 cooked_index::get_main_name(obstack*, language*) const cooked-index.c:735 #2 cooked_index_worker::wait(cooked_state, bool) cooked-index.c:559 #3 cooked_index::wait(cooked_state, bool) cooked-index.c:631 #4 cooked_index_functions::wait(objfile*, bool) cooked-index.h:729 #5 cooked_index_functions::compute_main_name(objfile*) cooked-index.h:806 #6 objfile::compute_main_name() symfile-debug.c:461 #7 find_main_name symtab.c:6503 #8 main_language() symtab.c:6608 #9 set_initial_language_callback symfile.c:1634 #10 get_current_language() language.c:96 ... Previous write of size 8 at 0x7b80000500d8 by thread T1: #0 cooked_index_shard::finalize(parent_map_map const*) \ dwarf2/cooked-index.c:409 #1 operator() cooked-index.c:663 ... ... SUMMARY: ThreadSanitizer: data race cooked-index.c:220 in \ cooked_index_entry::full_name(obstack*, bool) const ================== Hardware watchpoint 1: global_var (gdb) PASS: gdb.arch/amd64-watchpoint-downgrade.exp: watch global_var ... This was also reported in PR31715. This is due do gcc PR110799 [1], generating wrong code with -fhoist-adjacent-loads, and causing a false positive for -fsanitize=threads. Work around the gcc PR by forcing -fno-hoist-adjacent-loads for gcc <= 13 and -fsanitize=threads. Tested in that same configuration on x86_64-linux. Remaining ThreadSanitizer problems are the ones reported in PR31626 (gdb.rust/dwindex.exp) and PR32247 (gdb.trace/basic-libipa.exp). PR gdb/31715 Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=31715 Tested-By: Bernd Edlinger <[email protected]> Approved-By: Tom Tromey <[email protected]> [1] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110799

When calling a function with double arguments, I get this asan error: ==7920==ERROR: AddressSanitizer: stack-buffer-overflow on address 0x0053131ece38 at pc 0x7ff79697a68f bp 0x0053131ec790 sp 0x0053131ebf40 READ of size 16 at 0x0053131ece38 thread T0 #0 0x7ff79697a68e in MemcmpInterceptorCommon(void*, int (*)(void const*, void const*, unsigned long long), void const*, void const*, unsigned long long) C:/gcc/src/gcc-14.2.0/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc:814 #1 0x7ff79697aebd in memcmp C:/gcc/src/gcc-14.2.0/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc:845 #2 0x7ff79697aebd in memcmp C:/gcc/src/gcc-14.2.0/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc:840 #3 0x7ff7927e237f in regcache::raw_write(int, gdb::array_view<unsigned char const>) C:/gdb/src/gdb.git/gdb/regcache.c:874 #4 0x7ff7927e3c85 in regcache::cooked_write(int, gdb::array_view<unsigned char const>) C:/gdb/src/gdb.git/gdb/regcache.c:914 #5 0x7ff7927e5d89 in regcache::cooked_write(int, unsigned char const*) C:/gdb/src/gdb.git/gdb/regcache.c:933 #6 0x7ff7911d5965 in amd64_windows_store_arg_in_reg C:/gdb/src/gdb.git/gdb/amd64-windows-tdep.c:216 Address 0x0053131ece38 is located in stack of thread T0 at offset 40 in frame #0 0x7ff7911d565f in amd64_windows_store_arg_in_reg C:/gdb/src/gdb.git/gdb/amd64-windows-tdep.c:208 This frame has 4 object(s): [32, 40) 'buf' (line 211) <== Memory access at offset 40 overflows this variable It's because the first 4 double arguments are passed via XMM registers, and they need a buffer of 16 bytes, even if we only use 8 bytes of them. Approved-By: Tom Tromey <[email protected]>

On Windows gcore is not implemented, and if you try it, you get an heap-use-after-free error: (gdb) gcore C:/gdb/build64/gdb-git-python3/gdb/testsuite/outputs/gdb.base/gcore-buffer-overflow/gcore-buffer-overflow.test warning: cannot close "================================================================= ==10108==ERROR: AddressSanitizer: heap-use-after-free on address 0x1259ea503110 at pc 0x7ff6806e3936 bp 0x0062e01ed990 sp 0x0062e01ed140 READ of size 111 at 0x1259ea503110 thread T0 #0 0x7ff6806e3935 in strlen C:/gcc/src/gcc-14.2.0/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc:391 #1 0x7ff6807169c4 in __pformat_puts C:/gcc/src/mingw-w64-v12.0.0/mingw-w64-crt/stdio/mingw_pformat.c:558 #2 0x7ff6807186c1 in __mingw_pformat C:/gcc/src/mingw-w64-v12.0.0/mingw-w64-crt/stdio/mingw_pformat.c:2514 #3 0x7ff680713614 in __mingw_vsnprintf C:/gcc/src/mingw-w64-v12.0.0/mingw-w64-crt/stdio/mingw_vsnprintf.c:41 #4 0x7ff67f34419f in vsnprintf(char*, unsigned long long, char const*, char*) C:/msys64/mingw64/x86_64-w64-mingw32/include/stdio.h:484 #5 0x7ff67f34419f in string_vprintf[abi:cxx11](char const*, char*) C:/gdb/src/gdb.git/gdbsupport/common-utils.cc:106 #6 0x7ff67b37b739 in cli_ui_out::do_message(ui_file_style const&, char const*, char*) C:/gdb/src/gdb.git/gdb/cli-out.c:227 #7 0x7ff67ce3d030 in ui_out::call_do_message(ui_file_style const&, char const*, ...) C:/gdb/src/gdb.git/gdb/ui-out.c:571 #8 0x7ff67ce4255a in ui_out::vmessage(ui_file_style const&, char const*, char*) C:/gdb/src/gdb.git/gdb/ui-out.c:740 #9 0x7ff67ce2c873 in ui_file::vprintf(char const*, char*) C:/gdb/src/gdb.git/gdb/ui-file.c:73 #10 0x7ff67ce7f83d in gdb_vprintf(ui_file*, char const*, char*) C:/gdb/src/gdb.git/gdb/utils.c:1881 #11 0x7ff67ce7f83d in vwarning(char const*, char*) C:/gdb/src/gdb.git/gdb/utils.c:181 #12 0x7ff67f3530eb in warning(char const*, ...) C:/gdb/src/gdb.git/gdbsupport/errors.cc:33 #13 0x7ff67baed27f in gdb_bfd_close_warning C:/gdb/src/gdb.git/gdb/gdb_bfd.c:437 #14 0x7ff67baed27f in gdb_bfd_close_or_warn C:/gdb/src/gdb.git/gdb/gdb_bfd.c:646 #15 0x7ff67baed27f in gdb_bfd_unref(bfd*) C:/gdb/src/gdb.git/gdb/gdb_bfd.c:739 #16 0x7ff68094b6f2 in gdb_bfd_ref_policy::decref(bfd*) C:/gdb/src/gdb.git/gdb/gdb_bfd.h:82 #17 0x7ff68094b6f2 in gdb::ref_ptr<bfd, gdb_bfd_ref_policy>::~ref_ptr() C:/gdb/src/gdb.git/gdbsupport/gdb_ref_ptr.h:91 #18 0x7ff67badf4d2 in gcore_command C:/gdb/src/gdb.git/gdb/gcore.c:176 0x1259ea503110 is located 16 bytes inside of 4064-byte region [0x1259ea503100,0x1259ea5040e0) freed by thread T0 here: #0 0x7ff6806b1687 in free C:/gcc/src/gcc-14.2.0/libsanitizer/asan/asan_malloc_win.cpp:90 #1 0x7ff67f2ae807 in objalloc_free C:/gdb/src/gdb.git/libiberty/objalloc.c:187 #2 0x7ff67d7f56e3 in _bfd_free_cached_info C:/gdb/src/gdb.git/bfd/opncls.c:247 #3 0x7ff67d7f2782 in _bfd_delete_bfd C:/gdb/src/gdb.git/bfd/opncls.c:180 #4 0x7ff67d7f5df9 in bfd_close_all_done C:/gdb/src/gdb.git/bfd/opncls.c:960 #5 0x7ff67d7f62ec in bfd_close C:/gdb/src/gdb.git/bfd/opncls.c:925 #6 0x7ff67baecd27 in gdb_bfd_close_or_warn C:/gdb/src/gdb.git/gdb/gdb_bfd.c:643 #7 0x7ff67baecd27 in gdb_bfd_unref(bfd*) C:/gdb/src/gdb.git/gdb/gdb_bfd.c:739 #8 0x7ff68094b6f2 in gdb_bfd_ref_policy::decref(bfd*) C:/gdb/src/gdb.git/gdb/gdb_bfd.h:82 #9 0x7ff68094b6f2 in gdb::ref_ptr<bfd, gdb_bfd_ref_policy>::~ref_ptr() C:/gdb/src/gdb.git/gdbsupport/gdb_ref_ptr.h:91 #10 0x7ff67badf4d2 in gcore_command C:/gdb/src/gdb.git/gdb/gcore.c:176 It happens because gdb_bfd_close_or_warn uses a bfd-internal name for the failing-close warning, after the close is finished, and the name already freed: static int gdb_bfd_close_or_warn (struct bfd *abfd) { int ret; const char *name = bfd_get_filename (abfd); for (asection *sect : gdb_bfd_sections (abfd)) free_one_bfd_section (sect); ret = bfd_close (abfd); if (!ret) gdb_bfd_close_warning (name, bfd_errmsg (bfd_get_error ())); return ret; } Fixed by making a copy of the name for the warning. Approved-By: Andrew Burgess <[email protected]>

After the commit: commit b9de07a Date: Thu Oct 10 11:37:34 2024 +0100 gdb: fix handling of DW_AT_entry_pc of inlined subroutines GDB's buildbot CI testing highlighted this assertion failure: (gdb) c Continuing. ../../binutils-gdb/gdb/block.h:203: internal-error: set_entry_pc: Assertion `start >= this->start () && start < this->end ()' failed. A problem internal to GDB has been detected, further debugging may prove unreliable. ----- Backtrace ----- FAIL: gdb.base/break-probes.exp: run til our library loads (GDB internal error) This assertion was in the new function set_entry_pc and is asserting that the default_entry_pc() value is within the blocks start/end range. The default_entry_pc() is the value GDB will use as the entry-pc if the DWARF doesn't specifically override the entry-pc. This value is calculated as: 1. The start address of the first sub-range within the block, if the block has more than 1 range, or 2. The low address (from DW_AT_low_pc) for the block. If the block only has a single range then this means the block was defined with low/high pc attributes (case #2 above). These low/high pc values are what block::start() and block::end() return. This means that by definition, if the block is continuous, the above assert cannot trigger as 'start', the default_entry_pc() would be equivalent to block::start(). This means that, for the assert to trigger, the block must have multiple ranges, and the first address of the first range is not within the blocks low/high address range. This seems wrong. I inspected the state at the time the assert triggered and discovered the block's start() address. Then I removed the assert and restarted GDB. I was now able to inspect the blocks at the offending address: (gdb) maintenance info blocks 0x7ffff7dddaa4 Blocks at 0x7ffff7dddaa4: from objfile: [(objfile *) 0x44a37f0] /lib64/ld-linux-x86-64.so.2 [(block *) 0x46b30c0] 0x7ffff7ddd5a0..0x7ffff7dde8a6 entry pc: 0x7ffff7ddd5a0 is global block symbol count: 4 is contiguous [(block *) 0x46b3020] 0x7ffff7ddd5a0..0x7ffff7dde8a6 entry pc: 0x7ffff7ddd5a0 is static block symbol count: 9 is contiguous [(block *) 0x46b2f70] 0x7ffff7ddda00..0x7ffff7dddac3 entry pc: 0x7ffff7ddda00 function: __GI__dl_find_dso_for_object symbol count: 4 is contiguous [(block *) 0x46b2e10] 0x7ffff7dddaa4..0x7ffff7dddac3 entry pc: 0x7ffff7dddaa4 inline function: __GI__dl_find_dso_for_object symbol count: 5 is contiguous [(block *) 0x46b2a40] 0x7ffff7dddaa4..0x7ffff7dddac3 entry pc: 0x7ffff7dddaa4 symbol count: 1 is contiguous [(block *) 0x46b2970] 0x7ffff7dddaa4..0x7ffff7dddac3 entry pc: 0x7ffff7dddaa4 symbol count: 2 address ranges: 0x7ffff7ddda0e..0x7ffff7ddda77 0x7ffff7ddda90..0x7ffff7ddda96 I've left everything in for context, but the only really interesting bit is the very last block, it's low/high range is: 0x7ffff7dddaa4..0x7ffff7dddac3 but it has separate ranges: 0x7ffff7ddda0e..0x7ffff7ddda77 0x7ffff7ddda90..0x7ffff7ddda96 which are all outside the low/high range. This is what triggers the assert. But why does that block exist at all? What I believe is happening is that we're running into a bug in older versions of GCC. The buildbot failure was with an 8.5 gcc, and Tom de Vries also reported seeing failures when using version 7 and 8 gcc, but not with gcc 9 and onward. Looking at the DWARF I can see that the problematic block is created from this DIE: <4><15efb>: Abbrev Number: 83 (DW_TAG_lexical_block) <15efc> DW_AT_abstract_origin: <0x15e9f> <15efe> DW_AT_low_pc : 0x7ffff7dddaa4 <15f06> DW_AT_high_pc : 31 which links via DW_AT_abstract_origin to: <2><15e9f>: Abbrev Number: 80 (DW_TAG_lexical_block) <15ea0> DW_AT_ranges : 0x38e0 <15ea4> DW_AT_sibling : <0x15eca> And so we can see that <15efb> has got both low/high pc attributes and a ranges attribute. If I widen my checking to parents of DIE <15efb> then I see that they also have DW_AT_abstract_origin, however, there is something interesting going on, the parent DIEs are linking to a different DIE tree than <15efb>. What I believe is happening is this, we have an abstract instance tree, this is rooted at a DW_AT_subprogram, and contains all the blocks, variables, parameters, etc, that you would expect. As this is an abstract instance, then there are no low/high pc attributes, and no ranges attributes in this tree. This makes sense. Now elsewhere we have a DW_TAG_subprogram (not DW_TAG_inlined_subroutine) which links via DW_AT_abstract_origin to the abstract DW_AT_subprogram. This case is documented in the DWARF 5 spec in section 3.3.8.3, and describes an Out-of-Line Instance of an Inlined Subroutine. Within this out of line instance many of the DIE correctly link back, using DW_AT_abstract_origin to the abstract instance tree. This tree also includes the DIE <15e9f>, which is where our problem DIE references. Now, to really confuse things, within this out-of-line instance we have a DW_TAG_inlined_subroutine, which is another instance of the same abstract instance tree! This would seem to indicate a recursive call to the inline function, and the compiler, for some reason, needed to instantiate an out of line instance of this function. And it is within this nested, inlined subroutine, that the problem DIE exists. The problem DIE is referencing the corresponding DIE within the out of line instance tree, but I am convinced this must be a (long fixed) GCC bug, and that the problem DIE should be referencing the DIE within the abstract instance tree. I'm aware that the above is pretty confusing. The actual DWARF would be a around 200 lines long, so I'd like to avoid dumping it in here. But here's my attempt at representing what's going on in a minimal example. The numbers down the side represent the section offset, not the nesting level, and I've removed any attributes that are not relevant: <1> DW_TAG_subprogram <2> DW_TAG_lexical_block <3> DW_TAG_subprogram DW_AT_abstract_origin <1> <4> DW_TAG_lexical_block DW_AT_ranges ... <5> DW_TAG_inlined_subroutine DW_AT_abstract_origin <1> <6> DW_TAG_lexical_block DW_AT_abstract_origin <4> DW_AT_low_pc ... DW_AT_high_pc ... The lexical block at <6> is linking to <4> when it should be linking to <2>. There is one additional thing that we might wonder about, which is, when calculating the low/high pc range for a block, why does GDB not make use of the range information and expand the range beyond the defined low/high values? The answer to this is in dwarf_get_pc_bounds_ranges_or_highlow_pc in dwarf/read.c. This is where the low/high bounds are calculated. What we see is that GDB first checks for a low/high attribute pair, and if that is present, this defines the address range for the block. Only if there is no DW_AT_low_pc do we check for the DW_AT_ranges, and use that to define the extent of the block. And this makes sense, section 3.5 of the DWARF-5 spec says: The lexical block entry may have either a DW_AT_low_pc and DW_AT_high_pc pair of attributes or a DW_AT_ranges attribute whose values encode the contiguous or non-contiguous address ranges, respectively, of the machine instructions generated for the lexical block... Section 3.5 is specifically about lexical blocks, but the same wording, about it being either low/high OR ranges is repeated for other DW_TAG_ types. So this explains why GDB doesn't use the ranges to expand the problem blocks ranges; as the first DIE has low/high addresses, these are used, and the ranges is not consulted. It is only later in dwarf2_record_block_ranges that we create a range based off the low/high pc, and then also process the ranges data, this allows the problem block to exist with ranges that are outside the low/high range. To solve this I considered a number of options: 1. Prevent loading certain attributes from an abstract instance. Section 3.3.8.1 of the DWARF-5 spec talks about which attributes are appropriate to place in an abstract instance. Any attribute that might vary between instances should not appear in an abstract instance. DW_AT_ranges is included as an example in the non-exhaustive list of attributes that should not appear in an abstract instance. Currently in dwarf2_attr (dwarf2/read.c), when we see a DW_AT_abstract_origin attribute, we always follow this to try and find the attribute we are looking for. But we could change this function so that we prevent this following for attributes that we know should not be looked up in an abstract instance. This would solve the problem in this case by preventing us finding the DW_AT_ranges in the incorrect abstract instance. 2. Filter the ranges. Having established a blocks low/high address range in dwarf_get_pc_bounds_ranges_or_highlow_pc, we could allow dwarf2_record_block_ranges to parse the ranges, but we could reject any range that extends outside the blocks defined start and end addresses. For well behaved DWARF where we have either low/high or ranges, then the blocks start/end are defined from the range data, and so, by definition, every range would be acceptable. But in our problem case we would reject all of the invalid ranges. This is my least favourite solution as it feels like rejecting the ranges is tackling the problem too late on. 3. Don't try to parse ranges when we have low/high attributes. This option involves updating dwarf2_record_block_ranges to match the behaviour of dwarf_get_pc_bounds_ranges_or_highlow_pc, and, I believe, to match the DWARF spec: don't try to read range data from DW_AT_ranges if we have low/high pc attributes. In our case this solves the issue because the problematic DIE has the low/high attributes, and it then links to the wrong DIE which happens to have DW_AT_ranges. With this change in place we don't even look for the DW_AT_ranges. If the problem were reversed, and the initial DIE had DW_AT_ranges, but the incorrectly referenced DIE had the low/high pc attributes, we would pick up the wrong addresses, but this wouldn't trigger any asserts. The reason is that dwarf_get_pc_bounds_ranges_or_highlow_pc would also find the low/high addresses from the incorrectly referenced DIE, and so we would just end up with a block which had the wrong address ranges, but the block would be self consistent, which is different to the problem we hit here. In the end, in this commit I went with solution #3, having dwarf_get_pc_bounds_ranges_or_highlow_pc and dwarf2_record_block_ranges be consistent seems sensible. However, I do wonder if in the future we might want to explore solution #1 as an additional safety feature. With this patch in place I'm able to run the gdb.base/break-probes.exp without seeing the assert that CI testing highlighted. I see no regressions when testing on x86-64 GNU/Linux with gcc 9.3.1. Note: the diff in this commit looks big, but it's really just me indenting the code. Approved-By: Tom Tromey <[email protected]>

When building gdb with -fsanitize=thread and running test-case gdb.base/bg-exec-sigint-bp-cond.exp, I run into: ... ==================^M WARNING: ThreadSanitizer: signal handler spoils errno (pid=25422)^M #0 handler_wrapper gdb/posix-hdep.c:66^M #1 decltype ({parm#2}({parm#3}...)) gdb::handle_eintr<>() \ gdbsupport/eintr.h:67^M #2 gdb::waitpid(int, int*, int) gdbsupport/eintr.h:78^M #3 run_under_shell gdb/cli/cli-cmds.c:926^M ... Likewise in: - tui_sigwinch_handler with test-case gdb.python/tui-window.exp, and - handle_sighup with test-case gdb.base/quit-live.exp. Fix this by saving the original errno, and restoring it before returning [1]. Tested on x86_64-linux. Approved-By: Tom Tromey <[email protected]> [1] https://www.gnu.org/software/libc/manual/html_node/POSIX-Safety-Concepts.html

tromey added the rust label Apr 14, 2016

tromey added this to the upstream milestone Apr 14, 2016

tromey closed this as completed Apr 26, 2016

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

write more tests #1

write more tests #1

tromey commented Apr 14, 2016 •

edited

Loading

tromey commented Apr 26, 2016

write more tests #1

write more tests #1

Comments

tromey commented Apr 14, 2016 • edited Loading

tromey commented Apr 26, 2016

tromey commented Apr 14, 2016 •

edited

Loading