Small ReleaseSafe binaries with backtraces #18520

Open
matklad opened this issue Jan 11, 2024 · 19 comments

@matklad
Contributor

matklad commented Jan 11, 2024

Problem: for TigerBeetle, we want our release binary to be relatively small and self-contained, but we also want to see at least a minimal backtrace on assert failures, overflow errors, out-of-bounds indexing, and such.

As far as I am aware:

  • No combination of Zig build flags gets us the desired result: either a binary is huge, or the stacktrace is absent
  • This should be theoretically possible, as Rust does that

Specifically (non-expert, might be talking nonsense here), I think including frame pointers and debug symbols (but not the entirety of the debug info) should give both a small executable and a reasonable stack trace.

Here's a reproducible log of my experiments:

https://github.com/matklad/repros/tree/7d0c1a081d969ac040625445059dcc6574ad2bdf/zig-stack-trace

This is what I tried:

  1. Baseline, just ReleaseSafe: huge binary because of debuginfo, good backtrace
  2. module.omit_frame_pointer = false; module.strip = false;: small file, but no backtrace at all
  3. addObjCopy step with .strip = .debug: small file, but bad backtrace:
thread 33065 panic: integer overflow
Unwind information for `exe:0x10318a0` was not available, trace may be incomplete

???:?:?: 0x100c274 in ??? (exe)
???:?:?: 0x100bbc8 in ??? (exe)
???:?:?: 0x100bbb8 in ??? (exe)
???:?:?: 0x100bb93 in ??? (exe)
???:?:?: 0x100ba41 in ??? (exe)

In comparison, here's what Rust gives with strip=debuginfo:

thread 'main' panicked at ./main.rs:3:5:
attempt to multiply with overflow
stack backtrace:
   0: rust_begin_unwind
   1: core::panicking::panic_fmt
   2: core::panicking::panic
   3: main::foo
   4: main::bar
   5: main::main

I think 3) is what I want big-picture wise, but a few details prevent it from being actually useful:

  • having names, and not just addresses, would be great.
  • it would be great to have line number information for the root failure. I think we can get that for asserts by providing our own assert using @src(), but it's also important for things like arithmetic overflows
  • The build.zig incantation with a separate ObjCopy invocation to get the desired behavior is gnarly. I'd love to just write module.strip = .strip_debuginfo_keep_symbols. See the repo linked above for build.zig.
@xdBronch
Contributor

vaguely related #18387
possible solution could close both issues?

@nektro
Contributor

nektro commented Jan 11, 2024

perhaps https://sourceware.org/gdb/current/onlinedocs/gdb.html/Separate-Debug-Files.html could also be used?

@matklad
Contributor Author

matklad commented Jan 11, 2024

For TigerBeetle it’s important to ship only a single binary. I think what could work is printing only addresses “in the field”, and then correlating them back to symbols offline, but this is a much worse developer experience than just having backtraces.
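
A minimal sketch of what the offline correlation step could look like, assuming a symbol table has already been extracted from the unstripped build (for example with nm). The Symbol struct, the symbolize function, and the addresses and sizes in example_symbols are hypothetical and are not taken from the repro.

```zig
const std = @import("std");

/// Hypothetical record for one function symbol from the unstripped binary.
const Symbol = struct {
    addr: u64, // start address of the function
    size: u64, // length of the function in bytes
    name: []const u8,
};

/// Returns the name of the symbol whose address range contains `addr`,
/// or null when no symbol covers it. Linear scan, fine for a small table.
fn symbolize(symbols: []const Symbol, addr: u64) ?[]const u8 {
    for (symbols) |sym| {
        if (addr >= sym.addr and addr - sym.addr < sym.size) return sym.name;
    }
    return null;
}

// Illustrative values only, not taken from a real binary.
const example_symbols = [_]Symbol{
    .{ .addr = 0x1000, .size = 0x40, .name = "main.foo" },
    .{ .addr = 0x1040, .size = 0x20, .name = "main.bar" },
};
```

With this table, symbolize(&example_symbols, 0x1010) resolves to "main.foo", and an address outside every range yields null. Addresses from a field report would be looked up the same way.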

@frmdstryr

This comment was marked as resolved.

@matklad

This comment was marked as resolved.

@matklad
Contributor Author

matklad commented Jan 12, 2024

> it would be great to have line number information for the root failure. I think we can get that for asserts by providing our own assert using @src(), but it's also important for things like arithmetic overflows

I would maybe highlight this part a bit more: in terms of user-visible features, that's just a part of the overall "crash reports are useful" package.

Implementation wise, this is mostly unrelated to debug info stripping and requires passing @src() to panic messages, an orthogonal feature.
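
As a rough illustration of the user-side workaround mentioned above (not the compiler-level feature being requested), a custom assert can take @src() from the caller so that the location is a compile-time constant and survives stripping. The names assertAt and mulAt are hypothetical helpers, not part of std.

```zig
const std = @import("std");

/// Hypothetical helper: an assert that reports its call site.
/// The caller passes `@src()`, so no debug info is needed at runtime.
pub fn assertAt(ok: bool, src: std.builtin.SourceLocation) void {
    if (!ok) {
        std.debug.panic("assertion failed at {s}:{d}:{d} in {s}", .{
            src.file, src.line, src.column, src.fn_name,
        });
    }
}

/// Hypothetical helper: checked multiplication that panics with the
/// caller's location instead of a bare "integer overflow".
pub fn mulAt(comptime T: type, a: T, b: T, src: std.builtin.SourceLocation) T {
    return std.math.mul(T, a, b) catch std.debug.panic(
        "integer overflow at {s}:{d}:{d}",
        .{ src.file, src.line, src.column },
    );
}
```

A call site would look like assertAt(x > 0, @src()) or mulAt(u32, x, x, @src()). The obvious downside is that every arithmetic operation has to be rewritten by hand, which is why built-in support would be preferable.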

@frmdstryr
Contributor

frmdstryr commented Jan 12, 2024

Wow adding .strip=false makes it much better... thanks for the tip!

@frmdstryr
Contributor

frmdstryr commented Jan 12, 2024

I see now there are also unwind_tables and error_tracing options, but no docs/comments on what either does. Maybe they toggle with optimize too?

@xdBronch
Contributor

error_tracing is off by default on all modes but debug as of #18160 so not really relevant in this case

@leroycep
Contributor

leroycep commented Nov 17, 2024

Pre-emptive edit: Comparing the sizes between Zig's strip option and other methods of stripping an executable, Zig's can be a lot more aggressive. I would need to learn more about how Zig strips binaries before I could know if any debug info format would make a difference.

I've done a bit of research on this, and one possibility is using GNU's MiniDebugInfo format (at least on systems that use ELF, I'm not sure about other systems). It adds a .gnu_debugdata section that contains an LZMA compressed symbol table.

I went ahead and patched the example from @matklad to add an -Dstrip=mini_debuginfo option. Since Zig doesn't support MiniDebugInfo yet, I run the program from gdb to compare the backtraces.

Here are the sessions for each of the -Dstrip options:

`zig-0.13 build -Dstrip=none`
~/src/3_resources/matklad-repros/zig-stack-trace〉zig-0.13 build -Dstrip=none; print -e (ls ./zig-out/bin/hoyten); gdb -ex "set confirm off" -ex run -ex bt -ex quit ./zig-out/bin/hoyten
╭───┬────────────────────┬──────┬────────┬────────────────╮
│ # │        name        │ type │  size  │    modified    │
├───┼────────────────────┼──────┼────────┼────────────────┤
│ 0 │ zig-out/bin/hoyten │ file │ 1.7 MB │ 33 minutes ago │
╰───┴────────────────────┴──────┴────────┴────────────────╯

GNU gdb (GDB) 14.2
Copyright (C) 2023 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-unknown-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<https://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
    <http://www.gnu.org/software/gdb/documentation/>.

For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from ./zig-out/bin/hoyten...
Starting program: /home/geemili/src/3_resources/matklad-repros/zig-stack-trace/zig-out/bin/hoyten 
thread 120293 panic: integer overflow                                                                                                       
/home/geemili/src/3_resources/matklad-repros/zig-stack-trace/main.zig:4:14: 0x100a717 in foo (hoyten)
    return x * x;
             ^
/home/geemili/src/3_resources/matklad-repros/zig-stack-trace/main.zig:9:15: 0x100a058 in bar (hoyten)
    return foo(std.math.maxInt(u32));
              ^
/home/geemili/src/3_resources/matklad-repros/zig-stack-trace/main.zig:13:33: 0x100a048 in main (hoyten)
    std.debug.print("{}", .{ bar()});
                                ^
/home/geemili/.local/share/zigup/0.13.0/files/lib/std/start.zig:524:37: 0x100a029 in posixCallMainAndExit (hoyten)
            const result = root.main() catch |err| {
                                    ^
/home/geemili/.local/share/zigup/0.13.0/files/lib/std/start.zig:266:5: 0x1009ed1 in _start (hoyten)
    asm volatile (switch (native_arch) {
    ^
???:?:?: 0x0 in ??? (???)

Program received signal SIGABRT, Aborted.
posix.sigprocmask (flags=2, set=0x7fffffffb600, oldset=0x0) at /home/geemili/.local/share/zigup/0.13.0/files/lib/std/posix.zig:5648
5648	   switch (errno(system.sigprocmask(@bitCast(flags), set, oldset))) {
#0  posix.sigprocmask (flags=2, set=0x7fffffffb600, oldset=0x0) at /home/geemili/.local/share/zigup/0.13.0/files/lib/std/posix.zig:5648
#1  posix.raise (sig=6 '\006') at /home/geemili/.local/share/zigup/0.13.0/files/lib/std/posix.zig:712
#2  0x000000000102d682 in posix.abort () at /home/geemili/.local/share/zigup/0.13.0/files/lib/std/posix.zig:654
#3  0x000000000102d0e0 in debug.panicImpl (trace=0x0, first_trace_addr=...)
    at /home/geemili/.local/share/zigup/0.13.0/files/lib/std/io/Writer.zig:19
#4  0x000000000102ca5c in builtin.default_panic (error_return_trace=0x0, ret_addr=...)
    at /home/geemili/.local/share/zigup/0.13.0/files/lib/std/builtin.zig:857
#5  0x000000000100a718 in main.foo (x=4294967295) at main.zig:4
#6  0x000000000100a059 in main.bar () at main.zig:9
#7  0x000000000100a049 in main.main () at main.zig:13
`zig-0.13 build -Dstrip=strip`
~/src/3_resources/matklad-repros/zig-stack-trace〉zig-0.13 build -Dstrip=strip; print -e (ls ./zig-out/bin/hoyten); gdb -ex "set confirm off" -ex run -ex bt -ex quit ./zig-out/bin/hoyten
╭───┬────────────────────┬──────┬─────────┬──────────╮
│ # │        name        │ type │  size   │ modified │
├───┼────────────────────┼──────┼─────────┼──────────┤
│ 0 │ zig-out/bin/hoyten │ file │ 21.1 KB │ now      │
╰───┴────────────────────┴──────┴─────────┴──────────╯

GNU gdb (GDB) 14.2

For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from ./zig-out/bin/hoyten...
(No debugging symbols found in ./zig-out/bin/hoyten)
Starting program: /home/geemili/src/3_resources/matklad-repros/zig-stack-trace/zig-out/bin/hoyten 
Downloading separate debug info for system-supplied DSO at 0x7ffff7ffd000
thread 121522 panic: integer overflow                                                                                                       
Unable to dump stack trace: debug info stripped

Program received signal SIGABRT, Aborted.
0x0000000001004a7b in ?? ()
#0  0x0000000001004a7b in ?? ()
#1  0x00000000010048ad in ?? ()
#2  0x00000000010044a9 in ?? ()
#3  0x0000000001002a33 in ?? ()
#4  0x00000000010024d9 in ?? ()
#5  0x00000000010024c9 in ?? ()
#6  0x00000000010024b0 in ?? ()
#7  0x0000000001002362 in ?? ()
#8  0x0000000000000001 in ?? ()
#9  0x00007fffffffc4bb in ?? ()
#10 0x0000000000000000 in ?? ()
`zig-0.13 build -Dstrip=objcopy`
~/src/3_resources/matklad-repros/zig-stack-trace〉zig-0.13 build -Dstrip=objcopy; print -e (ls ./zig-out/bin/hoyten); gdb -ex "set confirm off" -ex run -ex bt -ex quit ./zig-out/bin/hoyten
╭───┬────────────────────┬──────┬──────────┬──────────╮
│ # │        name        │ type │   size   │ modified │
├───┼────────────────────┼──────┼──────────┼──────────┤
│ 0 │ zig-out/bin/hoyten │ file │ 205.1 KB │ now      │
╰───┴────────────────────┴──────┴──────────┴──────────╯

GNU gdb (GDB) 14.2

For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from ./zig-out/bin/hoyten...
(No debugging symbols found in ./zig-out/bin/hoyten)
Starting program: /home/geemili/src/3_resources/matklad-repros/zig-stack-trace/zig-out/bin/hoyten 
thread 121670 panic: integer overflow                                                                                                       
Unwind information for `exe:0x102d286` was not available, trace may be incomplete

???:?:?: 0x100a717 in ??? (exe)
???:?:?: 0x100a058 in ??? (exe)
???:?:?: 0x100a048 in ??? (exe)
???:?:?: 0x100a029 in ??? (exe)
???:?:?: 0x1009ed1 in ??? (exe)

Program received signal SIGABRT, Aborted.
0x000000000100a890 in posix[raise] ()
#0  0x000000000100a890 in posix[raise] ()
#1  0x000000000102d682 in posix[abort] ()
#2  0x000000000102d0e0 in debug[panicImpl] ()
#3  0x000000000102ca5c in builtin.default_panic ()
#4  0x000000000100a718 in main[foo] ()
#5  0x000000000100a059 in main[bar] ()
#6  0x000000000100a049 in main.main ()
`zig-0.13 build -Dstrip=llvm_objcopy`
~/src/3_resources/matklad-repros/zig-stack-trace〉zig-0.13 build -Dstrip=llvm_objcopy; print -e (ls ./zig-out/bin/hoyten); gdb -ex "set confirm off" -ex run -ex bt -ex quit ./zig-out/bin/hoyten
╭───┬────────────────────┬──────┬──────────┬──────────╮
│ # │        name        │ type │   size   │ modified │
├───┼────────────────────┼──────┼──────────┼──────────┤
│ 0 │ zig-out/bin/hoyten │ file │ 207.4 KB │ now      │
╰───┴────────────────────┴──────┴──────────┴──────────╯

GNU gdb (GDB) 14.2

For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from ./zig-out/bin/hoyten...
(No debugging symbols found in ./zig-out/bin/hoyten)
Starting program: /home/geemili/src/3_resources/matklad-repros/zig-stack-trace/zig-out/bin/hoyten 
thread 121797 panic: integer overflow                                                                                                       
Unwind information for `exe:0x102d286` was not available, trace may be incomplete

???:?:?: 0x100a717 in ??? (exe)
???:?:?: 0x100a058 in ??? (exe)
???:?:?: 0x100a048 in ??? (exe)
???:?:?: 0x100a029 in ??? (exe)
???:?:?: 0x1009ed1 in ??? (exe)

Program received signal SIGABRT, Aborted.
0x000000000100a890 in posix[raise] ()
#0  0x000000000100a890 in posix[raise] ()
#1  0x000000000102d682 in posix[abort] ()
#2  0x000000000102d0e0 in debug[panicImpl] ()
#3  0x000000000102ca5c in builtin.default_panic ()
#4  0x000000000100a718 in main[foo] ()
#5  0x000000000100a059 in main[bar] ()
#6  0x000000000100a049 in main.main ()
`zig-0.13 build -Dstrip=mini_debuginfo`
~/src/3_resources/matklad-repros/zig-stack-trace〉zig-0.13 build -Dstrip=mini_debuginfo; print -e (ls ./zig-out/bin/hoyten); gdb -ex "set confirm off" -ex run -ex bt -ex quit ./zig-out/bin/hoyten
╭───┬────────────────────┬──────┬──────────┬─────────────╮
│ # │        name        │ type │   size   │  modified   │
├───┼────────────────────┼──────┼──────────┼─────────────┤
│ 0 │ zig-out/bin/hoyten │ file │ 189.9 KB │ an hour ago │
╰───┴────────────────────┴──────┴──────────┴─────────────╯

GNU gdb (GDB) 14.2

For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from ./zig-out/bin/hoyten...
Reading symbols from .gnu_debugdata for /home/geemili/src/3_resources/matklad-repros/zig-stack-trace/zig-out/bin/hoyten...
(No debugging symbols found in .gnu_debugdata for /home/geemili/src/3_resources/matklad-repros/zig-stack-trace/zig-out/bin/hoyten)
Starting program: /home/geemili/src/3_resources/matklad-repros/zig-stack-trace/zig-out/bin/hoyten 
thread 121889 panic: integer overflow                                                                                                       
Unwind information for `exe:0x102d286` was not available, trace may be incomplete

???:?:?: 0x100a717 in ??? (exe)
???:?:?: 0x100a058 in ??? (exe)
???:?:?: 0x100a048 in ??? (exe)
???:?:?: 0x100a029 in ??? (exe)
???:?:?: 0x1009ed1 in ??? (exe)

Program received signal SIGABRT, Aborted.
0x000000000100a890 in posix[raise] ()
#0  0x000000000100a890 in posix[raise] ()
#1  0x000000000102d682 in posix[abort] ()
#2  0x000000000102d0e0 in debug[panicImpl] ()
#3  0x000000000102ca5c in builtin.default_panic ()
#4  0x000000000100a718 in main[foo] ()
#5  0x000000000100a059 in main[bar] ()
#6  0x000000000100a049 in main.main ()

Comparison of the executable sizes:

| strip | exe size | ratio |
|---|---|---|
| none | 1.7 MB | 1.000 |
| strip | 21.1 KB | 0.013 |
| objcopy | 205.1 KB | 0.122 |
| llvm_objcopy | 207.4 KB | 0.124 |
| mini_debuginfo | 189.9 KB | 0.113 |

Comparison of executable sizes using `bloaty`:

| section | none | strip | objcopy | llvm_objcopy | mini_debuginfo |
|---|---|---|---|---|---|
| .debug_loc | 693.2 KB | 0 B | 0 B | 0 B | 0 B |
| .debug_info | 338.0 KB | 0 B | 0 B | 0 B | 0 B |
| .debug_line | 169.3 KB | 0 B | 0 B | 0 B | 0 B |
| .text | 149.4 KB | 11.5 KB | 149.4 KB | 149.4 KB | 149.4 KB |
| .debug_str | 103.2 KB | 0 B | 0 B | 0 B | 0 B |
| .debug_ranges | 102.9 KB | 0 B | 0 B | 0 B | 0 B |
| .debug_pubnames | 38.2 KB | 0 B | 0 B | 0 B | 0 B |
| .rodata | 26.7 KB | 2.9 KB | 26.7 KB | 26.7 KB | 26.7 KB |
| .debug_pubtypes | 21.9 KB | 0 B | 0 B | 0 B | 0 B |
| .bss | 0 B | 0 B | 0 B | 0 B | 0 B |
| .strtab | 10.9 KB | 0 B | 10.9 KB | 10.8 KB | 0 B |
| .eh_frame | 7.8 KB | 1.2 KB | 7.8 KB | 7.8 KB | 7.8 KB |
| .symtab | 7.2 KB | 0 B | 7.2 KB | 7.2 KB | 0 B |
| .relro_padding | 0 B | 0 B | 0 B | 0 B | 0 B |
| [Unmapped] | 2.5 KB | 4.0 KB | 0 B | 2.5 KB | 0 B |
| .debug_abbrev | 2.5 KB | 0 B | 0 B | 0 B | 0 B |
| .eh_frame_hdr | 1.4 KB | 292 B | 1.4 KB | 1.4 KB | 1.4 KB |
| [ELF Section Headers] | 1.3 KB | 576 B | 832 B | 832 B | 704 B |
| [ELF Program Headers] | 504 B | 392 B | 504 B | 504 B | 504 B |
| .shstrtab | 211 B | 69 B | 211 B | 105 B | 96 B |
| [5 Others] | 104 B | 0 B | 0 B | 0 B | 0 B |
| [ELF Header] | 0 B | 64 B | 64 B | 64 B | 64 B |
| .comment | 0 B | 19 B | 24 B | 24 B | 0 B |
| .tbss | 0 B | 0 B | 0 B | 0 B | 0 B |
| .got | 0 B | 0 B | 8 B | 8 B | 8 B |
| [LOAD #1 [R]] | 0 B | 0 B | 8 B | 8 B | 8 B |
| .gnu_debugdata | 0 B | 0 B | 0 B | 0 B | 3.1 KB |

It looks like setting exe.root_module.strip is cutting out a lot more from the binary than just running strip on the finished executable. I would need to familiarize myself with that to say more.

Edit: forgot to include the changes I added to the repro

[PATCH] Add mini_debuginfo option to stacktrace repro
From 034fc41addf407524bdfbe90a6e872875d68936a Mon Sep 17 00:00:00 2001
From: geemili <[email protected]>
Date: Sun, 17 Nov 2024 13:29:50 -0700
Subject: [PATCH] Add mini_debuginfo option to stacktrace repro

---
 zig-stack-trace/build.zig         | 11 ++++++++++-
 zig-stack-trace/mini_debuginfo.nu | 31 +++++++++++++++++++++++++++++++
 2 files changed, 41 insertions(+), 1 deletion(-)
 create mode 100644 zig-stack-trace/mini_debuginfo.nu

diff --git a/zig-stack-trace/build.zig b/zig-stack-trace/build.zig
index a18c2be..10c26c8 100644
--- a/zig-stack-trace/build.zig
+++ b/zig-stack-trace/build.zig
@@ -9,7 +9,7 @@ pub fn build(b: *std.Build) !void {
     });
     exe.root_module.omit_frame_pointer = false;
 
-    switch (b.option(enum { strip, objcopy, llvm_objcopy, none }, "strip", "") orelse .none) {
+    switch (b.option(enum { strip, objcopy, llvm_objcopy, mini_debuginfo, none }, "strip", "") orelse .none) {
         .none => b.installArtifact(exe),
         .strip => {
             exe.root_module.strip = true;
@@ -33,5 +33,14 @@ pub fn build(b: *std.Build) !void {
                 exe.out_filename,
             ).step);
         },
+        .mini_debuginfo => {
+            const mini_debuginfo_nu = b.addSystemCommand(&.{"nu"});
+            mini_debuginfo_nu.addFileArg(b.path("./mini_debuginfo.nu"));
+            mini_debuginfo_nu.addArtifactArg(exe);
+            b.getInstallStep().dependOn(&b.addInstallBinFile(
+                mini_debuginfo_nu.addOutputFileArg(exe.out_filename),
+                exe.out_filename,
+            ).step);
+        },
     }
 }
diff --git a/zig-stack-trace/mini_debuginfo.nu b/zig-stack-trace/mini_debuginfo.nu
new file mode 100644
index 0000000..7769904
--- /dev/null
+++ b/zig-stack-trace/mini_debuginfo.nu
@@ -0,0 +1,31 @@
+#!/usr/bin/env nu
+
+def main [binary: string, output: string] {
+  let keep_symbols = (nm $binary --format=posix --defined-only
+    # turn into nicely formatted table
+    | lines | split column ' ' | rename symbol type address size | sort-by symbol
+    # only keep function symbols
+    | where type == 'T' or type == 't')
+
+  let keep_symbols_txt = $"($binary | path dirname)/keep_symbols.txt"
+  rm -f $keep_symbols_txt
+  
+  $keep_symbols | get symbol | to text | save $keep_symbols_txt
+
+  let debuginfo = $"($binary).debuginfo"
+  let mini_debuginfo = $"($binary).mini_debuginfo"
+
+  # get just the debug info
+  objcopy --only-keep-debug $binary $debuginfo
+
+  # create reduced version of the debug info
+  objcopy -S --remove-section .gdb_index --remove-section .comment $"--keep-symbols=($keep_symbols_txt)" $debuginfo $mini_debuginfo
+
+  let stripped = $"($binary).stripped"
+  strip --strip-all -R .comment $binary -o $stripped
+
+  xz --keep $mini_debuginfo
+
+  # combine the stripped binary with the reduced, compressed debuginfo
+  objcopy --add-section $".gnu_debugdata=($mini_debuginfo).xz" $stripped $output
+}
-- 
2.44.1

@xdBronch
Contributor

the reason setting zigs strip option is so much smaller than stripping it afterwards is it causes parts of the default panic handler to not be compiled in

dumpCurrentStackTrace(first_trace_addr orelse @returnAddress());

zig/lib/std/debug.zig

Lines 190 to 193 in 3a6a8b8

if (builtin.strip_debug_info) {
    stderr.print("Unable to dump stack trace: debug info stripped\n", .{}) catch return;
    return;
}

@leroycep
Contributor

Ah, okay. I guess that means that the comparison would be better done on a larger project, instead of on a tiny program only meant to demonstrate if the printing works or not.

@leroycep
Contributor

Experimenting on a slightly larger project, I get 923.9 KB using exe.root_module.strip = true, vs 1.1 MB for a binary containing MiniDebugInfo, vs 5.5 MB for a binary containing the full debug info.

@alexrp
Member

alexrp commented Nov 20, 2024

I think std.debug should be taught to read symbols from the binary's dynamic symbol table. Then users can build with -O <Release*> -fno-omit-frame-pointer -rdynamic (and -f(async-)unwind-tables after my upcoming PR) and things should Just Work.

We might consider adding a convenience flag such as -fstack-traces that enables all the right options to just get working stack traces regardless of build mode / stripping options.

@leroycep
Contributor

leroycep commented Nov 21, 2024

Alright, I've gotten simple backtraces working:

~/Downloads/matklad-repros/zig-stack-trace> ~/code/zig/build/stage3/bin/zig build -Dstrip=objcopy --zig-lib-dir ~/code/zig/lib/ --verbose; print -e (ls ./zig-out/bin/hoyten); gdb -ex "set confirm off" -ex run -ex bt -ex quit ./zig-out/bin/hoyten
/home/geemili/code/zig/build/stage3/bin/zig build-exe -fno-omit-frame-pointer -OReleaseSafe -Mroot=/home/geemili/Downloads/matklad-repros/zig-stack-trace/main.zig --cache-dir /home/geemili/Downloads/matklad-repros/zig-stack-trace/.zig-cache --global-cache-dir /home/geemili/.cache/zig --name hoyten --zig-lib-dir /home/geemili/code/zig/lib// --listen=- 
╭───┬────────────────────┬──────┬──────────┬──────────────╮
│ # │        name        │ type │   size   │   modified   │
├───┼────────────────────┼──────┼──────────┼──────────────┤
│ 0 │ zig-out/bin/hoyten │ file │ 92.4 KiB │ a minute ago │
╰───┴────────────────────┴──────┴──────────┴──────────────╯
GNU gdb (GDB) 15.2

For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from ./zig-out/bin/hoyten...
(No debugging symbols found in ./zig-out/bin/hoyten)
Starting program: /home/geemili/Downloads/matklad-repros/zig-stack-trace/zig-out/bin/hoyten 

This GDB supports auto-downloading debuginfo from the following URLs:
  <https://debuginfod.archlinux.org>
Debuginfod has been disabled.
To make this setting permanent, add 'set debuginfod enabled off' to .gdbinit.
thread 97359 panic: integer overflow
Unwind information for `exe:0x100a843` was not available, trace may be incomplete

???:?:?: 0x10059c7 in main.foo (???)
???:?:?: 0x10057d8 in main.bar (???)
???:?:?: 0x10057c8 in main.main (???)
???:?:?: 0x1005702 in start.posixCallMainAndExit (???)
???:?:?: 0x10053dd in _start (???)

Program received signal SIGABRT, Aborted.
0x0000000001005d20 in posix[raise] ()
#0  0x0000000001005d20 in posix[raise] ()
#1  0x000000000101493e in posix[abort] ()
#2  0x000000000101413b in debug[defaultPanic] ()
#3  0x00000000010059c8 in main[foo] ()
#4  0x00000000010057d9 in main[bar] ()
#5  0x00000000010057c9 in main.main ()

However, it currently swaps out the DWARF debug info backend entirely, so symbol names are the best you'll get in any mode. It also doesn't support the dynamic symbol table yet; it only reads from the regular symbol table.

That code is living on the elf-symbol-debuginfo branch in my Zig repo. I don't think this properly solves the problem at the moment.

I think we want std.debug.SelfInfo to try to use DWARF debug info, and fall back to reading ELF symbols. We can probably handle that by introducing something like ElfDebugInfo = union(enum) { dwarf: Dwarf, symbol_table: Elf }.
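
A rough sketch of how such a union could express the fallback, under the assumption that DWARF and symbol-table loading each either succeed or produce nothing. Dwarf and Elf here are empty placeholder structs, and choose and hasLineInfo are hypothetical names for illustration; none of this is the actual std.debug code.

```zig
const std = @import("std");

// Placeholder backends. A real implementation would hold the parsed
// .debug_* sections and the .symtab/.strtab contents respectively.
const Dwarf = struct {};
const Elf = struct {};

const ElfDebugInfo = union(enum) {
    dwarf: Dwarf,
    symbol_table: Elf,

    /// Prefer DWARF when it was found; otherwise fall back to the symbol
    /// table. Returns null when the binary carries neither.
    fn choose(maybe_dwarf: ?Dwarf, maybe_symtab: ?Elf) ?ElfDebugInfo {
        if (maybe_dwarf) |d| return ElfDebugInfo{ .dwarf = d };
        if (maybe_symtab) |s| return ElfDebugInfo{ .symbol_table = s };
        return null;
    }

    /// Only the DWARF backend can resolve file, line, and column;
    /// the symbol table can provide function names at best.
    fn hasLineInfo(self: ElfDebugInfo) bool {
        return switch (self) {
            .dwarf => true,
            .symbol_table => false,
        };
    }
};
```

The tagged union keeps the choice explicit at every use site: code that prints a frame switches on the active backend and decides whether it can show a source line or only a name.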

@leroycep
Contributor

Implemented my previous idea of using a union(enum). Now an executable will attempt to load Dwarf debug info, and fall back to using the symbol table.

With Dwarf Info

~/code/zig/build/stage3/bin/zig build -Dstrip=none --verbose;
print -e (ls ./zig-out/bin/hoyten);
try { ./zig-out/bin/hoyten };
/home/geemili/code/zig/build/stage3/bin/zig build-exe -fno-omit-frame-pointer -OReleaseSafe -Mroot=/home/geemili/Downloads/matklad-repros/zig-stack-trace/main.zig --cache-dir /home/geemili/Downloads/matklad-repros/zig-stack-trace/.zig-cache --global-cache-dir /home/geemili/.cache/zig --name hoyten --zig-lib-dir /home/geemili/code/zig/lib/ --listen=- 
╭───┬────────────────────┬──────┬─────────┬───────────────╮
│ # │        name        │ type │  size   │   modified    │
├───┼────────────────────┼──────┼─────────┼───────────────┤
│ 0 │ zig-out/bin/hoyten │ file │ 1.9 MiB │ 4 minutes ago │
╰───┴────────────────────┴──────┴─────────┴───────────────╯
thread 598365 panic: integer overflow
Unwind error at address `exe:0x101a803` (error.MissingFDE), trace may be incomplete

/home/geemili/Downloads/matklad-repros/zig-stack-trace/main.zig:4:14: 0x100be87 in foo (hoyten)
    return x * x;
             ^
/home/geemili/Downloads/matklad-repros/zig-stack-trace/main.zig:9:15: 0x100bc98 in bar (hoyten)
    return foo(std.math.maxInt(u32));
              ^
/home/geemili/Downloads/matklad-repros/zig-stack-trace/main.zig:13:33: 0x100bc88 in main (hoyten)
    std.debug.print("{}", .{ bar()});
                                ^
/home/geemili/code/zig/lib/std/start.zig:617:37: 0x100bbc2 in posixCallMainAndExit (hoyten)
            const result = root.main() catch |err| {
                                    ^
/home/geemili/code/zig/lib/std/start.zig:248:5: 0x100b89d in _start (hoyten)
    asm volatile (switch (native_arch) {
    ^

Dwarf Info Stripped

objcopy --strip-debug ./zig-out/bin/hoyten ./zig-out/bin/hoyten-debug-stripped;
print -e (ls ./zig-out/bin/hoyten-debug-stripped);
try { ./zig-out/bin/hoyten-debug-stripped };
╭───┬───────────────────────────────────┬──────┬───────────┬──────────╮
│ # │               name                │ type │   size    │ modified │
├───┼───────────────────────────────────┼──────┼───────────┼──────────┤
│ 0 │ zig-out/bin/hoyten-debug-stripped │ file │ 244.0 KiB │ now      │
╰───┴───────────────────────────────────┴──────┴───────────┴──────────╯
thread 598380 panic: integer overflow
Unwind information for `exe:0x101a803` was not available, trace may be incomplete

???:?:?: 0x100be87 in main.foo (???)
???:?:?: 0x100bc98 in main.bar (???)
???:?:?: 0x100bc88 in main.main (???)
???:?:?: 0x100bbc2 in start.posixCallMainAndExit (???)
???:?:?: 0x100b89d in _start (???)

exe.root_module.strip = true

Setting -fstrip=true or std.Build.Module.strip = true will still result in a binary without debug info or the ability to print stack traces.

~/code/zig/build/stage3/bin/zig build -Dstrip=strip --verbose; print -e (ls ./zig-out/bin/hoyten); try { ./zig-out/bin/hoyten };
/home/geemili/code/zig/build/stage3/bin/zig build-exe -fstrip -fno-omit-frame-pointer -OReleaseSafe -Mroot=/home/geemili/Downloads/matklad-repros/zig-stack-trace/main.zig --cache-dir /home/geemili/Downloads/matklad-repros/zig-stack-trace/.zig-cache --global-cache-dir /home/geemili/.cache/zig --name hoyten --zig-lib-dir /home/geemili/code/zig/lib/ --listen=- 
╭───┬────────────────────┬──────┬──────────┬──────────╮
│ # │        name        │ type │   size   │ modified │
├───┼────────────────────┼──────┼──────────┼──────────┤
│ 0 │ zig-out/bin/hoyten │ file │ 20.5 KiB │ now      │
╰───┴────────────────────┴──────┴──────────┴──────────╯
thread 599590 panic: integer overflow
Unable to dump stack trace: debug info stripped

I still need to run Zig's tests locally, but I plan on creating a PR afterwards.

@xdBronch
Contributor

i think that could/should integrate with #19650, @nektro still working on it? needing to use an external tool (and yeah ik technically zig has it builtin but its still more code) to get the small exe with backtraces is unfortunate. maybe -fstrip could take more options rather than just being a boolean?

@leroycep
Contributor

I'll have to take a look at #19650, but at the moment I'm looking at making -fstrip=debuginfo a thing

@leroycep
Contributor

leroycep commented Dec 1, 2024

Adding a few comments after having done some work on the pull request:

> Specifically (non-expert, might be talking nonsense here), I think including frame pointers and debug symbols (but not the entirety of the debug info) should give both a small executable and a reasonable stack trace.

> I think [objcopy with .strip = .debug] is what I want big-picture wise, but a few details prevent it from being actually useful:

I can confirm that objcopy --strip-debug is the operation you want. -fno-omit-frame-pointer is unnecessary, as objcopy --strip-debug will retain unwinding information (.eh_frame and .eh_frame_hdr) by default:

~/Downloads/matklad-repros/zig-stack-trace> zig build -Dstrip=objcopy; bloaty -w -s file zig-out/bin/hoyten
    FILE SIZE        VM SIZE    
 --------------  -------------- 
  72.9%   151Ki  74.4%   151Ki    .text
  12.5%  26.1Ki  12.8%  26.1Ki    .rodata
   5.1%  10.6Ki   0.0%       0    .strtab
   4.7%  9.85Ki   4.8%  9.84Ki    .eh_frame
   3.4%  7.03Ki   0.0%       0    .symtab
   0.7%  1.40Ki   0.7%  1.40Ki    .eh_frame_hdr
   0.4%     832   0.0%       0    [ELF Section Headers]
   0.2%     504   0.2%     504    [ELF Program Headers]
   0.1%     211   0.0%       0    .shstrtab
   0.0%      64   0.0%      64    [ELF Header]
   0.0%      24   0.0%       0    .comment
   0.0%       8   0.0%       8    .got
   0.0%       8   0.0%       8    [LOAD #1 [R]]
   0.0%       0   6.1%  12.4Ki    .bss
   0.0%       0   1.0%  2.13Ki    .relro_padding
   0.0%       0   0.0%      13    .tbss
 100.0%   208Ki 100.0%   204Ki    TOTAL

> We might consider adding a convenience flag such as -fstack-traces that enables all the right options to just get working stack traces regardless of build mode / stripping options.

> maybe -fstrip could take more options rather than just being a boolean?

My idea for tackling this is replacing -fstrip, -gdwarf32, and -gdwarf64 with a single -fdebuginfo option:

  • -fdebuginfo=none: don't emit any debugging information (equivalent to the current -fstrip)
  • -fdebuginfo=symbols: emits symbol and unwind tables for lightweight backtraces (equivalent to objcopy --strip-debug)
  • -fdebuginfo=dwarf32: emits DWARF debugging information
  • -fdebuginfo=dwarf64: emits DWARF debugging information using the dwarf64 format, which allows debug information larger than 4 GiB.
  • -fdebuginfo=code_view: Windows specific CodeView debugging info
  • which I guess would be similar to -g from clang and gcc, so maybe -g or --debug would be a better flag?

Although I'm not sure that would quite match what is going on: -fdebuginfo=[dwarf32|dwarf64|code_view] adds additional debug markup to the LLVM IR, -fdebuginfo=none omits the debug markup and passes a flag to strip the executable to the linker, and -fdebuginfo=symbols would just omit the debug markup and not do anything special.

Also, most of the time people will not care what the specific debugging format is, so having some aliases like -fdebuginfo=full would make sense.

Anyway, if -fdebuginfo is the way we go, I was thinking of the following defaults:

| OptimizeMode | -fdebuginfo= |
|---|---|
| Debug | full |
| ReleaseSafe | full |
| ReleaseFast | symbols |
| ReleaseSmall | none |
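
The proposed defaults can be written down as a small mapping. This is only a sketch of the table above: the DebugInfo enum and defaultDebugInfo function are hypothetical names, and -fdebuginfo itself is a proposal, not an existing flag.

```zig
const std = @import("std");

/// Hypothetical levels for the proposed -fdebuginfo option; `full` stands
/// in for whichever concrete format (dwarf32, dwarf64, code_view) applies.
const DebugInfo = enum { none, symbols, full };

/// Proposed default for each optimization mode when the user does not
/// pass -fdebuginfo explicitly.
fn defaultDebugInfo(mode: std.builtin.OptimizeMode) DebugInfo {
    return switch (mode) {
        .Debug, .ReleaseSafe => .full,
        .ReleaseFast => .symbols,
        .ReleaseSmall => .none,
    };
}
```

An explicit user setting would override this default, so a ReleaseSafe build could still opt into symbols to get the small binary with lightweight backtraces that this issue asks for.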

> i think that could/should integrate with #19650, @nektro still working on it?

This would definitely make sense to do, especially if we want to change the default backtrace in -fdebuginfo=symbols from this:

???:?:?: 0x100be87 in main.foo (???)
???:?:?: 0x100bc98 in main.bar (???)
???:?:?: 0x100bc88 in main.main (???)
???:?:?: 0x100bbc2 in start.posixCallMainAndExit (???)
???:?:?: 0x100b89d in _start (???)

To something like this:

0x100be87 in main.foo
0x100bc98 in main.bar
0x100bc88 in main.main
0x100bbc2 in start.posixCallMainAndExit
0x100b89d in _start
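
A small sketch of how a frame could be printed in the shorter style, assuming only an address and an optional symbol name are available. formatFrame is a hypothetical helper and is not how std.debug currently formats traces.

```zig
const std = @import("std");

/// Formats one frame in the shorter proposed style. `name` is null when no
/// symbol was found, in which case "???" is printed as a placeholder.
fn formatFrame(buf: []u8, addr: u64, name: ?[]const u8) ![]u8 {
    return std.fmt.bufPrint(buf, "0x{x} in {s}", .{ addr, name orelse "???" });
}
```

For example, formatFrame(&buf, 0x100be87, "main.foo") with a 64-byte buffer produces "0x100be87 in main.foo", matching the first line of the proposed output.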

Getting the backtraces to work only requires changing the standard library, so I'm thinking about splitting -fdebuginfo (or whatever it ends up being) into a separate PR, and focusing on the necessary changes to the standard library for the first PR.

@leroycep leroycep mentioned this issue Dec 9, 2024