[GR-47106] Loading DWARF debug info into debugger can be very slow after GR-45654 #6936

Open
adinn opened this issue Jul 5, 2023 · 0 comments

Commit GR-45654 (#6414) modified DWARF debug info generation to encode Java info in a single compilation unit (CU). This was done to allow code for top-level methods belonging to different classes to be interleaved in the code cache. However, for moderate-to-large programs it has caused a noticeable slowdown when gdb first starts executing.

The problem is only severe when full debug info generation is enabled (passing flags -H:GenerateDebugInfo=1, -H:+SourceLevelDebug, -H:+DebugCodeInfoUseSourceMappings, -H:-DeleteLocalSymbols, -H:+TrackNodeSourcePosition). However, it is likely to cause problems for very large programs even when only a subset of those flags is provided. Mandrel integration test issue 160 details the severity for a moderately sized Quarkus application, where placement of the initial breakpoint takes up to 300 seconds.

The problem arises from two stages of DWARF processing. The first is line info processing, which is polynomial in the number of files in a CU's file table. The single CU DWARF for the above example includes more than 10,000 files, whereas the per-class CU DWARF has tables that usually contain 1-10 files, with rare cases of a few hundred. With single CU DWARF, gdb has to process all the line info at startup. With multi-CU DWARF, gdb only needs to load a subset of the CUs. Although this subset is large, the load time is significantly reduced because the polynomial cost is paid on many small file tables rather than on one very large table.
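To make the scaling argument concrete, here is a rough back-of-the-envelope sketch. It assumes the line info cost is quadratic in the file table size; the issue only says "polynomial", so the exponent is an assumption, as are the function name and the figures of 2,000 per-class CUs with 5 files each. Only the 10,000 file total comes from the numbers above.

```python
def line_info_cost(file_counts, exponent=2):
    """Abstract cost of processing line info for a set of CUs.

    Each CU is charged (files in its table) ** exponent.
    The exponent is an assumption: the issue only states "polynomial".
    """
    return sum(n ** exponent for n in file_counts)


TOTAL_FILES = 10_000        # from the issue: single CU file table size
FILES_PER_CLASS_CU = 5      # assumed, within the 1-10 range quoted above
NUM_CLASS_CUS = 2_000       # hypothetical number of per-class CUs

# One CU holding every file versus many small per-class CUs.
single_cu_cost = line_info_cost([TOTAL_FILES])
multi_cu_cost = line_info_cost([FILES_PER_CLASS_CU] * NUM_CLASS_CUS)

print(single_cu_cost)                    # 100000000
print(multi_cu_cost)                     # 50000
print(single_cu_cost // multi_cu_cost)   # 2000
```

Under these assumptions the per-class layout is cheaper by a large factor even if gdb were to load every CU, and cheaper still when it loads only a subset.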

The second costly stage is inline method tree processing, whose cost is proportional to the number of inline tree nodes in the DWARF info. Splitting the DWARF into multiple CUs does not result in fewer nodes. However, with single CU DWARF gdb has to process all the inline entries at startup, while with multi-CU DWARF gdb only processes a subset of them.

It would be beneficial to return to multiple, per-class CUs and find some other way to deal with the fact that the code range for a specific class's CU is split into subranges which may interleave with code ranges belonging to another class/CU. It should be possible to achieve this by adding a DWARF debug_ranges section and labelling each CU with a DW_AT_ranges attribute that references the relevant details in the debug_ranges section.
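As a rough illustration of the proposal, the sketch below groups interleaved method code ranges by owning class and merges adjacent ones, producing one list of subranges per class CU. It is not the generator's actual code: the function names, class names and addresses are hypothetical. The only DWARF detail relied on is that a DWARF 4 debug_ranges list is a sequence of (begin, end) pairs, with end exclusive, closed by a (0, 0) end-of-list entry.

```python
from collections import defaultdict


def build_range_lists(methods):
    """Group method code ranges by class and merge touching ranges.

    `methods` is an iterable of (class_name, lo, hi) with hi exclusive.
    Returns {class_name: [(lo, hi), ...]} sorted by address.
    """
    by_class = defaultdict(list)
    for class_name, lo, hi in methods:
        by_class[class_name].append((lo, hi))

    range_lists = {}
    for class_name, ranges in by_class.items():
        merged = []
        for lo, hi in sorted(ranges):
            if merged and lo <= merged[-1][1]:
                # Contiguous or overlapping with the previous subrange.
                merged[-1] = (merged[-1][0], max(merged[-1][1], hi))
            else:
                merged.append((lo, hi))
        range_lists[class_name] = merged
    return range_lists


def encode_range_list(ranges):
    """Append the (0, 0) end-of-list entry used by debug_ranges."""
    return list(ranges) + [(0, 0)]


# Hypothetical code cache layout with methods of Foo and Bar interleaved.
methods = [
    ("Foo", 0x1000, 0x1040),
    ("Bar", 0x1040, 0x1080),
    ("Foo", 0x1080, 0x10C0),
    ("Foo", 0x10C0, 0x1100),
    ("Bar", 0x1100, 0x1120),
]

range_lists = build_range_lists(methods)
for class_name, ranges in sorted(range_lists.items()):
    entries = encode_range_list(ranges)
    print(class_name, [(hex(lo), hex(hi)) for lo, hi in entries])
```

Each class CU would then carry a DW_AT_ranges attribute pointing at its own list, so the CU no longer needs a single contiguous low/high address pair.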

@fniephaus fniephaus changed the title Loading DWARF debug info into debugger can be very slow after GR-45654 [GR-47106] Loading DWARF debug info into debugger can be very slow after GR-45654 Jul 7, 2023