Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Simplify rustc_codegen_llvm::debuginfo::metadata #482

Closed
3 tasks done
michaelwoerister opened this issue Feb 8, 2022 · 4 comments
Closed
3 tasks done

Simplify rustc_codegen_llvm::debuginfo::metadata #482

michaelwoerister opened this issue Feb 8, 2022 · 4 comments
Labels
major-change A proposal to make a major change to rustc major-change-accepted A major change proposal that was accepted T-compiler Add this label so rfcbot knows to poll the compiler team

Comments

@michaelwoerister
Copy link
Member

michaelwoerister commented Feb 8, 2022

Proposal

Type debuginfo generation in rustc_codegen_llvm::debuginfo::metadata is become very messy over the years. It was originally written before Rust 1.0 and the infrastructure in it has become a bit of a callback hell. At the same time our debuginfo has become more complex and has outgrown the infrastructure's capabilities in a number of ways. As a result the code is now very hard to read and extend.

I propose to refactor the module and in particular:

  • get rid of MemberDescription, MemberDescriptionFactory, RecursiveTypeDescription, and the like. These exist solely for breaking recursion cycles in the debuginfo node graph -- something with can be done in simpler fashion by relying on closures (see below).
  • completely separate enum debuginfo generation for DWARF-based and for "C++-like" targets. The two cases are quite different but at the moment they are all handled by the same set of functions with many, hard-to-untangle if cpp_like_debuginfo { ... } else { ... } branches in them.

Dealing with recursion

For types like struct Foo { x: *const Foo } we need to create cycles in the graph of LLVM metadata nodes that encodes debuginfo. The way to do this in LLVM is to allocate a "stub" node for Foo that does not have the list of fields yet, create the fields (which can now transitively reference the stub node), and then mutate the stub node in-place (that is, attach the list of fields to it).

So far this is implemented via RecursiveTypeDescription, which holds the stub node and a MemberDescriptionFactory that is then used to create the members in a delayed fashion. This has worked OK for enforcing the protocol -- but it has the disadvantage that the creation of debuginfo for all types that have fields is split into a prepare_xyz() -> RecursiveTypeDescription function and an implementation of the MemberDescriptionFactory trait (in the case of enums and generators across two levels), which makes it hard to read the code.

The proposal is to implement this via a single helper function that looks like:

fn build_type_with_fields(cx: Context, type_id: TypeId, stub: DIType, make_fields: FnOnce(cx, stub) -> Vec<fields>) -> DIType {
   assert!(type_map(cx).insert(type_id, stub).is_none());
   let fields = make_fields(cx, stub);
   llvm::attach_fields(stub, fields);
   return stub;
}

As a result one can write things more naturally, like:

fn build_struct_type_info(cx, struct_type) -> DIType {
  build_type_with_fields(
    cx,
    TypeId::from(struct_type),
    stub(name, size, align, ...),
    |cx, stub| {
      struct_type.fields.map(|field| build_field_debuginfo(cx, stub, field))
    })
}

This approach would also reduce the amount of boilerplate code needed to do things that can also be done directly via the LLVM APIs.

The refactoring would be straightforward for the most part (we still follow the same basic principle for handling graph cycles after all) but the module would look rather different afterwards. It is also likely to uncover inconsistencies in the debuginfo we generate.

Mentors or Reviewers

@wesleywiser volunteered to review the changes.

Process

The main points of the Major Change Process are as follows:

  • File an issue describing the proposal.
  • A compiler team member or contributor who is knowledgeable in the area can second by writing @rustbot second.
    • Finding a "second" suffices for internal changes. If however, you are proposing a new public-facing feature, such as a -C flag, then full team check-off is required.
    • Compiler team members can initiate a check-off via @rfcbot fcp merge on either the MCP or the PR.
  • Once an MCP is seconded, the Final Comment Period begins. If no objections are raised after 10 days, the MCP is considered approved.

You can read more about Major Change Proposals on forge.

Comments

This issue is not meant to be used for technical discussion. There is a Zulip stream for that. Use this issue to leave procedural comments, such as volunteering to review, indicating that you second the proposal (or third, etc), or raising a concern that you would like to be addressed.

@michaelwoerister michaelwoerister added T-compiler Add this label so rfcbot knows to poll the compiler team major-change A proposal to make a major change to rustc labels Feb 8, 2022
@rustbot
Copy link
Collaborator

rustbot commented Feb 8, 2022

This issue is not meant to be used for technical discussion. There is a Zulip stream for that. Use this issue to leave procedural comments, such as volunteering to review, indicating that you second the proposal (or third, etc), or raising a concern that you would like to be addressed.

cc @rust-lang/compiler @rust-lang/compiler-contributors

@rustbot rustbot added the to-announce Announce this issue on triage meeting label Feb 8, 2022
@wesleywiser
Copy link
Member

@rustbot second

@rustbot rustbot added the final-comment-period The FCP has started, most (if not all) team members are in agreement label Feb 8, 2022
@apiraino apiraino removed the to-announce Announce this issue on triage meeting label Feb 10, 2022
@michaelwoerister
Copy link
Member Author

@apiraino, the FCP has passed without objections, correct?

@apiraino
Copy link
Contributor

Indeed, this can be closed as accepted.

@rustbot label -final-comment-period +major-change-accepted

@rustbot rustbot added major-change-accepted A major change proposal that was accepted to-announce Announce this issue on triage meeting and removed final-comment-period The FCP has started, most (if not all) team members are in agreement labels Feb 22, 2022
@apiraino apiraino removed the to-announce Announce this issue on triage meeting label Mar 3, 2022
bors added a commit to rust-lang-ci/rust that referenced this issue Mar 15, 2022
…ctor, r=wesleywiser

debuginfo: Refactor debuginfo generation for types

This PR implements the refactoring of the `rustc_codegen_llvm::debuginfo::metadata` module as described in MCP rust-lang/compiler-team#482.

In particular it
- changes names to use `di_node` instead of `metadata`
- uniformly names all functions that build new debuginfo nodes `build_xyz_di_node`
- renames `CrateDebugContext` to `CodegenUnitDebugContext` (which is more accurate)
- removes outdated parts from `compiler/rustc_codegen_llvm/src/debuginfo/doc.md`
- moves `TypeMap` and functions that work directly work with it to a new `type_map` module
- moves enum related builder functions to a new `enums` module
- splits enum debuginfo building for the native and cpp-like cases, since they are mostly separate
- uses `SmallVec` instead of `Vec` in many places
- removes the old infrastructure for dealing with recursion cycles (`create_and_register_recursive_type_forward_declaration()`, `RecursiveTypeDescription`, `set_members_of_composite_type()`, `MemberDescription`, `MemberDescriptionFactory`, `prepare_xyz_metadata()`, etc)
- adds `type_map::build_type_with_children()` as a replacement for dealing with recursion cycles
- adds many (doc-)comments explaining what's going on
- changes cpp-like naming for C-Style enums so they don't get a `enum$<...>` name (because the NatVis visualizer does not apply to them)
- fixes detection of what is a C-style enum because some enums where classified as C-style even though they have fields
- changes cpp-like naming for generator enums so that NatVis works for them
- changes the position of discriminant debuginfo node so it is consistently nested inside the top-level union instead of, sometimes, next to it

The following could be done in subsequent PRs:
- add caching for `closure_saved_names_of_captured_variables`
- add caching for `generator_layout_and_saved_local_names`
- fix inconsistent handling of what is considered a C-style enum wrt to debuginfo
- rename `metadata` module to `types`
- move common generator fields to front instead of appending them

This PR is based on rust-lang#93644 which is not merged yet.

Right now, the changes are all done in one big commit. They could be split into smaller commits but hopefully the list of changes above makes it tractable to review them as a single commit too.

For now: r? `@ghost` (let's see if this affects compile times)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
major-change A proposal to make a major change to rustc major-change-accepted A major change proposal that was accepted T-compiler Add this label so rfcbot knows to poll the compiler team
Projects
None yet
Development

No branches or pull requests

4 participants