Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rustdoc takes 8.5 GB of memory to document stm32h7 #79103

Closed
jyn514 opened this issue Nov 16, 2020 · 9 comments
Closed

Rustdoc takes 8.5 GB of memory to document stm32h7 #79103

jyn514 opened this issue Nov 16, 2020 · 9 comments
Labels
C-bug Category: This is a bug. C-tracking-issue Category: An issue tracking the progress of sth. like the implementation of an RFC I-compilemem Issue: Problems and improvements with respect to memory usage during compilation. P-medium Medium priority T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. T-rustdoc Relevant to the rustdoc team, which will review and decide on the PR/issue. WG-embedded Working group: Embedded systems

Comments

@jyn514
Copy link
Member

jyn514 commented Nov 16, 2020

Similar to but not the same as #78761: rustdoc uses an inordinate amount of memory to document large crates.

heaptrack_gui shows almost a third of the memory being allocated in get_blanket_impls:

image

However, the peak usage is actually in check_mod_item_types, which rustdoc can't do much about:

image

Rustdoc should work improving on both the total memory usage and peak memory usage.

This is meant as a tracking issue, since there are issues other than get_blanket_impls causing high memory use, such as #76382. There is a Zulip stream for technical discussion.

cc @adamgreig

@jyn514 jyn514 added T-rustdoc Relevant to the rustdoc team, which will review and decide on the PR/issue. I-compilemem Issue: Problems and improvements with respect to memory usage during compilation. C-bug Category: This is a bug. C-tracking-issue Category: An issue tracking the progress of sth. like the implementation of an RFC labels Nov 16, 2020
@jyn514
Copy link
Member Author

jyn514 commented Nov 16, 2020

This is causing docs.rs to have to periodically bump the sandbox limits when documenting stm32 crates: rust-lang/docs.rs#1179.

@jyn514 jyn514 added T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. I-prioritize Issue: Indicates that prioritization has been requested for this issue. labels Nov 16, 2020
@camelid camelid added P-medium Medium priority and removed I-prioritize Issue: Indicates that prioritization has been requested for this issue. labels Dec 11, 2020
@camelid
Copy link
Member

camelid commented Dec 11, 2020

P-medium: it's annoying that it takes so much memory, but it's not like you can't document it. And it's being worked on.

@jonas-schievink jonas-schievink added the WG-embedded Working group: Embedded systems label Dec 11, 2020
@jyn514
Copy link
Member Author

jyn514 commented Dec 11, 2020

I ran Rustdoc under DHAT and found that a good tenth of the allocations come from get_blanket_impls alone:

. I'm not sure how to upload the full 81 MB file to GitHub, but feel free to ping me here or on Zulip, I'll keep a copy around Here's a link to Google Drive with the full file. Unfortunately DHAT takes a full 50 minutes to run rustdoc on stm32 so it's a pain to collect on your own.

@jyn514
Copy link
Member Author

jyn514 commented Dec 11, 2020

Another 7% just storing the structs themselves:

self.impls.extend(get_auto_trait_and_blanket_impls(
. Maybe I should be asking just how many impls there are?

bors added a commit to rust-lang-ci/rust that referenced this issue Dec 12, 2020
[rustdoc] Calculate span information on demand instead of storing it ahead of time

This brings `size_of<clean::types::Span>()` down from over 100 bytes (!!) to only 12, the same as rustc. It brings `Item` down even more, from `784` to `680`.

~~TODO: I need to figure out how to do this for the JSON backend too. That uses `From` impls everywhere, which don't allow passing in the `Session` as an argument. `@P1n3appl3,` `@tmandry,` maybe one of you have ideas?~~ Figured it out, fortunately only two functions needed to be changed. I like the `convert_x()` format better than `From` everywhere but I'm open to feedback.

Helps with rust-lang#79103
bors added a commit to rust-lang-ci/rust that referenced this issue Dec 15, 2020
[rustdoc] Switch to Symbol for item.name

This decreases the size of `Item` from 680 to 616 bytes. It also does a
lot less work since it no longer has to copy as much.

Helps with rust-lang#79103.

r? `@GuillaumeGomez`
@jyn514
Copy link
Member Author

jyn514 commented Dec 16, 2020

Maybe I should be asking just how many impls there are?

+-------------------------------------------------+-----------+-----------------+----------+------------+
| Item                                            | Self time | % of total time | Time     | Item count |
+-------------------------------------------------+-----------+-----------------+----------+------------+
| collect-trait-impls                             | 33.31s    | 45.735          | 39.64s   | 1          |
+-------------------------------------------------+-----------+-----------------+----------+------------+
| render_html                                     | 23.69s    | 32.537          | 23.69s   | 1          |
+-------------------------------------------------+-----------+-----------------+----------+------------+
| <unknown>                                       | 6.81s     | 9.345           | 10.21s   | 2417746    |
+-------------------------------------------------+-----------+-----------------+----------+------------+
| clean_crate                                     | 2.85s     | 3.913           | 4.01s    | 1          |
+-------------------------------------------------+-----------+-----------------+----------+------------+
| collect-intra-doc-links                         | 918.47ms  | 1.261           | 970.03ms | 1          |
+-------------------------------------------------+-----------+-----------------+----------+------------+
| expand_crate                                    | 905.65ms  | 1.244           | 909.95ms | 1          |
+-------------------------------------------------+-----------+-----------------+----------+------------+
| build_extern_trait_impl                         | 796.51ms  | 1.094           | 1.38s    | 20080      |

The strange thing that most of the time is not spent on build_extern_trait_impl, it's somewhere else in collect_trait_impls.

@jyn514
Copy link
Member Author

jyn514 commented Dec 16, 2020

Almost all the time in collect_trait_impls is being spent on synthetic impls, specifically

let mut krate = synth.fold_crate(krate);

+-------------------------------------------------+-----------+-----------------+----------+------------+
| Item                                            | Self time | % of total time | Time     | Item count |
+-------------------------------------------------+-----------+-----------------+----------+------------+
| collect_synthetic_impls                         | 32.71s    | 44.967          | 37.56s   | 1          |

@jyn514
Copy link
Member Author

jyn514 commented Dec 16, 2020

get_blanket_impls takes about 10x as long as get_synthetic_impls, but they're both pretty slow.

+-------------------------------------------------+-----------+-----------------+----------+------------+
| Item                                            | Self time | % of total time | Time     | Item count |
+-------------------------------------------------+-----------+-----------------+----------+------------+
| get_blanket_impls                               | 29.86s    | 41.053          | 34.48s   | 32176      |
+-------------------------------------------------+-----------+-----------------+----------+------------+
| render_html                                     | 23.18s    | 31.879          | 23.18s   | 1          |
+-------------------------------------------------+-----------+-----------------+----------+------------+
| <unknown>                                       | 6.92s     | 9.514           | 10.33s   | 2417746    |
+-------------------------------------------------+-----------+-----------------+----------+------------+
| get_auto_trait_impls                            | 3.42s     | 4.704           | 3.69s    | 32176      |
+-------------------------------------------------+-----------+-----------------+----------+------------+
| clean_crate                                     | 2.82s     | 3.871           | 3.98s    | 1          |
+-------------------------------------------------+-----------+-----------------+----------+------------+

bors added a commit to rust-lang-ci/rust that referenced this issue Dec 16, 2020
Get rid of `clean::Deprecation`

This brings the size of `item.deprecation` from 56 to 16 bytes. Helps with rust-lang#79103 and rust-lang#76382, in the same vein as rust-lang#79957.

r? `@GuillaumeGomez`
bors added a commit to rust-lang-ci/rust that referenced this issue Dec 22, 2020
Add more timing info to rustdoc

This helped me confirm in rust-lang#79103 (comment) that get_blanket_impls is indeed what's taking all the time on stm32.

r? `@Mark-Simulacrum`
bors added a commit to rust-lang-ci/rust that referenced this issue Dec 23, 2020
…meGomez

[rustdoc] Calculate stability, const_stability, and deprecation on-demand

Previously, they would always be calculated ahead of time, which bloated the size of `clean::Item`.

Builds on rust-lang#80090 and should not be merged before. Helps with rust-lang#79103 and rust-lang#76382.

cc rust-lang#80014 (comment)

This brings `Item` down to 568 bytes, down from 616.
bors added a commit to rust-lang-ci/rust that referenced this issue Dec 23, 2020
…umeGomez

Remove `DefPath` from `Visibility` and calculate it on demand

Depends on rust-lang#80090 and should not be merged before. Helps with rust-lang#79103 and rust-lang#76382.

cc rust-lang#80014 (comment) - `@nnethercote` I figured it out! It was simpler than I expected :)

This brings the size of `clean::Visibility` down from 40 bytes to 8.

Note that this does *not* remove `clean::Visibility`, even though it's now basically the same as `ty::Visibility`, because the `Invsible` variant means something different from `Inherited` and I thought it would be be confusing to merge the two. See the new comments on `impl Clean for ty::Visibility` for details.
bors added a commit to rust-lang-ci/rust that referenced this issue Dec 29, 2020
[rustdoc] Box ItemKind to reduce the size of `Item`

This brings the size of `Item` from

```
[src/librustdoc/lib.rs:103] std::mem::size_of::<Item>() = 536
```

to

```
[src/librustdoc/lib.rs:103] std::mem::size_of::<Item>() = 136
```

This is an alternative to rust-lang#79967; I don't think it makes sense to make both changes.

Helps with rust-lang#79103.
@jyn514
Copy link
Member Author

jyn514 commented Jan 21, 2021

This also impacts windows-rs.

@jyn514
Copy link
Member Author

jyn514 commented Aug 22, 2021

I profiled this recently, and it's down to 1.6 GB - still high, but a lot better than before. I'm going to close this until I find something more specific to track.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
C-bug Category: This is a bug. C-tracking-issue Category: An issue tracking the progress of sth. like the implementation of an RFC I-compilemem Issue: Problems and improvements with respect to memory usage during compilation. P-medium Medium priority T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. T-rustdoc Relevant to the rustdoc team, which will review and decide on the PR/issue. WG-embedded Working group: Embedded systems
Projects
None yet
Development

No branches or pull requests

4 participants