Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Change to streaming out the heap snapshot data (#52854) #128

Merged

Conversation

Drvi
Copy link
Member

@Drvi Drvi commented Feb 1, 2024

This PR is to continue the work on the following PR:

Prevent OOMs during heap snapshot: Change to streaming out the snapshot data (JuliaLang#51518 )

Here are the commit history:

* Streaming the heap snapshot!

This should prevent the engine from OOMing while recording the snapshot!

Now we just need to sample the files, either online, before downloading, or offline after downloading :)

If we're gonna do it offline, we'll want to gzip the files before downloading them.

* Allow custom filename; use original API

* Support legacy heap snapshot interface. Add reassembly function.

* Add tests

* Apply suggestions from code review

* Update src/gc-heap-snapshot.cpp

* Change to always save the parts in the same directory

This way you can always recover from an OOM

* Fix bug in reassembler: from_node and to_node were in the wrong order

* Fix correctness mistake: The edges have to be reordered according to the node order. That's the whole reason this is tricky.

But i'm not sure now whether the SoAs approach is actually an optimization.... It seems like we should probably prefer to inline the Edges right into the vector, rather than having to do another random lookup into the edges table?

* Debugging messed up edge array idxs

* Disable log message

* Write the .nodes and .edges as binary data

* Remove unnecessary logging

* fix merge issues

* attempt to add back the orphan node checking logic

PR Description

What does this PR do?

Checklist

Requirements for merging:

  • I have opened an issue or PR upstream on JuliaLang/julia: <link to JuliaLang/julia>
  • I have removed the port-to-* labels that don't apply.
  • I have opened a PR on raicode to test these changes:

This PR is to continue the work on the following PR:

Prevent OOMs during heap snapshot: Change to streaming out the snapshot
data (JuliaLang#51518 )

Here are the commit history:

```
* Streaming the heap snapshot!

This should prevent the engine from OOMing while recording the snapshot!

Now we just need to sample the files, either online, before downloading, or offline after downloading :)

If we're gonna do it offline, we'll want to gzip the files before downloading them.

* Allow custom filename; use original API

* Support legacy heap snapshot interface. Add reassembly function.

* Add tests

* Apply suggestions from code review

* Update src/gc-heap-snapshot.cpp

* Change to always save the parts in the same directory

This way you can always recover from an OOM

* Fix bug in reassembler: from_node and to_node were in the wrong order

* Fix correctness mistake: The edges have to be reordered according to the node order. That's the whole reason this is tricky.

But i'm not sure now whether the SoAs approach is actually an optimization.... It seems like we should probably prefer to inline the Edges right into the vector, rather than having to do another random lookup into the edges table?

* Debugging messed up edge array idxs

* Disable log message

* Write the .nodes and .edges as binary data

* Remove unnecessary logging

* fix merge issues

* attempt to add back the orphan node checking logic
```

---------

Co-authored-by: Nathan Daly <[email protected]>
Co-authored-by: Nathan Daly <[email protected]>
@Drvi Drvi requested a review from d-netto February 1, 2024 18:36
@Drvi Drvi changed the base branch from backports-release-1.10+RAI to dcn-backport-streaming-snapshot-to-1.10 February 2, 2024 10:38
@Drvi Drvi changed the base branch from dcn-backport-streaming-snapshot-to-1.10 to backports-release-1.10+RAI February 2, 2024 10:38
@Drvi Drvi merged commit 3c64fa1 into backports-release-1.10+RAI Feb 2, 2024
3 checks passed
@Drvi Drvi deleted the backport-streaming-snapshot-to-1.10 branch February 2, 2024 12:16
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant