Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement sharing for hls-graph Keys #3206

Merged
merged 6 commits into from
Oct 24, 2022
Merged

Implement sharing for hls-graph Keys #3206

merged 6 commits into from
Oct 24, 2022

Conversation

wz1000
Copy link
Collaborator

@wz1000 wz1000 commented Sep 21, 2022

On large projects, duplicate Keys start to comprise a significant (~5-10%) fraction of the heap.
This becomes better if we use Ints as keys and de-duplicate the actual values by storing them
in maps. Keys are mostly just compared and hash during usual operation, printing them out is
rare so a lookup shouldn't pessimize it much.

Also use HashSets to keep track of dependencies.
However, this means that the underlying value for Keys will never be garbage collected once allocated.

Copy link
Collaborator

@pepeiborra pepeiborra left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approved. Please share some benchmarks showing the memory savings when you have a chance!

Copy link
Collaborator

@pepeiborra pepeiborra left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we drop the Int proxy (Key) and the IntMap?
newKey will lookup in the KeyMap and return the existing value if any, recovering sharing.

What's the benefit of keeping an extra mapping?

@wz1000
Copy link
Collaborator Author

wz1000 commented Sep 27, 2022

Can we drop the Int proxy (Key) and the IntMap? newKey will lookup in the KeyMap and return the existing value if any, recovering sharing.

What's the benefit of keeping an extra mapping?

I tried this first, but GHC unpacks the Key and we lose sharing at that point.

@pepeiborra
Copy link
Collaborator

I tried this first, but GHC unpacks the Key and we lose sharing at that point.

Worth adding a comment about this, I believe my previous attempt to share keys must have failed for this reason

@michaelpj
Copy link
Collaborator

I tried this first, but GHC unpacks the Key and we lose sharing at that point.

Maybe it's a bit too sensitive, but can we do something to prevent GHC from unpacking the key? e.g. not import the constructor in the module in question?

@wz1000
Copy link
Collaborator Author

wz1000 commented Sep 28, 2022

Maybe it's a bit too sensitive, but can we do something to prevent GHC from unpacking the key? e.g. not import the constructor in the module in question?

I don't think this is possible currently. Would be a good feature request though.

@wz1000
Copy link
Collaborator Author

wz1000 commented Sep 28, 2022

I've also added sharing of the rendered Key as Text to avoid duplicating the rendered Key in the diagnostic store, else we get a duplicate of each key as a rendered Text value for each file in the project.

Copy link
Collaborator

@pepeiborra pepeiborra left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we have those benchmark results?

@wz1000 wz1000 force-pushed the wip/share-keys branch 3 times, most recently from c3ab732 to 5ef5975 Compare October 19, 2022 10:48
@wz1000 wz1000 force-pushed the wip/share-keys branch 2 times, most recently from 15b21a9 to 1311c87 Compare October 19, 2022 12:34
@wz1000
Copy link
Collaborator Author

wz1000 commented Oct 24, 2022

Here are some profiles from before and after

before (heap size):
2022-10-21-211021_grab

before (keys in heap):
2022-10-21-211050_grab

In the second image you can see that just the top allocation site of duplicate keys is responsible for ~200MB of live bytes. These are all eliminated by this patch.

after (heap size):
2022-10-24-131012_grab

@wz1000 wz1000 enabled auto-merge (rebase) October 24, 2022 07:48
@wz1000 wz1000 merged commit 94cd24d into master Oct 24, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants