investigate object leak in devnet AMM loadgen #5640

Closed
warner opened this issue Jun 22, 2022 · 4 comments
Labels: bug, performance, SwingSet

Comments

warner (Member) commented Jun 22, 2022

Our current codebase, when run with the AMM load generation code, appears to leak objects somewhere: the kernel object table keeps growing over time. The codebase is a2900b4, aka 5507-ignore-vatcancelled, which is current trunk (102a5b6) plus some telemetry changes and a giant xsnap.js hack that ignores the vatCancelled Promise as a workaround for #5507.

We've seen this sort of thing before (#2013, at least), but enough other memory-consumption bugs interfered with the analysis that we never dug into it properly. One time, the problem turned out to be an XS promise resolution bug (which retained the resolution values of old promises). It could also be a bug in the AMM or Zoe (holding on to stale objects somehow), or in the way that the load generator uses those services. We're also seeing (#5507) a Promise.race/v8/ES bug in which old resolutions are retained as long as one of the input promises remains unsettled, which might be involved.

This ticket is to collect information and analysis results. We can close it when the current testnet AMM loadgen shows a flat kernel object table size.

warner added the bug and performance labels Jun 22, 2022
warner self-assigned this Jun 22, 2022

mhofman (Member) commented Jun 23, 2022

Possibly related: #3488

dckc added the SwingSet label Jul 5, 2022

warner (Member, Author) commented Jul 7, 2022

This might be caused by #5671, so we should circle back once we understand and/or fix the vat-bank issue.

mhofman (Member) commented Jul 7, 2022

Have we confirmed that the kernel object table is actually growing, as opposed to seeing an artifact of the missing decrement fixed by #5652?

In my loadgen tests I am not seeing any object leak while performing both AMM & Vault operations, but I might not be exercising everything.

warner (Member, Author) commented Jul 10, 2022

I think you're right: the missing decrement was swamping our ability to see any kernel-wide object leaks. If your loadgen is showing flat object counts, then I think we can close this.

The instagoric load profile (in which dozens of clients do a lot of initialization work at the same time, so it takes many hours to reach a steady state) is likely to keep producing kernel objects until all clients are stable. That will look a lot like an object leak, but is actually just the warmup phase. The one currently known issue already has a ticket (#5671), which we'll investigate separately.
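
One way to tell warmup growth apart from a real leak is to judge only the tail of a run rather than the whole run. The following is a minimal sketch of that idea, not part of SwingSet or the loadgen. The `Sample` shape, the field names `block` and `kernelObjects`, and the default values for `tailFraction` and `maxSlope` are all assumptions chosen for illustration; real telemetry would need its own field names and thresholds.

```typescript
// One sample of the kernel object table size at a given block height.
// Both field names are hypothetical; real telemetry may differ.
interface Sample {
  block: number;
  kernelObjects: number;
}

// Least-squares slope of kernelObjects versus block (objects per block).
function slope(samples: Sample[]): number {
  const n = samples.length;
  if (n < 2) return 0;
  const meanX = samples.reduce((sum, p) => sum + p.block, 0) / n;
  const meanY = samples.reduce((sum, p) => sum + p.kernelObjects, 0) / n;
  let num = 0;
  let den = 0;
  for (const p of samples) {
    num += (p.block - meanX) * (p.kernelObjects - meanY);
    den += (p.block - meanX) ** 2;
  }
  return den === 0 ? 0 : num / den;
}

type Verdict = 'flat' | 'growing';

// Classify only the last `tailFraction` of the run, so that growth
// during the warmup phase does not count as a leak.
// The default thresholds are arbitrary assumptions.
function classifyTail(
  samples: Sample[],
  tailFraction = 0.25,
  maxSlope = 0.01,
): Verdict {
  const start = Math.floor(samples.length * (1 - tailFraction));
  const tail = samples.slice(start);
  return slope(tail) > maxSlope ? 'growing' : 'flat';
}
```

A run that grows during warmup and then levels off has a positive slope over the whole run but a near-zero slope over its tail, so it is classified as `'flat'`. A run that keeps growing at the same rate is classified as `'growing'`. A run that has not yet reached steady state will also look like `'growing'`, so the tail has to be long enough to cover the warmup described above.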


3 participants