Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

kernel stats are never decrementing kernelObjects #5652

Closed
warner opened this issue Jun 23, 2022 · 1 comment · Fixed by #5653
Closed

kernel stats are never decrementing kernelObjects #5652

warner opened this issue Jun 23, 2022 · 1 comment · Fixed by #5653
Assignees
Labels
bug Something isn't working SwingSet package: SwingSet

Comments

@warner
Copy link
Member

warner commented Jun 23, 2022

@mhofman and I were investigating #5640 by looking at the kernel stats. His 20-hour loadgen run made a copy of the kernel DB about once every two hours, and the self-reported kernel object counts were rising dramatically:

node ➜ ~/agoric-sdk/packages/SwingSet $ for i in 0 1 2 3 4 5 6 7 8 9; do node misc-tools/db-get.js --raw ../../loadgen-state/stage-$i-state local.kernelStats|jq .kernelObjects; done
511
11140
21765
32410
42982
53372
63867
74329
84766
95374

however the corresponding number of c-list entries was not:

node ➜ ~/agoric-sdk/packages/SwingSet $ for i in 0 1 2 3 4 5 6 7 8 9; do node misc-tools/db-get.js --raw ../../loadgen-state/stage-$i-state local.kernelStats|jq .clistEntries; done
851
884
872
893
881
885
891
893
929
917

So, what's holding onto the 95k kernel objects? Let's start by picking an ID at random. Each kernel object table entry has two DB keys: one for the owner (a vatID), and a second for the refcount.

node ➜ ~/agoric-sdk/packages/SwingSet $ node misc-tools/db-dump.js ../../loadgen-state/stage-10-state >s10
node ➜ ~/agoric-sdk/packages/SwingSet $ cat s10 | jq -c 'select(.[0]|startswith("ko"))'|head -5
["ko.nextID","105824"]
["ko100.owner","v7"]
["ko100.refCount","3,3"]
["ko101.owner","v7"]
["ko101.refCount","1,1"]

We want to pick an entry from the middle ages of this chain: objects allocated very near the start of the chain are likely to be legitimately retained (they're probably fundamental things like Zoe or a contract installation handle), and objects allocated near the end are likely to be retained by recent loadgen operations that are not yet complete. Sort of a bathtub curve for expected object lifetime.

So let's see how many DB entries we have:

node ➜ ~/agoric-sdk/packages/SwingSet $ cat s10 | jq -c 'select(.[0]|startswith("ko"))'|wc -l
649

Hang on.. the kernel stats said there were 95k objects, which should have 2x = 190k DB keys, and yet we're only seeing 649 keys? Something is lying to us.

% rgjs "incStat\('kernelObjects'\)" src
src/kernel/state/kernelKeeper.js
537:    incStat('kernelObjects');
% rgjs "decStat\('kernelObjects'\)" src
%

Welp, it looks like we forgot to ever decrement the kernelObjects counter (or more likely, it got lost in the course of some refactoring). So in fact we only have (649-1)/2 = 324 kernel objects, which sounds pretty flat to me.

So the task is to insert a decStat('kernelObjects') into kernelKeeper.js > deleteKernelObject(), and of course write a test that makes sure it doesn't get dropped again. And then we can run a new loadgen run and see if the telemetry-reported object count remains as stable as this makes us expect (and then maybe close #5640).

@warner warner added bug Something isn't working SwingSet package: SwingSet labels Jun 23, 2022
@warner warner self-assigned this Jun 23, 2022
warner added a commit that referenced this issue Jun 23, 2022
Oops, we incremented this counter, but never decremented it. So our
reported `kernelObjects` value was drastically inflated, making it
look like we were leaking objects, when we probably aren't.

closes #5652
@warner
Copy link
Member Author

warner commented Jun 23, 2022

In fact I completely forgot to include stats handling when I introduced deleteKernelObject() during the big GC update last year (c1a162a).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working SwingSet package: SwingSet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

1 participant