
doc: document how to examine --trace_gc output #372

Closed

Conversation

HarshithaKP
Member

No description provided.

@sam-github

Slightly off topic, perhaps, but is all this information eventually going to end up in https://nodejs.org/en/docs/guides/ ? It'd be a bit more findable there.

@gireeshpunathil
Member

@sam-github - yes, according to #211 (comment) and the thread beneath it.


```
[PID: isolate] <time taken since GC started in ms>: <type/phase of GC> <heap used before GC call in MB> (<allocated heap before GC call in MB>) -> <heap used after GC in MB> (<allocated heap after GC in MB>) <time spent in GC in ms> [<reason for GC>]
```

Member

This is hard to read. A graphic where we point to the sections and say what they are might work better.

If we have examples of how to use the traces to identify problems, that would also be good. For example, in terms of running out of memory: if you first see that the old space size is continually increasing but it takes too long to hit the max, setting the max old space size to a smaller value can help.

Another one is that if you see the time spent in GC is continually increasing, or is a large portion of the overall run time, that can mean you are short on memory even if you don't OOM.
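
As an illustration of both points, here is a minimal sketch of an allocation pattern that keeps old space growing (the file name `leak.js` is hypothetical). Run with `--trace-gc` it emits lines in the format quoted above, and adding `--max-old-space-size` turns the slow growth into a quick, reproducible out-of-memory failure:

```js
// leak.js — hypothetical sketch of an unbounded allocation pattern.
// Run with:
//   node --trace-gc leak.js
// and, to make a genuine leak fail fast, cap the old generation:
//   node --trace-gc --max-old-space-size=50 leak.js
const retained = [];

setInterval(() => {
  // Holding the reference prevents collection, so the "heap used after GC"
  // figure in each trace line keeps climbing until the limit is hit.
  retained.push(new Array(100000).fill(Math.random()));
}, 100);
```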

Member Author

@mhdawson, Modified according to your suggestion. PTAL.

## Examples of diagnosing memory issues with trace option:

A. How to get context of bad allocations using --trace-gc
1. Suppose we observe that the old space is ocntinously increasing.

Member

Suggested change
- 1. Suppose we observe that the old space is ocntinously increasing.
+ 1. Suppose we observe that the old space is continously increasing.

Member Author

@mhdawson, thanks. Fixed it.

B. How to assert whether too many gc are happening or too many gc is causing an overhead

Member

@mhdawson Apr 2, 2020

Suggested change
- B. How to assert whether too many gc are happening or too many gc is causing an overhead
+ B. How to assert whether too many gcs are happening or too many gcs are causing an overhead

Member Author

@mhdawson, thanks. Fixed it.

1. Review the trace data, specifically around time between consecutive gcs.
2. Review the trace data, specifically around time spent in gc.
3. If the time between two gc is less than the time spent in gc, the application is sseverely starving.

Member

Suggested change
- 3. If the time between two gc is less than the time spent in gc, the application is sseverely starving.
+ 3. If the time between two gc is less than the time spent in gc, the application is severely starving.

Member Author

ditto.

4. If the time between two gc and the time spent in gc are very high, probably the application can use a smaller heap

Member

Suggested change
- 4. If the time between two gc and the time spent in gc are very high, probably the application can use a smaller heap
+ 4. If the time between two gcs and the time spent in gc are very high, probably the application can use a smaller heap

Member Author

ditto

5. If the time between two gc is much greater than the time spent in gc, application is relatively healthy

Member

Suggested change
- 5. If the time between two gc is much greater than the time spent in gc, application is relatively healthy
+ 5. If the time between two gcs is much greater than the time spent in gc, application is relatively healthy

Member Author

ditto

6. While the actual numbers for these metrics change from workload to workload, a reasonable gap between gcs is 20 minutes, and a reasonable gc time is < 100 ms.

Member

I had the same thought as @Flarna as well. 20 minutes seems too long. Do we have something to back up suggestions for particular numbers?

Member Author

No. I don’t have any data / proof points, so I simply removed it.
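
To make the gap-versus-pause comparison described in the steps above concrete, here is a rough sketch that post-processes captured `--trace-gc` output. It assumes lines roughly follow the format template quoted earlier; the exact layout varies between V8 versions, so the regular expression may need adjusting, and the file names are placeholders:

```js
// gc-overhead.js — hypothetical helper.
// Usage:
//   node --trace-gc app.js > gc.log 2>&1
//   node gc-overhead.js gc.log
'use strict';
const fs = require('fs');

const file = process.argv[2];
if (!file) {
  console.error('usage: node gc-overhead.js <gc.log>');
  process.exit(1);
}

// Capture: [1] timestamp of the GC (ms since start), [2] GC type/phase,
// [3] time spent in the GC itself (ms), loosely following the template above.
const re = /^\[[^\]]+\]\s+([\d.]+)\s*(?:ms)?\s*:\s*(\S+).*->\s*[\d.]+\s*\(\s*[\d.]+\s*\)\s*(?:MB,?\s*)?([\d.]+)/;

let prev = null;
let totalPause = 0;
let count = 0;
for (const line of fs.readFileSync(file, 'utf8').split('\n')) {
  const m = re.exec(line);
  if (!m) continue;
  const at = Number(m[1]);     // when this GC ran
  const pause = Number(m[3]);  // how long it took
  if (prev !== null && at - prev < pause) {
    // Gap between consecutive GCs is smaller than the pause itself:
    // the "severely starving" case described above.
    console.log(`starving near ${at} ms: gap ${(at - prev).toFixed(1)} ms < pause ${pause} ms (${m[2]})`);
  }
  prev = at;
  totalPause += pause;
  count += 1;
}
console.log(`${count} GCs, ${totalPause.toFixed(1)} ms total pause`);
```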

4. Reduce `--max-old-space-size` such that the total heap is closer to the limit.
5. Allow the program to run and hit the out-of-memory error.
6. The produced log shows the failing context.

Member

We might add another which is similar, except we reduce the old space size in order to ensure we actually have an OOM. If we see the heap is continually increasing but we don't OOM for a long time, reducing the old space size can help confirm it is an actual leak versus the heap just increasing because there is lots of space and no need to GC, or to GC aggressively enough.
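
For instance, a capped re-run makes the distinction visible (the file name and the 100 MB figure below are placeholders): a genuine leak keeps climbing in the trace and still ends in an out-of-memory crash, while a heap that was only growing for lack of GC pressure levels off below the new limit.

```
# default limits: the heap may grow simply because there is little pressure to collect
node --trace-gc app.js

# capped old space: forces the collector to work, so a real leak fails fast
node --trace-gc --max-old-space-size=100 app.js
```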

Member Author

@mhdawson, added another example. PTAL.

@gireeshpunathil
Member

ping @mhdawson, @Flarna - looks like your review comments are addressed; could you please have another look?

@mmarchini
Contributor

mmarchini commented Apr 9, 2020

Do we want to document a V8 flag that is not public API, given that we have no control over it being removed or changed?

Edit: nevermind, it's already documented

@HarshithaKP
Member Author

#node --v8-options | grep "trace-gc "
  --trace-gc (print one trace line following each garbage collection)
#

it is documented here.

gireeshpunathil pushed a commit that referenced this pull request Apr 10, 2020
PR-URL: #372
Reviewed-By: Gireesh Punathil <[email protected]>
Reviewed-By: Gerhard Stöbich <[email protected]>
@gireeshpunathil
Member

landed in 4e8b0bc, thanks for the contribution!
