Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add Goroutine stack inspector to admin/monitor #19207

Merged
merged 37 commits into from
Mar 31, 2022

Conversation

zeripath
Copy link
Contributor

@zeripath zeripath commented Mar 24, 2022

Continues on from #19202.

Following the addition of pprof labels we can now more easily understand the relationship between a goroutine and the requests that spawn them.

This PR takes advantage of the labels and adds a few others, then provides a mechanism for the monitoring page to query the pprof goroutine profile.

The binary profile that results from this profile is immediately piped in to the google library for parsing this and then stack traces are formed for the goroutines.

If the goroutine is within a context or has been created from a goroutine within a process context it will acquire the process description labels for that process.

The goroutines are mapped with there associate pids and any that do not have an associated pid are placed in a group at the bottom as unbound.

In this way we should be able to more easily examine goroutines that have been stuck.

A manager command gitea manager processes is also provided that can export the processes (with or without stacktraces) to the command line.

Screenshots

Screenshot from 2022-03-24 21-26-50

Screenshot from 2022-03-25 18-14-23

Screenshot from 2022-03-25 18-15-17

Screenshot from 2022-03-25 18-15-41

@zeripath
Copy link
Contributor Author

Now it would be probably helpful to add a vuejs component around this that would allow easy filtering but I think this might represent a good starting point.

@GiteaBot GiteaBot added the lgtm/need 2 This PR needs two approvals by maintainers to be considered for merging. label Mar 24, 2022
Using the pprof labels we can now add a stacktrace which can show these
labels and name go-routines by the labels if there is a description set.

Signed-off-by: Andrew Thornton <[email protected]>
@zeripath
Copy link
Contributor Author

I've made substantial changes to this to hopefully make the stack trace more friendly

@zeripath zeripath added the type/feature Completely new functionality. Can only be merged if feature freeze is not active. label Mar 26, 2022
@zeripath zeripath added this to the 1.17.0 milestone Mar 26, 2022
Copy link
Contributor

@Gusted Gusted left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After reading the code twice, I'm not sure if I'm actually understanding what's going on. I have a general idea, but the specifics are 🤷🏽.

modules/process/error.go Show resolved Hide resolved
modules/process/manager_exec.go Show resolved Hide resolved
modules/process/manager_exec.go Show resolved Hide resolved
modules/process/manager_stacktraces.go Outdated Show resolved Hide resolved
modules/process/manager_stacktraces.go Show resolved Hide resolved
modules/process/manager_stacktraces.go Show resolved Hide resolved
modules/process/manager_stacktraces.go Outdated Show resolved Hide resolved
modules/process/manager_stacktraces.go Show resolved Hide resolved
modules/process/manager_stacktraces.go Outdated Show resolved Hide resolved
routers/private/manager_process.go Outdated Show resolved Hide resolved
Copy link
Contributor

@wxiaoguang wxiaoguang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Generally LGTM. But it much larger than I thought ..... +1,424 −571 about 900 lines.

As an internal framework level mechanism, personally I think sometimes enough (simple) is better than perfect (complex) 😊

@GiteaBot GiteaBot added lgtm/need 1 This PR needs approval from one additional maintainer to be merged. and removed lgtm/need 2 This PR needs two approvals by maintainers to be considered for merging. labels Mar 29, 2022
@zeripath
Copy link
Contributor Author

Most of the code is gorutine and context wrapping labelling outside of modules/process. The merging and handling of the profile.Sample is unfortunate but at least means that the object tree makes some sense.

We have to do that if we want to merge in the process state in to the goroutine stacks, and we need to do that merging with the process manager locked otherwise the results are racey and you'll end up not knowing if you're seeing a goroutine leak or not.

Copy link
Contributor

@Gusted Gusted left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yup, feel confident this can be merged into the codebase without causing too much WTF's during reading this code again.

image

@GiteaBot GiteaBot added lgtm/done This PR has enough approvals to get merged. There are no important open reservations anymore. and removed lgtm/need 1 This PR needs approval from one additional maintainer to be merged. labels Mar 29, 2022
@6543
Copy link
Member

6543 commented Mar 31, 2022

@zeripath ☝️ one of the latest changes did confict with modules/process/manager.go - I think the last atempt to resolve did go wrong ...

@wxiaoguang
Copy link
Contributor

one of the latest changes did confict with modules/process/manager.go - I think the last atempt to resolve did go wrong ...

I did the merge again and resolved the conflicts.

@6543
Copy link
Member

6543 commented Mar 31, 2022

🚀

@6543 6543 merged commit c88547c into go-gitea:main Mar 31, 2022
@zeripath zeripath deleted the pprof-labels-stacks branch March 31, 2022 17:58
zjjhot added a commit to zjjhot/gitea that referenced this pull request Apr 1, 2022
* giteaoffical/main:
  Fix broken of team create (go-gitea#19288)
  Remove `git.Command.Run` and `git.Command.RunInDir*` (go-gitea#19280)
  Performance improvement for add team user when org has more than 1000 repositories (go-gitea#19227)
  [skip ci] Updated translations via Crowdin
  Update JS dependencies (go-gitea#19281)
  Fix container download counter (go-gitea#19287)
  go.mod: update kevinburke/ssh_config to v1.2.0 (go-gitea#19286)
  Fix global packages enabled avaiable (go-gitea#19276)
  Add Goroutine stack inspector to admin/monitor (go-gitea#19207)
  Move checks for pulls before merge into own function (go-gitea#19271)
  Restore user autoregistration with email addresses (go-gitea#19261)
  Improve sync performance for pull-mirrors (go-gitea#19125)
  Refactor `git.Command.Run*`, introduce `RunWithContextString` and `RunWithContextBytes` (go-gitea#19266)
  Move reaction to models/issues/ (go-gitea#19264)
@go-gitea go-gitea locked and limited conversation to collaborators Apr 28, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
lgtm/done This PR has enough approvals to get merged. There are no important open reservations anymore. type/feature Completely new functionality. Can only be merged if feature freeze is not active.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants