Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[wip] sql: add crdb_internal.node_stacks #59301

Closed
wants to merge 1 commit into from

Conversation

tbg
Copy link
Member

@tbg tbg commented Jan 22, 2021

Once we have #55733 and #59200, it will be very useful to be able to
join together, on each node, the long-running trace stacks.

To this end, a crdb_internal.node_stacks table is added.

Release note: None

Once we have cockroachdb#55733 and cockroachdb#59200, it will be very useful to be able to
join together, on each node, the long-running trace *stacks*.

To this end, a `crdb_internal.node_stacks` table is added.

Release note: None
@tbg tbg requested review from knz, irfansharif and angelapwen January 22, 2021 17:05
@cockroach-teamcity
Copy link
Member

This change is Reviewable

@knz
Copy link
Contributor

knz commented Jan 22, 2021

please ping me when there is more to look at here

@tbg
Copy link
Member Author

tbg commented Jan 22, 2021

Will do. Right now this PR just keeps me from forgetting to come back to this.

@tbg
Copy link
Member Author

tbg commented Mar 8, 2021

I'm intrigued by the possibility of writing a SQL query that joins long-running trace spans with the stack of the goroutine id assigned to the span. This isn't at all possible with the profile route taken here (the goroutine profile doesn't contain goroutine ids, it just counts unique stacks basically) but it would be with https://github.com/maruel/panicparse, though that requires calling runtime.Stack which is not resource efficient. cc @cockroachdb/kv-east

@knz
Copy link
Contributor

knz commented Mar 15, 2021

FWIW that's precisely what @felixge and I were talking about today. Probably worth brainstorming this together.

@felixge
Copy link

felixge commented Mar 16, 2021

Cool stuff, love the idea of exposing perf data in SQL 😍. Where does the data in pprof.Lookup("stacks") come from?

https://github.com/maruel/panicparse, though that requires calling runtime.Stack which is not resource efficient.

We're also interested in the data provided by runtime.Stack() (mostly the g.waitsince time to diagnose issues where a program seems "stuck"). I agree that it's not resource efficient : /.

That being said, we decided to go down this route a little bit, but concluded the panicparse library is too big/slow for our needs, so I put this together:

https://github.com/DataDog/dd-trace-go/tree/v1/profiler/internal/stackparse

It's not seen much battle in the real world yet, but I've thrown fuzzing, property based testing, etc. at it. In the end it's about 100x faster than panicparse and 10x less code. If you're interested, I think we can expose this as an importable module. See this comment for more info and design goals.

@tbg
Copy link
Member Author

tbg commented Mar 16, 2021

This PR really isn't much to look at, I ended up not even trying to get the stacks but started poking at the goroutine profile (which is what the "stacks" profile is). But as a goroutine profile, of course it doesn't have info on individual goroutines, it's mostly a map stack -> count.

Have you by any chance explored an approach such as https://play.golang.org/p/6gi26PF3iTT to pull the internals?

@felixge
Copy link

felixge commented Mar 16, 2021

Have you by any chance explored an approach such as https://play.golang.org/p/6gi26PF3iTT to pull the internals?

No, but this is awesome! I've seen the //go:linkname stuff before, but I didn't realize you could use it peak into arbitrary runtime symbols from user-land like this! Of course in some ways that's an even more precarious API than runtime.Stack(), but OTOH it seems manageable when limited to such a small scope.

Thank you so much for sharing this, I'll definitely play around with this approach very soon.

@tbg tbg closed this May 5, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants