Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Option to select source files directory #262

Closed
mkevac opened this issue Nov 16, 2017 · 17 comments · Fixed by #366
Closed

Option to select source files directory #262

mkevac opened this issue Nov 16, 2017 · 17 comments · Fixed by #366
Assignees

Comments

@mkevac
Copy link

mkevac commented Nov 16, 2017

What version of pprof are you using?

tip from https://github.com/google/pprof

What operating system and processor architecture are you using?

Linux 64bit

I would like to have a way to tell pprof where it can find source files.

Imagine I have some service running on the server and I have a binary that is running there.

I can use pprof like this:

$ pprof -http=:8080 /tmp/lakafka-1.22.0-278 /tmp/prof

Where /tmp/lakafka-1.22.0-278 is a binary built on a build server and /tmp/prof is a profile collected via wget from http://.../debug/pprof/profile.

If I then go to View -> Source I don't see any source code. pprof is looking for the source code in the path from build server, e.g. /home/teamcitycpp/agent09/work/56cbaf90675b10ff/_gopath/src/badoo/lakafka/main.go

But I am not on a build server. I have the same source code at, lets say, /home/marko/lakafka.

I would like to tell pprof to look for the source code at /home/marko/lakafka.

Similar to what dir command does in GDB.

Thank you!

@rauls5382
Copy link
Contributor

You can use pprof -source_path=/home/marko/lakafka ... to specify the path for your source tree.
I see that pprof -help doesn't list this. I'll send a commit to include it.

@rauls5382
Copy link
Contributor

Actually, it is listed in pprof -help, so no need for any changes.

@aalexand
Copy link
Collaborator

I think -source_path currently does not help to locate source files that have their path recorded as absolute path in the debug information which seems to be the case here. See

if filepath.IsAbs(path) {
where there seems to be a short-circuit for the absolute path case.

I also find the logic of handling the search paths here a bit inconvenient: it attempts to append the file path from the debug information to each non-empty path prefix of each search path. So if one specifies -source_path=/home/aalexand/src:/usr/local/share/company/src and the file being searched is foo/bar/baz.c, the following paths will be tried:

/home/aalexand/src/foo/bar/baz.c
/home/aalexand/foo/bar/baz.c
/home/foo/bar/baz.c
/foo/bar/baz.c
/usr/local/share/company/src/foo/bar/baz.c
/usr/local/share/company/foo/bar/baz.c
/usr/local/share/foo/bar/baz.c
/usr/local/foo/bar/baz.c
/usr/foo/bar/baz.c
/foo/bar/baz.c

@rauls5382 I am curious what are the use cases for this search strategy? Generally, looking up the content in the parent directories of the search paths the user specified is not a very good thing as some of those parent directories could be network automount directories and looking up random paths in those is super-slow.

What I think I would prefer as a user instead, is trying every possible suffix of the path being searched, breadth-first. So for my example, the following paths would be tried:

/home/aalexand/src/foo/bar/baz.c
/usr/local/share/company/src/foo/bar/baz.c
/home/aalexand/src/bar/baz.c
/usr/local/share/company/src/bar/baz.c
/home/aalexand/src/baz.c
/usr/local/share/company/src/baz.c

That avoids the issue with the lookup of arbitrary paths in parent directories, Also, with that, I'd also remove the check for whether the path is absolute. This would make the reported case supported as the user could specify now -search_path=/home/marko/lakafka/gopath and the search for file /home/teamcitycpp/agent09/work/56cbaf90675b10ff/_gopath/src/badoo/lakafka/main.go would work as

/home/marko/lakafka/gopath/home/teamcitycpp/agent09/work/56cbaf90675b10ff/_gopath/src/badoo/lakafka/main.go
/home/marko/lakafka/gopath/teamcitycpp/agent09/work/56cbaf90675b10ff/_gopath/src/badoo/lakafka/main.go
/home/marko/lakafka/gopath/agent09/work/56cbaf90675b10ff/_gopath/src/badoo/lakafka/main.go
/home/marko/lakafka/gopath/work/56cbaf90675b10ff/_gopath/src/badoo/lakafka/main.go
/home/marko/lakafka/gopath/56cbaf90675b10ff/_gopath/src/badoo/lakafka/main.go
/home/marko/lakafka/gopath/_gopath/src/badoo/lakafka/main.go
/home/marko/lakafka/gopath/src/badoo/lakafka/main.go  <-- FOUND

@aalexand aalexand reopened this Nov 16, 2017
@aalexand
Copy link
Collaborator

Alternatively, we could make the behavior of the -search_path consistent with GDB behavior but for relative path example I had above that would be only

/home/aalexand/src/baz.c
/usr/local/share/company/src/baz.c

which I think is inconvenient which is why GDB also has the option to configure substitution rules, see the same link. The solution I suggested may be a simpler compromise.

@rauls5382
Copy link
Contributor

The idea behind the current search is to allow the user to specify a path inside their copy of the source tree and have things work, without having to finely line up mount points. This works pretty well with the default ($CWD), so if you're standing anywhere inside a copy of the source tree things just work.

I agree that we should fix the handling of absolute paths. Perhaps adding an option for some simple substitution?

About trimming profile paths from the left, I would worry about there being multiple files in the tree with the same set of trailing directories. Say you have a tree with:

util/config.h
cool-library/util/config.h

I would want to avoid resolving cool-library/util/config.h to util/config.h

@aalexand
Copy link
Collaborator

Allowing to specify an arbitrary path within the source root of interest doesn't sound too convincing of a heuristic to me unless it enables some use case rather than having just a utility function. And in some cases it could result in unusual behavior - e.g. if there is a source file with relative path like dev/myfile.c and any search path is specified, the code will end up looking files up in the /dev directory which is somewhat unexpected to me.

If there is a tree /src/util/config.h and /src/cool-library/util/config.h and relative path cool-library/util/config.h is searched, the right file would be found as longer paths are tried first. Wrong file could be found if there is only /src/util/config.h on disk but not /src/cool-library/util/config.h.

@MikeSpreitzer
Copy link

I also have the problem of running pprof on a machine that is not the build machine, and having absolute paths in the profile. Here is an example stack frame from a profile I am studying:

0xe5543e k8s.io/kubernetes/vendor/k8s.io/apiserver/pkg/storage.(*cacheWatcher).process+0x36e /go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/k8s.io/apiserver/pkg/storage/cacher.go:947

So far, I have been able to hack the machine running pprof so that the build paths exist on my pprof machine --- but I do not enjoy doing this. Would it make sense to introduce an option to simply ignore the absolute paths and do a lookup based on the source_path and that other identifier?

@robfig
Copy link

robfig commented Mar 16, 2018

This is an issue for me too. Our Go binaries are produced on a CI server, and I often take profiles from production but there's no way afaik to get the Source view.

Maintainers (@rauls5382 ?) Is there an acceptable functional design, such that I could contribute a Pull Request?

@aalexand
Copy link
Collaborator

Apologies about the radio silence. Yeah, time to figure this one out.

My preference is still what I suggested in #262 (comment) since it addresses every use case I can think of with just a slight tweak of the current behavior of -search_path and does not introduce new flags. @rauls5382 Objections to that?

@rauls5382
Copy link
Contributor

The use case that I want to make sure is preserved is when the user's cwd is somewhere inside their local build tree and the profile has relative paths. Today that 'just works' without having to cd to the root of the local tree or specifying the source_path option. I think this proposal will break that flow.

I think the fundamental problem is that we do not know how much of the remote path is irrelevant an should be stripped. Could we perhaps implement some heuristics and provide an option to override?

Say that if the user specifies source_tree=/my/tree/lakafka and we see remote paths that include /.../lakafka/... we could automatically strip the remote paths to the left.

@aalexand
Copy link
Collaborator

I see -- I agree breaking the current behavior is not a good idea.

The proposed heuristic might be not reliable if the anchor directory name is something ambiguous like "src".

Another solution is introduce a new option like you suggested before. Something like -rewrite_path=/home/teamcitycpp/agent09/work/56cbaf90675b10ff/_gopath/src/badoo/lakafka=/home/marko/lakafka (or -rewrite_path=/home/teamcitycpp/agent09/work/56cbaf90675b10ff/_gopath/src/badoo/lakafka=lakafka to turn the absolute path into relative and have default $CWD-based search path do the rest when running the pprof command from the source directory).

This could also have some limited support for globs to allow something like -rewrite_path=/*/_gopath/src/badoo/lakafka=/home/marko/lakafka to make the rewrite option independent of the random components like in this example.

@rauls5382
Copy link
Contributor

SGTM. I agree that a heuristic can be defeated but we would have an option override in the (hopefully infrequent) corner cases.

How about if instead of a full rewrite_path option we have a --remote_source_path that says what we should strip from absolute paths to turn them into relative paths, and then let source_path work as it does today. Have remote_source_path default to a good guess based on the source_path.

My concern with rewrite_path is how it interacts with source_path. I'd rather have something more orthogonal.

@aalexand aalexand self-assigned this Mar 20, 2018
@aalexand
Copy link
Collaborator

Despite the radio silence, I do have a change for this, will send it soon.

@fmstephe
Copy link

@aalexand If you need any help, or testing or code review give me a bell. I am very motivated to get this change. We are running a fork of pprof right now to be able to see source built on other machines, and I would love to be able to use the mainline version.

aalexand added a commit to aalexand/pprof that referenced this issue Apr 27, 2018
When pprof is asked to show annotated source for a profile collected on
another machine for a Go program, the profile contains absolute paths
which may not exist on the local machine. In that case pprof currently
fails to locate the source with no option to help it. The new option
adds a way to specify one or several source path prefixes that should be
trimmed from the source paths in the profile before applying the search
using search path option.

For example, taking the example from the issue where the source file
path in the profile is
/home/teamcitycpp/agent09/work/56cbaf90675b10ff/_gopath/src/badoo/lakafka/main.go
and the local path is /home/marko/lakafka/main.go. The user would
specify
`-trim-path=/home/teamcitycpp/agent09/work/56cbaf90675b10ff/_gopath/src/badoo`
to make pprof find the source. The source path doesn't need to be
specified if the current working dir is anything at or under
`/home/marko/`.

TODO: Heuristical default value for the trim path.

Fixes google#262.
aalexand added a commit to aalexand/pprof that referenced this issue Apr 27, 2018
When pprof is asked to show annotated source for a profile collected on
another machine for a Go program, the profile contains absolute paths
which may not exist on the local machine. In that case pprof currently
fails to locate the source with no option to help it. The new option
adds a way to specify one or several source path prefixes that should be
trimmed from the source paths in the profile before applying the search
using search path option.

For example, taking the example from the issue where the source file
path in the profile is
/home/teamcitycpp/agent09/work/56cbaf90675b10ff/_gopath/src/badoo/lakafka/main.go
and the local path is /home/marko/lakafka/main.go. The user may
specify
`-trim-path=/home/teamcitycpp/agent09/work/56cbaf90675b10ff/_gopath/src/badoo`
to make pprof find the source. The source path doesn't need to be
specified if the current working dir is anything at or under
`/home/marko/`.

When the trim path is not specified, it is guessed heuristically based
on the basename of configured searched paths. In the example above,
setting `-search-path=/home/marko/lakafka` would be sufficient to
activate the heuristic successfully. Or having the current directory as
`/home/marko/lakafka` since the search path is by default set to the
current working directory. Note that the heuristic currently
does not attempt to walk the configured search paths up like the search
does. This is to keep it simple, use `-trim-path` explicitly in more
complicated cases.

Fixes google#262.
aalexand added a commit to aalexand/pprof that referenced this issue Apr 27, 2018
When pprof is asked to show annotated source for a profile collected on
another machine for a Go program, the profile contains absolute paths
which may not exist on the local machine. In that case pprof currently
fails to locate the source with no option to help it. The new option
adds a way to specify one or several source path prefixes that should be
trimmed from the source paths in the profile before applying the search
using search path option.

For example, taking the example from the issue where the source file
path in the profile is
/home/teamcitycpp/agent09/work/56cbaf90675b10ff/_gopath/src/badoo/lakafka/main.go
and the local path is /home/marko/lakafka/main.go. The user may
specify
`-trim-path=/home/teamcitycpp/agent09/work/56cbaf90675b10ff/_gopath/src/badoo`
to make pprof find the source. The source path doesn't need to be
specified if the current working dir is anything at or under
`/home/marko/`.

When the trim path is not specified, it is guessed heuristically based
on the basename of configured searched paths. In the example above,
setting `-search-path=/home/marko/lakafka` would be sufficient to
activate the heuristic successfully. Or having the current directory as
`/home/marko/lakafka` since the search path is by default set to the
current working directory. Note that the heuristic currently
does not attempt to walk the configured search paths up like the search
does. This is to keep it simple, use `-trim-path` explicitly in more
complicated cases.

Fixes google#262.
@aalexand
Copy link
Collaborator

@fmstephe FYI #366

aalexand added a commit to aalexand/pprof that referenced this issue May 3, 2018
When pprof is asked to show annotated source for a profile collected on
another machine for a Go program, the profile contains absolute paths
which may not exist on the local machine. In that case pprof currently
fails to locate the source with no option to help it. The new option
adds a way to specify one or several source path prefixes that should be
trimmed from the source paths in the profile before applying the search
using search path option.

For example, taking the example from the issue where the source file
path in the profile is
/home/teamcitycpp/agent09/work/56cbaf90675b10ff/_gopath/src/badoo/lakafka/main.go
and the local path is /home/marko/lakafka/main.go. The user may
specify
`-trim-path=/home/teamcitycpp/agent09/work/56cbaf90675b10ff/_gopath/src/badoo`
to make pprof find the source. The source path doesn't need to be
specified if the current working dir is anything at or under
`/home/marko/`.

When the trim path is not specified, it is guessed heuristically based
on the basename of configured searched paths. In the example above,
setting `-search-path=/home/marko/lakafka` would be sufficient to
activate the heuristic successfully. Or having the current directory as
`/home/marko/lakafka` since the search path is by default set to the
current working directory. Note that the heuristic currently
does not attempt to walk the configured search paths up like the search
does. This is to keep it simple, use `-trim-path` explicitly in more
complicated cases.

Fixes google#262.
aalexand added a commit that referenced this issue May 3, 2018
Add "trim path" option which can be used to relocate sources.

When pprof is asked to show annotated source for a profile collected on
another machine for a Go program, the profile contains absolute paths
which may not exist on the local machine. In that case pprof currently
fails to locate the source with no option to help it. The new option
adds a way to specify one or several source path prefixes that should be
trimmed from the source paths in the profile before applying the search
using search path option.

For example, taking the example from the issue where the source file
path in the profile is
/home/teamcitycpp/agent09/work/56cbaf90675b10ff/_gopath/src/badoo/lakafka/main.go
and the local path is /home/marko/lakafka/main.go. The user may
specify
`-trim-path=/home/teamcitycpp/agent09/work/56cbaf90675b10ff/_gopath/src/badoo`
to make pprof find the source. The source path doesn't need to be
specified if the current working dir is anything at or under
`/home/marko/`.

When the trim path is not specified, it is guessed heuristically based
on the basename of configured searched paths. In the example above,
setting `-search-path=/home/marko/lakafka` would be sufficient to
activate the heuristic successfully. Or having the current directory as
`/home/marko/lakafka` since the search path is by default set to the
current working directory. Note that the heuristic currently
does not attempt to walk the configured search paths up like the search
does. This is to keep it simple, use `-trim-path` explicitly in more
complicated cases.

Fixes #262.
@ilya-korotya
Copy link

@aalexand, why I got this error? I run:

go tool pprof -source_path=/home/korotya-ilya/go/src/github.com/ilya-korotya/highloadcup_2018 highloadcup_2018 profile.out

But when I run list command in pprof, I get error message:

ROUTINE ======================== github.com/ilya-korotya/highloadcup_2018/postgres.changeSymbol in /go/src/github.com/ilya-korotya/highloadcup_2018/postgres/init.go
         0      640ms (flat, cum) 36.99% of Total
 Error: open /go/src/github.com/ilya-korotya/highloadcup_2018/postgres/init.go: no such file or directory

Why source_path flag don't help change path to source files? Could you help me with this?

@aalexand
Copy link
Collaborator

aalexand commented Jun 1, 2019

@ilya-korotya Please file a separate report with instructions on how to reproduce.

gmarin13 pushed a commit to gmarin13/pprof that referenced this issue Dec 17, 2020
…#366)

Add "trim path" option which can be used to relocate sources.

When pprof is asked to show annotated source for a profile collected on
another machine for a Go program, the profile contains absolute paths
which may not exist on the local machine. In that case pprof currently
fails to locate the source with no option to help it. The new option
adds a way to specify one or several source path prefixes that should be
trimmed from the source paths in the profile before applying the search
using search path option.

For example, taking the example from the issue where the source file
path in the profile is
/home/teamcitycpp/agent09/work/56cbaf90675b10ff/_gopath/src/badoo/lakafka/main.go
and the local path is /home/marko/lakafka/main.go. The user may
specify
`-trim-path=/home/teamcitycpp/agent09/work/56cbaf90675b10ff/_gopath/src/badoo`
to make pprof find the source. The source path doesn't need to be
specified if the current working dir is anything at or under
`/home/marko/`.

When the trim path is not specified, it is guessed heuristically based
on the basename of configured searched paths. In the example above,
setting `-search-path=/home/marko/lakafka` would be sufficient to
activate the heuristic successfully. Or having the current directory as
`/home/marko/lakafka` since the search path is by default set to the
current working directory. Note that the heuristic currently
does not attempt to walk the configured search paths up like the search
does. This is to keep it simple, use `-trim-path` explicitly in more
complicated cases.

Fixes google#262.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

7 participants