-
Notifications
You must be signed in to change notification settings - Fork 1k
Conversation
6e77029
to
fe2e56f
Compare
Sorry this took so long to look at! Remember when we chatted and were like "it seems like a 1 line fix but maybe we are missing something?" 😀 I've realized that these log messages aren't in the right spot. For a small codebase, it looks right but after running on a bigger one (e.g. run The log message alone won't be enough, here's a bit of context on why and what's really happening:
Since the goal is to give feedback, maybe we can ignore the import pathway because dep is already printing information about each dependency as it is imported. There isn't a long period of time where the importer does work but doesn't print anything. The problem area is when we aren't importing anything, then when we call solve during init, it first syncs everything and we get a long period of no output. I suggest that you look into doing an explicit prefetch when we aren't doing an import (that would be when rootM is nil in the function below), making sure to follow Sam's original suggestions on concurrency which are in the original issue (and his comment linked from that issue). Lines 38 to 53 in 0979f44
If this is more than you have time to work on, I realize that I was really late in catching this... Just let me know and I can throw this issue back up for grabs. |
@carolynvs I should have some time to look in to this deeper this weekend. |
5dce6ce
to
8bedb08
Compare
@carolynvs I have the basic functionality working. I just need to wrap with concurrency and use golang.org/x/sync/errgroup as Sam mentioned. |
cmd/dep/root_analyzer.go
Outdated
logger := a.ctx.Err | ||
|
||
syncDep := func(pr gps.ProjectRoot, sm gps.SourceManager) { | ||
message := fmt.Sprintf("Cached %s", pr) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of using fmt directly here and elsewhere in this file, use a.ctx.Err
. This let's us capture/discard output more easily in our tests.
cmd/dep/root_analyzer.go
Outdated
return strings.HasPrefix(s, prefix+"/") | ||
} | ||
|
||
func isStdLib(path string) bool { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This function is already defined in github.com/golang/dep/internal/gps/paths
as IsStandardImportPath
.
cmd/dep/root_analyzer.go
Outdated
logger.Printf(message) | ||
} | ||
|
||
for _, v := range pkgT.Packages { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that you can usea.directDeps
instead of pkgT, which is the set of repositories represented in by the import paths in pkgT
. Then we don't need to pass in pkgT
or deal with parsing project roots, or skipping duplicates.
@carolynvs So Ive added the suggested changes as well as the concurrency with max 4 go routines firing the cache calls. For small projects it works well, and fast. However with the kubernetes project I am seeing a bug that I am still working to identify. It seems to fail and hang when trying to get https://bitbucket.org/bertimus9/systemstat. Ive waited for 30+ minutes and it still continues to hang. Output below showing the issue I am describing. Still looking in to it...
|
Update on the error reported above. While it is hanging if I push enter or type 'yes' then the kubernetes dep https://bitbucket.org/bertimus9/systemstat will download as expected. Is this a known bug with deps that require authenticity and require user input to continue? Also notice at the end there is also another issue. This seems unrelated to my changes.
|
Yes, it is expected that the ca cert is trusted for each repo. I added bitbucket.org to my trusted certs so that I don't see that prompt anymore. As for k8's actually working with dep, it doesn't, sorry I should have mentioned that... 😞 It is just a good test for a project with lots of dependencies. I'll take a look in a bit, but if those are the only two issues you ran into, sounds like we are in a good spot! 👍 |
Awesome! The only thing I havent looked in to yet was Sams mention of |
The reason to use an errgroup is that when an error/timeout occurs, everything is cancelled:
https://godoc.org/golang.org/x/sync/errgroup#WithContext Here's an example of how to use errorgroup in the dep codebase: Line 71 in a445d4d
And here's a post that walks through how it all works: https://www.oreilly.com/learning/run-strikingly-fast-parallel-file-searches-in-go-with-sync-errgroup |
@carolynvs Awesome! Thank you for sharing that. Glad to have learned something new and useful 😄 New output from kuberenetes
|
cmd/dep/root_analyzer.go
Outdated
"github.com/golang/dep/internal/importers" | ||
) | ||
|
||
const concurrency = 4 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's move this definition into cacheDeps
. If we end up needing it in more places later, we can extract it out then.
cmd/dep/root_analyzer.go
Outdated
@@ -45,6 +52,9 @@ func (a *rootAnalyzer) InitializeRootManifestAndLock(dir string, pr gps.ProjectR | |||
|
|||
if rootM == nil { | |||
rootM = dep.NewManifest() | |||
if err := a.cacheDeps(pr); err != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we need a comment that explains why we are doing this right here. 😀
// Since we didn't find anything to import, dep's cache is empty.
// We are prefetching dependencies and logging so that the subsequent solve step doesn't spend a long time retrieving dependencies without feedback for the user.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just realized I forgot to push the code with this comment. Will add tonight.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
added comment
cmd/dep/root_analyzer.go
Outdated
|
||
for ip := range a.directDeps { | ||
logger.Printf("Package %q, analyzing...", ip) | ||
if paths.IsStandardImportPath(ip) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
remove this check
cmd/dep/root_analyzer.go
Outdated
if paths.IsStandardImportPath(ip) { | ||
continue | ||
} | ||
if hasImportPathPrefix(ip, string(pr)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
remove this check
cmd/dep/root_analyzer.go
Outdated
continue | ||
} | ||
|
||
pr, err := a.sm.DeduceProjectRoot(ip) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
remove this
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So removing this guy seems to be breaking the underlying call SyncSourceFor
called by the new syncDep
func. Im still investigating
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ive been trying to use type conversion to pass a "gps.ProjectRoot" without the call to DeduceProjectRoot
.
err := syncDep(gps.ProjectRoot(ip), a.sm)
Doing this causes one of the deps to cache over and over again and then end the entire process. Digging in to this now.
Cached github.com/docker/spdystream
Cached github.com/fsnotify/fsnotify
Cached github.com/fsnotify/fsnotify
Cached github.com/xanzy/go-cloudstack
Cached github.com/spf13/cobra
Cached github.com/spf13/cobra
Cached github.com/spf13/cobra
Cached github.com/spf13/cobra
Cached github.com/spf13/cobra
Cached github.com/spf13/cobra
Cached github.com/spf13/cobra
Cached github.com/spf13/cobra
Cached github.com/spf13/cobra
Cached github.com/spf13/cobra
Cached github.com/spf13/cobra
Cached github.com/spf13/cobra
Cached github.com/spf13/cobra
Cached github.com/spf13/cobra
Cached github.com/spf13/cobra
Cached github.com/spf13/cobra
Cached github.com/spf13/cobra
Cached github.com/spf13/cobra
Cached github.com/spf13/cobra
Cached github.com/spf13/cobra
Cached github.com/spf13/cobra
Cached github.com/spf13/cobra
Cached github.com/spf13/cobra
Cached github.com/spf13/cobra
Cached github.com/spf13/cobra
Cached github.com/spf13/cobra
Cached github.com/spf13/cobra
Cached github.com/spf13/cobra
Cached github.com/spf13/cobra
Cached github.com/spf13/cobra
Cached github.com/spf13/cobra
Cached github.com/spf13/cobra
Cached github.com/spf13/cobra
Cached github.com/spf13/cobra
Cached github.com/spf13/cobra
Cached github.com/spf13/cobra
Cached github.com/spf13/cobra
Cached github.com/spf13/cobra
Cached github.com/spf13/cobra
Cached github.com/spf13/cobra
Cached github.com/spf13/cobra
Cached github.com/spf13/cobra
Cached github.com/spf13/cobra
Cached github.com/spf13/cobra
Cached github.com/spf13/cobra
Cached github.com/spf13/cobra
Cached github.com/spf13/cobra
Cached github.com/spf13/cobra
Cached github.com/spf13/cobra
Cached github.com/spf13/cobra
Cached github.com/spf13/cobra
Cached github.com/spf13/cobra
Cached github.com/spf13/cobra
Cached github.com/spf13/cobra
Cached github.com/spf13/cobra
Cached github.com/spf13/cobra
Cached github.com/spf13/cobra
Cached github.com/spf13/cobra
Cached github.com/spf13/cobra
Cached github.com/spf13/cobra
Cached github.com/spf13/cobra
Cached github.com/spf13/cobra
Cached github.com/spf13/cobra
Cached github.com/spf13/cobra
Cached github.com/spf13/cobra
Cached github.com/spf13/cobra
Cached github.com/spf13/cobra
Cached github.com/spf13/cobra
Cached github.com/spf13/cobra
Cached github.com/spf13/cobra
Cached github.com/spf13/cobra
Cached github.com/spf13/cobra
Cached github.com/spf13/cobra
Cached github.com/spf13/cobra
Cached github.com/spf13/cobra
Cached github.com/spf13/cobra
Cached github.com/spf13/cobra
Cached github.com/spf13/cobra
Cached github.com/spf13/cobra
Cached github.com/spf13/cobra
Cached github.com/spf13/cobra
Cached github.com/spf13/cobra
Cached github.com/spf13/cobra
Cached github.com/spf13/cobra
Cached github.com/spf13/cobra
Cached github.com/spf13/cobra
Cached github.com/spf13/cobra
Cached github.com/spf13/cobra
Cached github.com/spf13/cobra
Cached github.com/spf13/cobra
Cached github.com/spf13/cobra
Cached github.com/spf13/cobra
Cached github.com/spf13/cobra
Cached github.com/spf13/cobra
Cached github.com/spf13/cobra
Cached github.com/spf13/cobra
Cached github.com/spf13/cobra
Cached github.com/spf13/cobra
Cached github.com/spf13/cobra
Cached github.com/spf13/cobra
Cached github.com/spf13/cobra
Cached github.com/spf13/cobra
Cached github.com/spf13/cobra
Cached github.com/spf13/cobra
Cached github.com/spf13/cobra
Cached github.com/spf13/cobra
Cached github.com/spf13/cobra
Cached github.com/spf13/cobra
Cached github.com/spf13/cobra
Cached github.com/spf13/cobra
Cached github.com/spf13/cobra
Cached github.com/spf13/cobra
Cached github.com/spf13/cobra
Cached github.com/spf13/cobra
Cached github.com/spf13/cobra
Cached github.com/spf13/cobra
Cached github.com/spf13/cobra
Cached github.com/spf13/cobra
Cached github.com/spf13/cobra
Cached github.com/spf13/cobra
Cached github.com/spf13/cobra
Cached github.com/spf13/cobra
Cached github.com/spf13/cobra
Cached github.com/spf13/cobra
Cached github.com/spf13/cobra
Cached github.com/spf13/cobra
Cached github.com/spf13/cobra
Cached github.com/spf13/cobra
Cached github.com/spf13/cobra
Cached github.com/spf13/cobra
Cached github.com/spf13/cobra
Cached github.com/spf13/cobra
Cached github.com/spf13/cobra
Successfully cached all deps.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Odd, that is the right thing to do to cast to a ProjectRoot. I wonder if that's really due to other changes you've made to the code?
I really wish that when I had refactored the importers that I had fixed the type on directDeps and used gps.ProjectRoot instead of string. 😀
cmd/dep/root_analyzer.go
Outdated
return err | ||
} | ||
|
||
if _, ok := deps[pr]; ok { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
remove this check
cmd/dep/root_analyzer.go
Outdated
return nil | ||
} | ||
|
||
for ip := range a.directDeps { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
a.directDeps
is a unique set of external project roots. So you don't need to track if they have been evaluated already, check if they are stdlib imports, or deduce the root again.
Check out
Line 229 in d10af5e
func getDirectDependencies(sm gps.SourceManager, p *dep.Project) (pkgtree.PackageTree, map[string]bool, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh I see. Thanks for clarifying. Can definitely remove all of these extra checks then!
cmd/dep/root_analyzer.go
Outdated
|
||
deps[pr] = true | ||
} | ||
if err := g.Wait(); err == nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't see any handling for when err is not nil?
cmd/dep/root_analyzer.go
Outdated
return nil | ||
} | ||
|
||
func hasImportPathPrefix(s, prefix string) bool { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
remove this
cmd/dep/root_analyzer.go
Outdated
|
||
g.Go(func() error { | ||
select { | ||
case sem <- struct{}{}: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I realize now that the existing usage of using errgroup in dep was not the best/canonical example to have pointed you to.
I would rather not use this extra sem channel to handle bounded parallelism and instead use the pattern from the errgroup documentation:
https://godoc.org/golang.org/x/sync/errgroup#example-Group--Pipeline
In that example, they start a fixed number of goroutines instead of a goroutine per work item (look for numDigesters
), and range over a channel of input values (in our case that would be the project roots).
@carolynvs Added in concurrency changes for errGroup. Demo of latest output when calling dep init on kubernetes.
|
db0b31c
to
443f6f0
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great! 👍
I have just some very minor changes, everything is functionally correct.
cmd/dep/root_analyzer.go
Outdated
defer close(deps) | ||
for ip := range a.directDeps { | ||
logger.Printf("Package %q, analyzing...", ip) | ||
pr, err := a.sm.DeduceProjectRoot(ip) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
directDeps
is a map of project roots (see below for how it is populated), so you can change the for loop to for pr := range a.directDeps
, and remove the call to DeduceProjectRoot
.
Lines 238 to 242 in bf8f295
pr, err := sm.DeduceProjectRoot(ip) | |
if err != nil { | |
return pkgtree.PackageTree{}, nil, err | |
} | |
directDeps[string(pr)] = true |
cmd/dep/root_analyzer.go
Outdated
return nil | ||
}) | ||
|
||
go func() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The godoc example used this goroutine with g.Wait()
to ensure that they closed the output channel when everything was done (or there was an error). Since we aren't using that, I believe 107-109 can be removed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
correct. nice catch. This was leftover cruft from when I was working out some bugs in my concurrency
cmd/dep/root_analyzer.go
Outdated
logger.Printf("Unable to cache %s - %s", pr, err) | ||
return err | ||
} | ||
logger.Printf("Cached %s", pr) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking at the output, I think it would be clearer if we cut down a bit on how much we are printing. Let's remove this print statement so that we are only logging when we start caching a package, and when the caching fails.
So the output would look more like this:
Caching package github.com/example/A
Caching package github.com/example/B
Caching package github.com/example/C
Successfully cached all deps.
cmd/dep/root_analyzer.go
Outdated
g.Go(func() error { | ||
defer close(deps) | ||
for ip := range a.directDeps { | ||
logger.Printf("Package %q, analyzing...", ip) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's change this to Caching package %q
because we aren't actually analyzing anything.
@carolynvs comments address. New output after changes:
|
Yay! Thank you for adding this, it should really help with the "wtf is dep doing?!" moments. 🎉 💖 |
Include #1176 in the changelog
What does this do / why do we need it?
Provides some output during dep init to help soften the user experience where people think the tool is hanging.
What should your reviewer look out for in this PR?
For now this WIP PR is just to start the discussion of how to bring back functionality similar to what was added before in.
https://github.com/golang/dep/pull/194/files#diff-b14a2eb16f1efbc6aa9f1ffdad724265R241
Do you need help or clarification on anything?
Yes. For now I just have something very rudimentary that displays when a new direct dep found. To regain functionality from #194 there also needs to be output when transitive deps are discovered. I believed this should be done in
s.Solve()
. Still digging in.Which issue(s) does this PR fix?
fixes #1144
DEMO