Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

generation performance #918

Closed
vikstrous opened this issue Nov 2, 2019 · 16 comments
Closed

generation performance #918

vikstrous opened this issue Nov 2, 2019 · 16 comments
Labels

Comments

@vikstrous
Copy link
Contributor

vikstrous commented Nov 2, 2019

What happened?

We've been using gqlgen for a long time and it has been taking almost 1.5 mins to generate from scratch now and 30 seconds to re-generate with no changes. We have 5 graphql servers that share a lot of resolvers. In total this is still multiple minutes in the best case. This is the slowest part of our go generate process.

The "from scratch" number if after running go clean -r ./....

The machine I'm running on is linux with an SSD and 16 GB ram and 4 cores.

In the from scratch case, it used 600MB memory and in the no changes case, 200MB.

On a less powerful machine, I've given up entirely on running gqlgen.

What did you expect?

Generation time should be measured in seconds, not minutes.

Minimal graphql.schema and models to reproduce

Our project is very large and closed source. I can't share too much. I'll share some stats for one of the servers:

  • ~100 models listed in gqlgen.yaml
  • 1 autobind directory
  • 106 types
  • 35 enums
  • 2 scalars
  • 6200 lines of schema
  • I'm guessing at least 1000 resolvers based on the number of resolvers per line I'm seeing

versions

  • gqlgen 0.10.1
  • go 1.13.3
  • go modules

I'm looking for a few things:

  • Ideas for how to reduce generation time without code changes on either side
  • Ideas for how to reduce generation time with code changes either in gqlgen or our codebase
  • Ideas for how to profile things to get more detailed info about what the slow part is. For now I'll try to fork gqlgen and hack in something like https://golang.org/pkg/runtime/pprof/#hdr-Profiling_a_Go_program to profile it.

Let me know if you guys have a chat I can join so we can discuss ideas.

Edit: here's the output from time -v for the best case scenario with no changes:

        User time (seconds): 76.74
        System time (seconds): 14.24
        Percent of CPU this job got: 275%
        Elapsed (wall clock) time (h:mm:ss or m:ss): 0:33.06
        Average shared text size (kbytes): 0
        Average unshared data size (kbytes): 0
        Average stack size (kbytes): 0
        Average total size (kbytes): 0
        Maximum resident set size (kbytes): 198192
        Average resident set size (kbytes): 0
        Major (requiring I/O) page faults: 1
        Minor (reclaiming a frame) page faults: 869804
        Voluntary context switches: 198697
        Involuntary context switches: 52535
        Swaps: 0
        File system inputs: 0
        File system outputs: 47184
        Socket messages sent: 0
        Socket messages received: 0
        Signals delivered: 0
        Page size (bytes): 4096
        Exit status: 0
@vikstrous
Copy link
Contributor Author

The CPU profile:
cpu.prof.gz
I'm not sure what other performance data would be helpful.

@vikstrous
Copy link
Contributor Author

Another CPU profile:
cpu.prof.gz
In here it's pretty clear that the majority of the time is spent in golang.org/x/tools/packages.

My first idea for improving things is: can we deduplicate the work of loading package information by allowing a single gqlgen command to generate the schemas for all 5 of the servers so that the dependency graph doesn't have to be re-generated 5 times?

@vektah
Copy link
Collaborator

vektah commented Nov 3, 2019

Yeah packages is pretty slow, it shells out to go list.

Do you have any cgo? If so try to put an interface in front of the package so gqlgen doesnt need to walk down that far.

@vikstrous
Copy link
Contributor Author

Hmm... yeah, we do have cgo somewhere down the rabbit hole of dependencies. It's likely imported somewhere. Let me know if there's an easy way to confirm.

@vektah
Copy link
Collaborator

vektah commented Nov 3, 2019

I think you can run with CGO_ENABLED=0 go build and see if it still builds? it may even work the same with gqlgen depending how packages passes it through to go list

@vikstrous
Copy link
Contributor Author

vikstrous commented Nov 10, 2019

Ok, so I have some good news and some bad news. The bad news is that I think that the plugins API needs to be changed. The good news is that I think I improved performance by at least 10x for non-trivial projects. I was able to speed up our go generate in CI by 1.5 mins.

Findings

The main insight that I got after trial and error and messing around with profiling and modifying the gqlgen code is that the vast majority of the time is spent in packages.Load and in build.Import.

I'm opening a pull request to show the changes that I made to reduce the time it takes to run gqlgen on my (weak) laptop from 60 seconds to 6. See #926

The main change I made was to remove redundant calls to package.Load and load all relevant packages up front. This required some changes and may have broken some features, but at least for us it didn't change the output.

NameForPackage has to be changed from a global function with a global cache to a struct so that it can use the package list that's pre-loaded.

I also found the post-validation kind of redundant and it was causing yet another packages.Load, so I disabled it.

We don't use models generation and so I disabled it rather than trying to adapt it to the new upfront loading change.

Immediate action items

  • Load packages only once at the start
  • Allow for models generation to be disabled (with the official binary ideally) or speed up models generation by making it also use the pre-loaded packages list.
  • QualifyPackagePath shouldn't be used in go mod mode because it's slow but is a no-op in go mod mode. Alternatively, a more efficient implementation should be used.
  • Allow for optionally disabling post-generation validation

Other opportunities for performance gains

pacakges.Load in my PR ended up having packages.LoadTypes | packages.LoadSyntax | packages.NeedName. We might be able to trim down these flags further. The Load* flags are deprecated https://github.com/golang/tools/blob/master/go/packages/packages.go#L71.

Imports pruning is taking up a solid 30+% of the time after my optimizations. This can be further improved by tracking which imports are referenced some other way during generation time rather than doing a post-processing step.

Another idea: Loading the configs for all 5 of our servers at the same time, doing a single packages.Load and then generating all 5 generated.go files with a single invocation of gqlgen. That might bring down the total generation time to no more than 10 seconds (from the 1:30 mins we started with).

Beyond this, it's just a matter of shuffling bytes around effectively, maybe improving templating efficiency, avoiding ReadAll, etc. There's no need to go there yet.

@vektah
Copy link
Collaborator

vektah commented Nov 10, 2019

I'll copy my comments over from that PR so this issue stays up to date:

Nice exploration, good to see someone dig in here.

Load packages only once at the start

Sadly, I dont think its going to be this straightforward. The reason for multiple packages.Load calls is plugins change the code on disk for a package (eg modelgen).

If we want to go to a single pass we need a way for plugins to invalidate and refetch some packages, this didnt exist in packages when the code was written but maybe it does now.

One of the ideas we threw around at the time was using one of the long running language servers, which should handle reloading packages better - eg gopls or guru. But we would need to be careful that go generate can still run cleanly inside a docker container where inotify events can be quite when using docker for mac and volume mounts.

Allow for models generation to be disabled

I'm happy to add this to the default binary behind a flag but we need to make sure it keeps working. You can already do it now if you roll your own binary, just dont add the modelgen plugin.

QualifyPackagePath shouldn't be used in go mod mode

Sadly, I think you can still use vendor with modules. Maybe we can check if thats not set and modules is turned on and make this function a no-op?

I wouldnt want to drift away from using build for when it is needed though, path resolution in vendor mode is really complicated and trying to keep an implentation in sync with the users go version is going to cause heaps of bugs.

Allow for optionally disabling post-generation validation

:shipit:

Its probably still worth leaving on by default, because it does surface issues binding to user code that isnt correct (eg a hand written interface model that doesnt satisfy the interface). But maybe these cases should be treated as bugs and we should be validating them more during generation.

Imports pruning is taking up a solid 30+% of the time after my optimizations.

In templates we can have hardcoded references to packages (eg graphql.Response), this makes the templates easier to write and read. We could add something to track it like

func (e *executableSchema) Subscription(ctx {{getImport("context")}}.Context, op *{{getImport("github.com/vektah/gqlparser/ast")}}.OperationDefinition) func() *{{getImport("github.com/99designs/gqlgen/graphql")}}.Response

instead of

func (e *executableSchema) Subscription(ctx context.Context, op *ast.OperationDefinition) func() *graphql.Response 

But I think it would be quite error prone, and makes it quite hard to read.

Loading the configs for all 5 of our servers at the same time, doing a single

I think this is probably better addressed by using gopls or guru to maintain some persistent cache rather than us trying to do it ourselves.

@vikstrous
Copy link
Contributor Author

Thanks for your thorough consideration of each part!

Load packages only once at the start / Allow for models generation to be disabled /Allow for optionally disabling post-generation validation

This was really most of the gains. If it ultimately comes down to 1 load for certain optional plugins and 1 load for the rest of the system, we would still be in great shape compared to now. Changing the plugins to work with one pass is clearly a non-starter right now, but if we consolidated all the non-plugin package.Load calls, we would get most of the wins. With the two new flags and this, we'd get huge gains.

My only fear here is that to do the single packages.Load here I had to move cfg.InjectBuiltins(schema) before the package discovery that AutoBind does. I'm not 100% sure how this would affect things. Since AutoBind is (I think still) considered experimental, maybe that's ok if the priority of name resolution changes.

QualifyPackagePath shouldn't be used in go mod mode

Unfortunately, this was was pretty important... It might even be more important than the packages.Load. I just tried changing only this and nothing else and it by itself gives me a 2x improvement. If you think this is too dangerous to auto-detect, maybe we can make it a flag?

Imports pruning is taking up a solid 30+% of the time after my optimizations.

Sure, let's not do this yet, but getImport is not a bad idea because hardcoded imports in templates might cause more problem than just this. What if there's an import name conflict? What if there's a local variable with the same name? (I guess the latter is still a problem actually)

using gopls / guru / generating multiple servers at once

I would love to know what this would mean. Were you thinking of starting an instance of the gopls server binary and making requests to it over its API? Or something else? Seems like this would be a pretty massive change one way or another, so I'm ok with not digging too deep into this yet given how many low hanging fruits we already have.

@vektah
Copy link
Collaborator

vektah commented Nov 11, 2019

QualifyPackagePath shouldn't be used in go mod mode

Unfortunately, this was was pretty important... It might even be more important than the packages.Load. I just tried changing only this and nothing else and it by itself gives me a 2x improvement. If you think this is too dangerous to auto-detect, maybe we can make it a flag?

If we can find a good way to detect both go modules and -mod=vendor lets go for it. It will require some manual testing with a few go versions and configurations:

  • pre 1.11
  • post 1.11 with modules
  • post 1.11 with vendor
  • post 1.11 with modules and --mod=vendor

Imports pruning is taking up a solid 30+% of the time after my optimizations.

With more thought, I think this might actually be impossible to remove. its not just removing imports its also doing the go formatting pass too.

What if there's an import name conflict? What if there's a local variable with the same name? (I guess the latter is still a problem actually)

the {{ reserveImport template function makes sure those names are available

@vikstrous
Copy link
Contributor Author

QualifyPackagePath

Actually, I just realized that it's very hard to know if the vendor directory should or shouldn't be used regardless of whether or not there's a go.mod file because you can run go install with or without -mod=vendor. I don't understand how this is currently solved because the go generate / gqlgen command can't know whether or not you plan to build in vendor mode after generating.

BTW, I think for detecting if we are in a go module, execing go mod edit -json might be a pretty good way to go. If it doesn't error, we are in a go module. If it errors, we are not. This doesn't help with the vendoring question though.

In the worst case scenario, I'd be okay with the lightweight "go mod" mode being opt-in through a flag, but I do understand how insanely confusing it would be to know whether or not to use the flag. Is it possible for gqlgen to somehow not have to care about whether the package it's inspecting is vendored or not?

Import fixing

You can see in the trace below that we are double parsing the file by writing it to disk and then reading it. Maybe this is just because we are using the limited interfaces provided by golang.org/x/tools. I'd like to point out that if we could manipulate the AST directly rather than using text templating, we'd be able to just write it out once with format.Node or printer.Fprint. I don't know how realistic that is though.

Screenshot 2019-11-09 at 10 15 18 PM

By the way, I tried out format.Source and it turned out to be somehow even slower than imports.Process(filename, buf.Bytes(), &imports.Options{FormatOnly: true, Comments: true, TabIndent: true, TabWidth: 8})

Anyway, these are small gains compared to the main 2 changes I'm hoping for: packages.Load and QualifyPackagePath. Let's not get distracted from those fixes.

reserveImport

Ah, makes sense. Thanks.

@vektah
Copy link
Collaborator

vektah commented Nov 11, 2019

BTW, I think for detecting if we are in a go module, execing go mod edit -json might be a pretty good way to go. If it doesn't error, we are in a go module. If it errors, we are not. This doesn't help with the vendoring question though.

We already have code for finding the go.mod file in the code package, I would prefer that to random shelling out. Its pretty easy to walk up the filesystem looking for a file.

In the worst case scenario, I'd be okay with the lightweight "go mod" mode being opt-in through a flag, but I do understand how insanely confusing it would be to know whether or not to use the flag.

Yeah maybe its better to opt into -mod=vendor, I wonder how many poeple actually uses it? Its probably also safe to assume they are used to dealing with these issues at some point, so may be more likely to search for the flag.

Is it possible for gqlgen to somehow not have to care about whether the package it's inspecting is vendored or not?

I opened an issue complaining that packages forces this on its users, but it didnt get fixed. I wish I could link to it, but it seems x/tools github repo now has turned off issues. 🎉

You can see in the trace below that we are double parsing the file by writing it to disk and then reading it.

Thats surprising, that code is actually just a copy of goimports with the automatic adding of imports removed.

@vikstrous
Copy link
Contributor Author

With #940 #941 #942 #944 #945 all merged in, model generation and verification disabled, I'm getting 7 seconds generation time, down from 1:15.

@vikstrous
Copy link
Contributor Author

Actually, sorry, I'm getting some diffs in the generated output on my repo. I'll have to really make sure each change doesn't affect the output because it's very easy to break things with PRs like this.

@vikstrous
Copy link
Contributor Author

Ok, it works now. There was a bug in #945 that's fixed now.

@vektah
Copy link
Collaborator

vektah commented Nov 28, 2019

I've merged the low risk stuff to get a release out. Will take a closer look at the remaining ones for next release.

@stale
Copy link

stale bot commented Jan 27, 2020

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the stale label Jan 27, 2020
@stale stale bot closed this as completed Feb 3, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

2 participants