Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

x/tools/gopls: incremental gopls scaling #57987

Closed
20 tasks done
findleyr opened this issue Jan 25, 2023 · 33 comments
Closed
20 tasks done

x/tools/gopls: incremental gopls scaling #57987

findleyr opened this issue Jan 25, 2023 · 33 comments
Assignees
Labels
FrozenDueToAge gopls/performance Issues related to gopls performance (CPU, memory, etc). gopls Issues related to the Go language server, gopls. Tools This label describes issues relating to any tools in the x/tools repository.
Milestone

Comments

@findleyr
Copy link
Contributor

findleyr commented Jan 25, 2023

Incremental gopls

This is an umbrella issue to track the work we've been doing to change the way gopls scales (and significantly reduce memory usage and CPU on large codebases). We've been working off an internal design, but I wanted to share some of that here and have a public issue to track the work.

The main goal of this project is to use on-disk export data and indexing to allow gopls to persist data across sessions, reloading on-demand, thereby reducing its overall memory footprint and startup time.

As a result of this change, gopls memory usage (at least related to package information) will be O(open packages), rather than O(workspace). Furthermore, we will break global relationships between elements of gopls' cache, simplifying gopls' execution model and eliminating multiple categories of bugs.

Background

Gopls, as it very recently existed, was a monolithic build system. Certain algorithms relied on a global package graph containing the full set of workspace packages. For example, to find references to a given identifier, gopls simply walked all packages in the reverse transitive cone of the declaring package(s) and checked types.Info.Uses for references to the declared object. In order for this algorithm to be correct and sufficiently fast, gopls must hold all type-checked packages in memory (including many redundant intermediate test variants!).

We can't solve gopls' scaling problems until we rewrite these algorithms and fix all the places where gopls makes assumptions of global identity. This is what we've been working on.

High level plan

  1. Design a shallow export data format that does not bundle its transitive closure.
  2. Add a mechanism for on-disk caching.
  3. Implement package analysis using export data.
  4. Rewrite workspace-wide queries to use indexes that are independent of types.Package or types.Object identity.
  5. Separate the concept of a "syntax package" (something containing AST and types.Info) from an "export package" (a types.Package with type information for exported symbols). Syntax packages are used to fully understand the syntax a user is working on. Export packages are used for type checking. Currently, gopls has a concept of "exported parse mode", which produces a syntax package on a truncated AST. This exists to reduce memory, but means that syntax packages may be partial, a source of many historical and current bugs. All current uses of partial packages in the package graph can (and must) be eliminated or replaced with a judicious use of parsing or type-checking on-demand.
  6. Create a control plane to manage package information that must be preserved in memory vs re-computed on demand or transiently cached.
  7. When importing during type-checking, use export packages for packages outside the workspace. For now, continue to produce syntax packages for all packages inside the workspace.
  8. Persist and load export packages from disk.
  9. Load xrefs and methodset indexes from disk, rather than hanging them off of syntax packages.
  10. Drop all syntax packages, except those with open files.
  11. Implement precise pruning
  12. Investigate holding on to packages imported by open packages, to reduce re-type-checking latency.
  13. Revisit diagnostic storage and retrieval: diagnostics are re-accessed in multiple places, assuming they will be free to retrieve: fix TestBadlyVersionedModule
  14. Improve the UX around indexing (better progress notifications, partial results, etc).
@findleyr findleyr added the gopls/performance Issues related to gopls performance (CPU, memory, etc). label Jan 25, 2023
@findleyr findleyr added this to the gopls/v0.12.0 milestone Jan 25, 2023
@gopherbot gopherbot added Tools This label describes issues relating to any tools in the x/tools repository. gopls Issues related to the Go language server, gopls. labels Jan 25, 2023
@gopherbot
Copy link
Contributor

Change https://go.dev/cl/463556 mentions this issue: gopls/internal/lsp/source: eliminate a couple uses of posToMappedRange

@gopherbot
Copy link
Contributor

Change https://go.dev/cl/463376 mentions this issue: gopls/internal/lsp/source: eliminate ResolveImportPath

@gopherbot
Copy link
Contributor

Change https://go.dev/cl/463375 mentions this issue: gopls/internal/lsp/source: reduce usage of TypecheckWorkspace

gopherbot pushed a commit to golang/tools that referenced this issue Jan 26, 2023
Fix calls to TypeCheck using TypecheckWorkspace where we really want
TypecheckFull. Also, use NarrowestPackage where it suffices.

In approximate order of appearance:
- Code actions, semantic tokens, code lens, and document highlights are
  all scoped to a file; the narrowest package for that file should
  suffice.
- When completing at a position, we need the full package to find
  enclosing context. Furthermore, that file is open, and so will be
  fully type-checked by other operations.
- Ditto for suggested fixes, inlay hints, and signature help.

The current behavior leads to incorrect or missing functionality when
outside the workspace. I did not add comprehensive tests demonstrating
this in all cases, but added one for signature help.

For golang/go#57987

Change-Id: I8270d0f0a0787e36bd4103378176d150426d37f2
Reviewed-on: https://go-review.googlesource.com/c/tools/+/463375
Reviewed-by: Alan Donovan <[email protected]>
TryBot-Result: Gopher Robot <[email protected]>
gopls-CI: kokoro <[email protected]>
Run-TryBot: Robert Findley <[email protected]>
gopherbot pushed a commit to golang/tools that referenced this issue Jan 26, 2023
Following up on CL 461944, eliminate uses of ResolveImportPath.

At two of the three callsites, we avoid type-checking. The one that
remains is in renaming.

For golang/go#57987

Change-Id: Ia974d39f2db72a1fe1373cff5faeb07ecb54effb
Reviewed-on: https://go-review.googlesource.com/c/tools/+/463376
Run-TryBot: Robert Findley <[email protected]>
Reviewed-by: Alan Donovan <[email protected]>
TryBot-Result: Gopher Robot <[email protected]>
gopls-CI: kokoro <[email protected]>
gopherbot pushed a commit to golang/tools that referenced this issue Jan 26, 2023
Eliminate a couple uses of posToMappedRange, which potentially
type-checks, where it is clearly unnecessary.

Also improve test output for highlight.

Updates golang/go#57987
Updates golang/go#54845

Change-Id: I5580bf6431def0a6ee635e394932934ec7fe1afb
Reviewed-on: https://go-review.googlesource.com/c/tools/+/463556
TryBot-Result: Gopher Robot <[email protected]>
gopls-CI: kokoro <[email protected]>
Reviewed-by: Alan Donovan <[email protected]>
Run-TryBot: Robert Findley <[email protected]>
@gopherbot
Copy link
Contributor

Change https://go.dev/cl/463955 mentions this issue: gopls/internal/lsp: separate requests from source.Identifier

gopherbot pushed a commit to golang/tools that referenced this issue Jan 30, 2023
Start to unwind source.Identifier by unrolling definition, type
definition, and call hierarchy handlers.

Along the way, introduce a couple primitive helper functions, which may
be made obsolete in the future but allowed preserving source.Identifier
behavior:
- referencedObject returns the object referenced by the cursor position,
  as defined by source.Identifier.
- mapPosition is a helper to map token.Pos to MappedRange in the narrow
  context of a package fileset.

After this change, the only remaining use of source.Identifier is for
Hover, but that is a sizeable refactoring and therefore left to a
subsequent CL.

Updates golang/go#57987

Change-Id: Iba4b0a574e6a6d3d54253f3b4bff8fe6e13a1b15
Reviewed-on: https://go-review.googlesource.com/c/tools/+/463955
gopls-CI: kokoro <[email protected]>
Run-TryBot: Robert Findley <[email protected]>
Reviewed-by: Alan Donovan <[email protected]>
TryBot-Result: Gopher Robot <[email protected]>
@gopherbot
Copy link
Contributor

Change https://go.dev/cl/464455 mentions this issue: gopls/internal/lsp/source: simplify extracting object hover doc

@gopherbot
Copy link
Contributor

Change https://go.dev/cl/464301 mentions this issue: gopls/internal/lsp/source: use syntax alone in FormatVarType

gopherbot pushed a commit to golang/tools that referenced this issue Feb 2, 2023
FormatVarType re-formats a variable type using syntax, in order to get
accurate presentation of aliases and ellipses. However, as a result it
required typed syntax trees for the type declaration, which may exist in
a distinct package from the current package.

In the near future we may not have typed syntax trees for these
packages. We could type-check on demand, but (1) that could be costly
and (2) it may break qualification using the go/types qualifier.

Instead, perform this operation using a qualifier based on syntax
and metadata, so that we only need a fully parsed file rather than a
fully type-checked package. The resulting expressions may be inaccurate
due to built-ins, "." imported packages, or missing metadata, but that
seems acceptable for the current use-cases of this function, which are
in completion and signature help.

While doing this, add a FormatNodeWithFile helper that allows formatting
a node from a *token.File, rather than *token.FileSet. This can help us
avoid relying on a global fileset. To facilitate this, move the GetLines
helper from internal/gcimporter into a shared tokeninternal package.

For golang/go#57987

Change-Id: I3b8a5256bc2261be8b5175ee360b9336228928ac
Reviewed-on: https://go-review.googlesource.com/c/tools/+/464301
Run-TryBot: Robert Findley <[email protected]>
gopls-CI: kokoro <[email protected]>
TryBot-Result: Gopher Robot <[email protected]>
Reviewed-by: Alan Donovan <[email protected]>
gopherbot pushed a commit to golang/tools that referenced this issue Feb 2, 2023
Add a new helper HoverDocForObject, which finds the relevant object
documentation without needing to type-check. This eliminates the last
use of FindPackageFromPos.

For golang/go#57987

Change-Id: Ic9deec78d68156e9ead3831a8247f8c30259a3c6
Reviewed-on: https://go-review.googlesource.com/c/tools/+/464455
Run-TryBot: Robert Findley <[email protected]>
Reviewed-by: Alan Donovan <[email protected]>
gopls-CI: kokoro <[email protected]>
TryBot-Result: Gopher Robot <[email protected]>
@gopherbot
Copy link
Contributor

Change https://go.dev/cl/464415 mentions this issue: gopls/internal/lsp/source: rewrite hover

gopherbot pushed a commit to golang/tools that referenced this issue Feb 10, 2023
Significantly rewrite the logic involved in textDocument/hover requests,
eliminate dependence on type checking, and remove the source.Identifier
abstraction.

The existing logichad become very hard to follow, and while I spent
many hours trying to make sure I understood it, some logic seemed
inaccurate and other remained opaque. Notably, it appears much of the
old code was effectively dead, due to impossible combinations of
Node/Object in various execution paths.

Rewrite the essential bits of logic from scratch, preserving only that
which was necessary for hover (the last user of source.Identifier).
After doing this, the intermediate HoverContext and IdentifierInfo data
types became unnecessary.

Along the way, a pair of decisions impacted gopls observable behavior:
- Hovering over a rune was made consistent with other hovering (by way
  of HoverJSON), which had the side effect of introducing newlines to
  hover content.
- Support for hovering over variables with constant initializers was
  removed (golang/go#58220). This feature will be revisited in a
  follow-up CL.

This eliminates both posToMappedRange and findFileInDeps helpers, which
were two of the remaining places where we could incidentally type-check.
After this change, the only uses of TypecheckWorkspace are in renaming.

For golang/go#57987

Change-Id: I3e0d673b914f6277c3713f74450134ceeb181319
Reviewed-on: https://go-review.googlesource.com/c/tools/+/464415
Run-TryBot: Robert Findley <[email protected]>
gopls-CI: kokoro <[email protected]>
TryBot-Result: Gopher Robot <[email protected]>
Reviewed-by: Alan Donovan <[email protected]>
@gopherbot
Copy link
Contributor

Change https://go.dev/cl/467798 mentions this issue: gopls/internal/lsp/source: don't rely on global FileSet when stubbing

@gopherbot
Copy link
Contributor

Change https://go.dev/cl/468215 mentions this issue: gopls/internal/lsp/source: eliminate Snapshot.FileSet

gopherbot pushed a commit to golang/tools that referenced this issue Feb 14, 2023
This change is a reimplementation of the renaming operation so
that it no longer mixes types.Objects from different packages.
The basic approach is as follows.

First, the referenced object is found and classified. If it is
nonexported (e.g. lowercase, or inherently local such as an
import or label, or within a function body), the operation is
local to that package and is carried out essentially as before.

However, if it is exported, then the scope of the global search
is deduced (direct for package-level var/func/const/type,
transitive for fields and methods). The object is converted to an
objectpath, and all the reverse dependencies are analyzed, using
the objectpath to identify the target in each package.

The renameObject function (the entry point for the fiddly renamer
algorithm) is now called once per package, and the results of all
calls are combined.

Because we process variants, duplicate edits are likely. We sort
and de-dup at the very end under the assumption that edits are
well behaved. The "seen edit" tracking in package renaming is no
longer needed.

Also:
- Package.importMap maps PackagePath to Package for all dependencies,
  so that we can resolve targets identified by (PackagePath,
  objectpath) to their objects in a different types.Importer "realm".
  It is materialized as a DAG of k/v pairs and exposed as
  Package.DependencyTypes.
- The rename_check algorithm (renamer) is now applied once to each
  package instead of once to all packages.
- The set of references to update is derived from types.Info, not the
  references operation.

Still to do in follow-ups:
- Method renaming needs to expand the set of renamed types (based on
  'satisfy') and recompute the dependencies of their declarations,
  until a fixed point is reached. (Not supporting this case is a
  functional regression since v0.11.0, but we plan to submit this
  change to unblock foundational progress and then fix it before the
  next release. See golang/go#58461)
- Lots of generics cases to consider (see golang/go#58462).
- Lots more tests required. For golang/go#57987.

Change-Id: I5fd8538ab35d61744d765d8bd101cd4efa41bd33
Reviewed-on: https://go-review.googlesource.com/c/tools/+/464902
TryBot-Result: Gopher Robot <[email protected]>
Reviewed-by: Robert Findley <[email protected]>
@gopherbot
Copy link
Contributor

Change https://go.dev/cl/464902 mentions this issue: gopls/internal/lsp/source/rename: use incremental algorithm

gopherbot pushed a commit to golang/tools that referenced this issue Feb 14, 2023
Update the method stubbing logic not to rely on a global FileSet.

For golang/go#57987

Change-Id: Ia30067dd69ba88d0433482aaee61e35a79bf6a46
Reviewed-on: https://go-review.googlesource.com/c/tools/+/467798
Run-TryBot: Robert Findley <[email protected]>
TryBot-Result: Gopher Robot <[email protected]>
Reviewed-by: Alan Donovan <[email protected]>
gopls-CI: kokoro <[email protected]>
gopherbot pushed a commit to golang/tools that referenced this issue Feb 14, 2023
Eliminate remaining uses of the global fileset API.

For golang/go#57987

Change-Id: Ida5a7c556bf48d07a90966aa4f5623e64cc48378
Reviewed-on: https://go-review.googlesource.com/c/tools/+/468215
TryBot-Result: Gopher Robot <[email protected]>
gopls-CI: kokoro <[email protected]>
Run-TryBot: Robert Findley <[email protected]>
Reviewed-by: Alan Donovan <[email protected]>
@gopherbot
Copy link
Contributor

Change https://go.dev/cl/468776 mentions this issue: gopls/internal/lsp/cache: pre-compute load diagnostics

gopherbot pushed a commit to golang/tools that referenced this issue Feb 16, 2023
Process go/packages errors immediately after loading, to eliminate a
dependency on syntax packages.

This simplifies the cache.Package type, which is now just a simple join
of metadata and type-checked syntax.

For golang/go#57987

Change-Id: I3d120fb4cb5687df2a376f8d8617e23e16ddaa92
Reviewed-on: https://go-review.googlesource.com/c/tools/+/468776
gopls-CI: kokoro <[email protected]>
Reviewed-by: Alan Donovan <[email protected]>
Run-TryBot: Robert Findley <[email protected]>
TryBot-Result: Gopher Robot <[email protected]>
@gopherbot
Copy link
Contributor

Change https://go.dev/cl/477977 mentions this issue: gopls/internal/lsp/cache: use json encoding for diagnostics

@gopherbot
Copy link
Contributor

Change https://go.dev/cl/477979 mentions this issue: gopls/internal/lsp: reorder the way we compute diagnostics

gopherbot pushed a commit to golang/tools that referenced this issue Mar 21, 2023
As we move toward incremental evaluation of the package key and precise
pruning, rewrite the type checking pass to formalize the concept of a
pre- and post- package visitor. Whereas previously we would do an
initial pass outside of type-checking to load data (xrefs, diagnostics,
etc), we now integrate loads checks with the batch operation, calling
a pre- func immediately before type-checking, and a post- func
immediately afterward.

This change was originally intended only to move us toward precise
pruning, but inadvertently simplified and clarified many aspects of
type-checking. Namely, it:
- resulted in eliminating all state from the type check batch;
  typeCheckBatch.mu is deleted
- allows avoiding double-encoding of diagnostics and method sets, to fit
  the previous APIs. This results in a significant improvement in the
  methodsets benchmarks
- significantly simplifies the logic of the type-checking batch
- significantly reduces the high water mark during the IWL, as the GC
  can clean up the packages that are no longer pinned to the batch

For golang/go#57987

Change-Id: Iad1073f32cea0a7e26567d94cdd3ed01ff60ad04
Reviewed-on: https://go-review.googlesource.com/c/tools/+/476956
gopls-CI: kokoro <[email protected]>
Run-TryBot: Robert Findley <[email protected]>
TryBot-Result: Gopher Robot <[email protected]>
Reviewed-by: Alan Donovan <[email protected]>
gopherbot pushed a commit to golang/tools that referenced this issue Mar 21, 2023
snapshot.ReadFile holds the snapshot lock, so calling it concurrently
was pointless. Instead, add a new preloadFiles helper that delegates to
to the underlying FileSource.

For golang/go#57987

Change-Id: If3259ca45260b8a24542d40f7558b5d6bf5018d1
Reviewed-on: https://go-review.googlesource.com/c/tools/+/477975
gopls-CI: kokoro <[email protected]>
Run-TryBot: Robert Findley <[email protected]>
Reviewed-by: Alan Donovan <[email protected]>
TryBot-Result: Gopher Robot <[email protected]>
gopherbot pushed a commit to golang/tools that referenced this issue Mar 21, 2023
It actually takes ~50ms to hash the gopls executable, which is in the
critical path of all gopls operations. Start the file cache eagerly, so
that we may initialize while we wait for the Go command.

For golang/go#57987

Change-Id: I61f9ba60c6aeab12163a0a9254c17e72335d9dba
Reviewed-on: https://go-review.googlesource.com/c/tools/+/477976
Reviewed-by: Alan Donovan <[email protected]>
TryBot-Result: Gopher Robot <[email protected]>
gopls-CI: kokoro <[email protected]>
Run-TryBot: Robert Findley <[email protected]>
gopherbot pushed a commit to golang/tools that referenced this issue Mar 21, 2023
Benchmarking demonstrated that encoding and decoding diagnostics when
diagnosting the workspace consumes a disproportionate amount of time,
especially when the diagnostic sets are almost empty.

The encoding/gob package is not intended for thousands of small encoding
and decoding operations at low latency, as described here:
golang/go#29766 (comment)

Using JSON improved observed encoding performance significantly.

Also, add a test... and fix the bugs uncovered by the test.

For golang/go#57987

Change-Id: I52af8a680ed20cf07e6c61ea496bcf9fd761e6da
Reviewed-on: https://go-review.googlesource.com/c/tools/+/477977
TryBot-Result: Gopher Robot <[email protected]>
gopls-CI: kokoro <[email protected]>
Run-TryBot: Robert Findley <[email protected]>
Reviewed-by: Alan Donovan <[email protected]>
gopherbot pushed a commit to golang/tools that referenced this issue Mar 23, 2023
Now that diagnostics are not completely free, we must take more care
about when and how (and how often) we access them. Also, we should not
wait for `go mod tidy` before starting to diagnose the initial snapshot.

Fix this by refactoring the diagnostic pass, and skipping expensive mod
diagnostics in code actions. Specifically:
- Simplify diagnosePkgs somewhat, by lifting the logic to determine which
  packages to diagnose / analyze.
- Run go mod tidy concurrently in the diagnostics pass.
- Remove the expensive logic to request all package diagnostics from
  mod.Diagnostics. Instead, let diagnosePkgs report diagnostics for
  files outside the package.
- Don't request workspace package diagnostics when computing code
  actions for mod files. As noted in the comment, I think this logic is
  backwards. For now, this is too expensive.
- Turn guards against the builtin file into bugs: I don't know why they
  existed.
- Invert some logic to simplify the gc_details pass.

This change revealed races inherent in our use of memoize.Promise inside
of forEachPackage: because promises are evaluated on a detached
goroutine, they do not guarantee that their func completes before all
waiters return. Fix this by replacing memoize.Promise with a simple
future cache. This allowed closing over the snapshot, simplifying the
logic significantly.

For golang/go#57987

Change-Id: I2c5fd58db8a1cc0285f32d8a1301a630b9a30b86
Reviewed-on: https://go-review.googlesource.com/c/tools/+/477979
Reviewed-by: Alan Donovan <[email protected]>
Run-TryBot: Robert Findley <[email protected]>
TryBot-Result: Gopher Robot <[email protected]>
gopls-CI: kokoro <[email protected]>
@gopherbot
Copy link
Contributor

Change https://go.dev/cl/479015 mentions this issue: gopls/internal/lsp/cache: hold on to imports used by open packages

gopherbot pushed a commit to golang/tools that referenced this issue Mar 24, 2023
Benchmarking demonstrated clearly that gopls must memoize active
packages, to efficiently handle repeated requests related to open files
(e.g. code lens, completion, semantic tokens).

In doing so, gopls is effectively pinning the import graph of these open
packages. However, when those packages changed, this import graph was not
being reused. Furthermore when multiple open packages shared packages in
their import graph, there was a chance that gopls may pin multiple copies
of those packages, if the open packages were type-checked in separate
batches.

This change introduces a new optimization which manages a shared import
graph to be re-used across snapshots. Before performing any
type-checking we re-evaluate this shared import graph. This is purely an
optimization, and is not necessary for correctness. As such, the
feature is guarded behind a compile-time constant, so that it may easily
be disabled for debugging. The plan is to have several such constants.

For golang/go#57987

Change-Id: Ica654ffc8f1e5f39bcab7000c0839ece22e20ab2
Reviewed-on: https://go-review.googlesource.com/c/tools/+/479015
TryBot-Result: Gopher Robot <[email protected]>
Reviewed-by: Alan Donovan <[email protected]>
gopls-CI: kokoro <[email protected]>
Run-TryBot: Robert Findley <[email protected]>
@gopherbot
Copy link
Contributor

Change https://go.dev/cl/479296 mentions this issue: gopls/internal/lsp/source/typerefs: new package to analyze syntax deps

gopherbot pushed a commit to golang/tools that referenced this issue Mar 30, 2023
Add a new typerefs package that analyzes type-affecting references
between declarations. This can be used to compute the set of packages
that may influence eachothers types, for the purpose of precise pruning.

This CL include both an implementation of the references analysis, and a
standalone implementation of the whole-graph reachability analysis. The
latter is intended to be temporary, as for various reasons it is more
convenient for the memoization and computation of graph data to live in
the gopls/internal/lsp/cache package.

For golang/go#57987

Change-Id: Ia1355ff4dfe476319e6f4f8f06b2a79dd7ba554f
Reviewed-on: https://go-review.googlesource.com/c/tools/+/479296
Run-TryBot: Robert Findley <[email protected]>
TryBot-Result: Gopher Robot <[email protected]>
Reviewed-by: Alan Donovan <[email protected]>
gopls-CI: kokoro <[email protected]>
@ShoshinNikita
Copy link

Using gopls@master I noticed that diagnostics are updated with a significant delay in comparison with [email protected]. And this delay is the same for both small and large projects.

I managed to find a CL that introduced this behavior. Is is https://go.dev/cl/477979 (gopls/internal/lsp: reorder the way we compute diagnostics):

gopls@83c1ba19d (before the CL) gopls@d52bc5689 (after the CL)
old.mp4
new.mp4

I would like to ask, is it an intended behavior? For me this delay looks like a performance issue.

@findleyr
Copy link
Contributor Author

findleyr commented Apr 6, 2023

@ShoshinNikita this is unintended.

We set the diagnostics delay to a higher number, and I think that made it more apparent that certain type-checking errors are not getting published immediately for the recently changed file.

The intended behavior is that you get diagnostics immediately for the file you are editing.

I can reproduce.

@gopherbot
Copy link
Contributor

Change https://go.dev/cl/477455 mentions this issue: gopls/internal/lsp/cache: improve precision of cache key

gopherbot pushed a commit to golang/tools that referenced this issue Apr 13, 2023
Use the new typerefs algorithm to improve the precision of our cache
key, by excluding from the package key any dependencies that are known
not to affect the results of type-checking.

This makes building the package key more expensive. In particular, we
must be more careful to avoid duplicate building of package keys due to
races. To address this concern, the evaluation of package handles is
turned a concurrent batch operation.

Also fix a few bugs/oversights in the typerefs package encountered while
using gopls built with this change.

For golang/go#57987

Change-Id: Iec6d183b91dcd1223db0f6b64ca5231d42975c3e
Reviewed-on: https://go-review.googlesource.com/c/tools/+/477455
TryBot-Result: Gopher Robot <[email protected]>
Run-TryBot: Robert Findley <[email protected]>
Reviewed-by: Alan Donovan <[email protected]>
gopls-CI: kokoro <[email protected]>
@gopherbot
Copy link
Contributor

Change https://go.dev/cl/494100 mentions this issue: gopls/internal/lsp/filecache: front with a 100MB in-memory LRU cache

@gopherbot
Copy link
Contributor

Change https://go.dev/cl/494099 mentions this issue: gopls/internal/lsp/lru: extract LRU logic to a standalone package

gopherbot pushed a commit to golang/tools that referenced this issue May 16, 2023
Extract a simple LRU package from the parseCache implementation (which
itself was extracted from an initial implementation of the filecache).

In a subsequent CL, this package will be used to reduce I/O in the
filecache package.

For golang/go#57987

Change-Id: I307f397b71654226d4e0e1c532a81cfde49af831
Reviewed-on: https://go-review.googlesource.com/c/tools/+/494099
Reviewed-by: Alan Donovan <[email protected]>
Run-TryBot: Robert Findley <[email protected]>
gopls-CI: kokoro <[email protected]>
TryBot-Result: Gopher Robot <[email protected]>
gopherbot pushed a commit to golang/tools that referenced this issue May 16, 2023
Put a 100MB in-memory LRU cache in front of the filecache, to reduce I/O
for repeated cache access, such as we observe with workspace diagnostics
or implements queries.

Updates golang/go#60089
For golang/go#57987

Change-Id: I01a1fcca7dcf26416d4cdb578a7a2674765c9f08
Reviewed-on: https://go-review.googlesource.com/c/tools/+/494100
Run-TryBot: Robert Findley <[email protected]>
Reviewed-by: Alan Donovan <[email protected]>
TryBot-Result: Gopher Robot <[email protected]>
gopls-CI: kokoro <[email protected]>
@gopherbot
Copy link
Contributor

Change https://go.dev/cl/497398 mentions this issue: gopls/internal/lsp: bundle certain quick-fixes with their diagnostic

CarlosRosuero pushed a commit to CarlosRosuero/typeparams that referenced this issue Jun 13, 2023
In this CL type-checked packages are made entirely independent of each
other, and package export data and indexes are stored in a file cache.
As a result, gopls uses significantly less memory, and (with a warm
cache) starts significantly faster. Other benchmarks have regressed
slightly due to the additional I/O and export data loading, but not
significantly so, and we have some ideas for how to further narrow or
even close the performance gap.

In the benchmarks below, based on the x/tools repository, we can see
that in-use memory was reduced by 88%, and startup time with a warm
cache by 65% (this is the best case where nothing has changed). Other
benchmarks regressed by 10-50%, much of which can be addressed by
improvements to the objectpath package (golang/go#51017), and by making
package data serialization asynchronous to type-checking.

Notably, we observe larger regressions in implementations, references,
and rename because the index implementations (by Alan Donovan) preceded
this change to type-checking, and so these benchmark statistics compare
in-memory index performance to on-disk index performance. Again, we can
optimize these if necessary by keeping certain index information in
memory, or by decoding more selectively.

name                              old in_use_bytes  new in_use_bytes  delta
InitialWorkspaceLoad/tools-12            432M ± 2%          50M ± 2%    -88.54%  (p=0.000 n=10+10)

name                              old time/op       new time/op       delta
StructCompletion/tools-12              27.2ms ± 5%       31.8ms ± 9%    +16.99%  (p=0.000 n=9+9)
ImportCompletion/tools-12              2.07ms ± 8%       2.21ms ± 6%     +6.64%  (p=0.004 n=9+9)
SliceCompletion/tools-12               29.0ms ± 5%       32.7ms ± 5%    +12.78%  (p=0.000 n=10+9)
FuncDeepCompletion/tools-12            39.6ms ± 6%       39.3ms ± 3%       ~     (p=0.853 n=10+10)
CompletionFollowingEdit/tools-12       72.7ms ± 7%      108.1ms ± 7%    +48.59%  (p=0.000 n=9+9)
Definition/tools-12                     525µs ± 6%        601µs ± 2%    +14.33%  (p=0.000 n=9+10)
DidChange/tools-12                     6.17ms ± 7%       6.77ms ± 2%     +9.64%  (p=0.000 n=10+10)
Hover/tools-12                         2.11ms ± 5%       2.61ms ± 3%    +23.87%  (p=0.000 n=10+10)
Implementations/tools-12               4.04ms ± 3%      60.19ms ± 3%  +1389.77%  (p=0.000 n=9+10)
InitialWorkspaceLoad/tools-12           3.84s ± 4%        1.33s ± 2%    -65.47%  (p=0.000 n=10+9)
References/tools-12                    9.72ms ± 6%      24.28ms ± 6%   +149.83%  (p=0.000 n=10+10)
Rename/tools-12                         121ms ± 8%        168ms ±12%    +38.92%  (p=0.000 n=10+10)
WorkspaceSymbols/tools-12              14.4ms ± 6%       15.6ms ± 3%     +8.76%  (p=0.000 n=9+10)

This CL is one step closer to the end* of a long journey to reduce
memory usage and statefulness in gopls, so that it can be more
performant and reliable.

Specifically, this CL implements a new type-checking pass that loads and
stores export data, cross references, serialized diagnostics, and method
set indexes in the file system. Concurrent type-checking passes may
share in-progress work, but after type-checking only active packages are
kept in memory. Consequently, there can be no global relationship
between type-checked packages. The work to break any dependence on
global relationships was done over a long time leading up to this CL.

In order to approach the previous type-checking performance, the
following new optimizations are made:
 - the global FileSet is completely removed: repeatedly importing from
   export data resulted in a tremendous amount of unnecessary token.File
   information, and so FileSets had to be scoped to packages
 - files are parsed as a batch and stored in the LRU cache implemented
   in the preceding CL
 - type-checking is also turned into a batch process, so that
   overlapping nodes in the package graph may be shared during large
   type-checking operations such as the initial workspace load

This new execution model enables several simplifications:
 - We no longer need to trim the AST before type-checking:
   TypeCheckMode and ParseExported are gone.
 - We no longer need to do careful bookkeeping around parsed files: all
   parsing uses the LRU parse cache.
 - It is no longer necessary to estimate cache heap usage in debug
   information.

There is still much more to do. This new model for gopls's execution
requires significant testing and experimentation. There may be new bugs
in the complicated new algorithms that enable this change, or bugs
related to the new reliance on export data (this may be the first time
export data for packages with type errors is significantly exercised).
There may be new environments where the new execution model does not
have the same beneficial effect. (On the other hand, there may be
some where it has an even more beneficial effect, such as resource
limited environments like dev containers.) There are also a lot of new
opportunities for optimization now that we are no longer tied to a rigid
structure of in-memory data.

*Furthermore, the following planned work is simply not done yet:
 - Implement precise pruning based on "deep" hash of imports.
 - Rewrite unimported completion, now that we no longer have cached
   import paths.

For golang/go#57987

Change-Id: Iedfc16656f79e314be448b892b710b9e63f72551
Reviewed-on: https://go-review.googlesource.com/c/tools/+/466975
Run-TryBot: Robert Findley <[email protected]>
TryBot-Result: Gopher Robot <[email protected]>
gopls-CI: kokoro <[email protected]>
Reviewed-by: Alan Donovan <[email protected]>
@golang golang locked and limited conversation to collaborators May 24, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
FrozenDueToAge gopls/performance Issues related to gopls performance (CPU, memory, etc). gopls Issues related to the Go language server, gopls. Tools This label describes issues relating to any tools in the x/tools repository.
Projects
None yet
Development

No branches or pull requests

4 participants