Skip to content
This repository has been archived by the owner on Sep 30, 2024. It is now read-only.

Commit

Permalink
resolve major search perf regression in v3.2.0 (add redis cache to gi…
Browse files Browse the repository at this point in the history
…t commit OID resolution) (#3685)

* refactor OID resolution logic (review with whitespace off)

* add oid resolution cache

* add observability

* CHANGELOG: update

* custom cache implementation

* enterprise/dev/Procfile: fix bug where zoekt is not listening

* remove redundant return stmt

* resolveCommitIODUncached

* faster test

* Update CHANGELOG.md
  • Loading branch information
slimsag committed May 1, 2019
1 parent 191db01 commit 6917d59
Show file tree
Hide file tree
Showing 5 changed files with 358 additions and 27 deletions.
80 changes: 80 additions & 0 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
Expand Up @@ -11,6 +11,86 @@ All notable changes to Sourcegraph are documented in this file.

### Changed

### Removed

### Fixed

- Fixed a bug where submitting a saved query without selecting the location would fail for non-site admins (#3628).

## 3.3.6

## Changed

- All 24 language extensions are enabled by default.
- Fixed a major indexed search performance regression that occurred in v3.2.0. (#3685)

## 3.3.5

## Changed

- Indexed search is now enabled by default for new Docker deployments. (#3540)

### Removed

- Removed smart-casing behavior from search.

### Fixed

- Removes corrupted archives in the searcher cache and tries to populate the cache again instead of returning an error.
- Fixed a bug where search scopes would not get merged, and only the lowest-level list of search scopes would appear.
- Fixed an issue where repo-updater was slower in performing its work which could sometimes cause other performance issues. https://github.com/sourcegraph/sourcegraph/pull/3633

## 3.3.4

### Fixed

- Fixed bundling of the Phabricator integration assets in the Sourcegraph docker image.

## 3.3.3

### Fixed

- Fixed bug that prevented "Find references" action from being completed in the activation checklist.

## 3.3.2

### Fixed

- Fixed an issue where the default `bitbucketserver.repositoryQuery` would not be created on migration from older Sourcegraph versions. https://github.com/sourcegraph/sourcegraph/issues/3591
- Fixed an issue where Sourcegraph would add deleted repositories to the external service configuration. https://github.com/sourcegraph/sourcegraph/issues/3588
- Fixed an issue where a repo-updater migration would hit code host rate limits. https://github.com/sourcegraph/sourcegraph/issues/3582
- The required `bitbucketserver.username` field of a [Bitbucket Server external service configuration](https://docs.sourcegraph.com/admin/external_service/bitbucketserver#configuration), if unset or empty, is automatically migrated to match the user part of the `url` (if defined). https://github.com/sourcegraph/sourcegraph/issues/3592
- Fixed a panic that would occur in indexed search / the frontend when a search error ocurred. https://github.com/sourcegraph/sourcegraph/issues/3579
- Fixed an issue where the repo-updater service could become deadlocked while performing a migration. https://github.com/sourcegraph/sourcegraph/issues/3590

## 3.3.1

### Fixed

- Fixed a bug that prevented external service configurations specifying client certificates from working (#3523)

## 3.3.0

### Added

- In search queries, treat `foo(` as `foo\(` and `bar[` as `bar\[` rather than failing with an error message.
- Enterprise admins can now customize the appearance of the homepage and search icon.
- A new settings property `notices` allows showing custom informational messages on the homepage and at the top of each page. The `motd` property is deprecated and its value is automatically migrated to the new `notices` property.
- The new `gitlab.exclude` setting in [GitLab external service config](https://docs.sourcegraph.com/admin/external_service/gitlab#configuration) allows you to exclude specific repositories matched by `gitlab.projectQuery` and `gitlab.projects` (so that they won't be synced). Upon upgrading, previously "disabled" repositories will be automatically migrated to this exclusion list.
- The new `gitlab.projects` setting in [GitLab external service config](https://docs.sourcegraph.com/admin/external_service/gitlab#configuration) allows you to select specific repositories to be synced.
- The new `bitbucketserver.exclude` setting in [Bitbucket Server external service config](https://docs.sourcegraph.com/admin/external_service/bitbucketserver#configuration) allows you to exclude specific repositories matched by `bitbucketserver.repositoryQuery` and `bitbucketserver.repos` (so that they won't be synced). Upon upgrading, previously "disabled" repositories will be automatically migrated to this exclusion list.
- The new `bitbucketserver.repos` setting in [Bitbucket Server external service config](https://docs.sourcegraph.com/admin/external_service/bitbucketserver#configuration) allows you to select specific repositories to be synced.
- The new required `bitbucketserver.repositoryQuery` setting in [Bitbucket Server external service configuration](https://docs.sourcegraph.com/admin/external_service/bitbucketserver#configuration) allows you to use Bitbucket API repository search queries to select repos to be synced. Existing configurations will be migrate to have it set to `["?visibility=public", "?visibility=private"]` which is equivalent to the previous implicit behaviour that this setting supersedes.
- "Quick configure" buttons for common actions have been added to the config editor for all external services.
- "Quick configure" buttons for common actions have been added to the management console.
- Site-admins now receive an alert every day for the seven days before their license key expires.
- The user menu (in global nav) now lists the user's organizations.
- All users on an instance now see a non-dismissable alert when when there's no license key in use and the limit of free user accounts is exceeded.
- All users will see a dismissible warning about limited search performance and accuracy on when using the sourcegraph/server Docker image with more than 100 repositories enabled.

### Changed

- Indexed searches that time out more consistently report a timeout instead of erroneously saying "No results."
- The symbols sidebar now only shows symbols defined in the current file or directory.
- The dynamic filters on search results pages will now display `lang:` instead of `file:` filters for language/file-extension filter suggestions.
- The default `github.repositoryQuery` of a [GitHub external service configuration](https://docs.sourcegraph.com/admin/external_service/github#configuration) has been changed to `["none"]`. Existing configurations that had this field unset will be migrated to have the previous default explicitly set (`["affiliated", "public"]`).
Expand Down
84 changes: 58 additions & 26 deletions cmd/frontend/graphqlbackend/git_commit.go
Original file line number Diff line number Diff line change
Expand Up @@ -5,6 +5,7 @@ import (
"fmt"
"strings"
"sync"
"time"

"github.com/sourcegraph/sourcegraph/cmd/frontend/backend"
"github.com/sourcegraph/sourcegraph/cmd/frontend/graphqlbackend/externallink"
Expand Down Expand Up @@ -35,9 +36,9 @@ type gitCommitResolver struct {
// to avoid redirecting a user browsing a revision "mybranch" to the absolute commit ID as they follow links in the UI.
inputRev *string

oid gitObjectID
once sync.Once
onceErr error
oid gitObjectID
oidErr error
once sync.Once

author signatureResolver
committer *signatureResolver
Expand Down Expand Up @@ -84,40 +85,71 @@ func (r *gitCommitResolver) ID() graphql.ID {
func (r *gitCommitResolver) Repository() *repositoryResolver { return r.repo }

func (r *gitCommitResolver) OID() (gitObjectID, error) {
return r.getCommitOID()
}

func (r *gitCommitResolver) getCommitOID() (gitObjectID, error) {
r.once.Do(func() {
// If we already have the commit, no need to try to compute it.
if r.oid != "" {
return // We already have an oid because the creator of this *gitCommitResolver specified it.
}

// Try fetching it from the Redis cache to avoid doing lots of work
// previously done (as this method is called very often, e.g. multiple
// times per search result).
start := time.Now()
result, ok := oidResolutionCache.Get(string(r.repo.repo.ID))
oidResolutionCacheLookupDuration.Observe(time.Since(start).Seconds())
if ok {
oidResolutionCounter.WithLabelValues("hit").Inc()
r.oid = gitObjectID(result)
return
}

// Commit OID is the empty string denoting the default branch. Find out
// what is the latest commit indexed by zoekt.
// The cache doesn't have it, so compute it and update the cache if we
// resolved it successfully.
start = time.Now()
r.resolveCommitIODUncached()
oidResolutionDuration.Observe(time.Since(start).Seconds())
if r.oidErr == nil {
oidResolutionCounter.WithLabelValues("miss").Inc()
oidResolutionCache.Set(string(r.repo.repo.ID), string(r.oid))
} else {
oidResolutionCounter.WithLabelValues("miss_error").Inc()
}
})
return r.oid, r.oidErr
}

indexInfo := r.repo.TextSearchIndex()
func (r *gitCommitResolver) resolveCommitIODUncached() {
if r.oid != "" || r.oidErr != nil {
// Possible scenarios for this case:
//
// - We already have an r.oid because the creator of this *gitCommitResolver specified it.
// - We already have an r.oid because we were called before.
// - We don't have an r.oid but have an r.oidErr from being called before.
//
// In any case, there is no point in doing the work again, so we return
// now.
return
}

ctx := context.Background()
// Commit OID is the empty string denoting the default branch. Find out
// what is the latest commit indexed by zoekt.

var refs []*repositoryTextSearchIndexedRef
refs, r.onceErr = indexInfo.Refs(ctx)
if r.onceErr != nil {
return
}
indexInfo := r.repo.TextSearchIndex()

for _, ref := range refs {
current, _ := ref.Current(ctx)
if current {
r.oid = ref.indexedCommit
ctx := context.Background()

break
}
}
})
var refs []*repositoryTextSearchIndexedRef
refs, r.oidErr = indexInfo.Refs(ctx)
if r.oidErr != nil {
return
}

return r.oid, r.onceErr
for _, ref := range refs {
current, _ := ref.Current(ctx)
if current {
r.oid = ref.indexedCommit
break
}
}
}

func (r *gitCommitResolver) AbbreviatedOID() (string, error) {
Expand Down
123 changes: 123 additions & 0 deletions cmd/frontend/graphqlbackend/git_commit_resolution_cache.go
Original file line number Diff line number Diff line change
@@ -0,0 +1,123 @@
package graphqlbackend

import (
"sync"
"time"

"github.com/prometheus/client_golang/prometheus"
)

type resolutionCacheEntry struct {
t time.Time
value string
}

type resolutionCache struct {
// ttl indicates how long before cache entries expire. There is no limit on
// the size of the cache except the effective # of repositories on the
// Sourcegraph instance.
ttl time.Duration

// cacheEntries, if non-nil, is used to record the number of entries in the cache.
cacheEntries prometheus.Histogram

// workerInterval, if non-zero, specifies the interval at which the worker
// checks for entries to evict. Defaults ttl / 2.
workerInterval time.Duration

// mockSleep, if non-nil can be used to mock time.Sleep for testing purposes.
mockSleep func(d time.Duration)

m sync.Map
}

func (r *resolutionCache) Set(k, v string) {
r.m.Store(k, resolutionCacheEntry{
t: time.Now(),
value: v,
})
}

func (r *resolutionCache) Get(k string) (string, bool) {
v, ok := r.m.Load(k)
if !ok {
return "", false
}
e := v.(resolutionCacheEntry)
if time.Since(e.t) >= r.ttl {
// entry has expired
r.m.Delete(k)
return "", false
}
return e.value, true
}

func (r *resolutionCache) startWorker() *resolutionCache {
if r.workerInterval == 0 {
r.workerInterval = r.ttl / 2
}
sleep := time.Sleep
if r.mockSleep != nil {
sleep = r.mockSleep
}
go func() {
for {
sleep(r.workerInterval)
size := 0
r.m.Range(func(key, value interface{}) bool {
size++
e := value.(resolutionCacheEntry)
if time.Since(e.t) >= r.ttl {
// entry has expired
r.m.Delete(key)
}
return true
})
r.cacheEntries.Observe(float64(size))
}
}()
return r
}

var (
// oidResolutionCache is used to cache Git commit OID resolution. This is
// used because OID resolution happens extremely often (e.g. multiple times
// per search result).
oidResolutionCache = (&resolutionCache{
ttl: 60 * time.Second,
cacheEntries: prometheus.NewHistogram(prometheus.HistogramOpts{
Namespace: "src",
Subsystem: "graphql",
Name: "git_commit_oid_resolution_cache_entries",
Help: "Total number of entries in the in-memory Git commit OID resolution cache.",
}),
}).startWorker()

oidResolutionCounter = prometheus.NewCounterVec(prometheus.CounterOpts{
Namespace: "src",
Subsystem: "graphql",
Name: "git_commit_oid_resolution_cache_hit",
Help: "Counts cache hits and misses for Git commit OID resolution.",
}, []string{"type"})

oidResolutionDuration = prometheus.NewHistogram(prometheus.HistogramOpts{
Namespace: "src",
Subsystem: "graphql",
Name: "git_commit_oid_resolution_duration_seconds",
Help: "Total time spent performing uncached Git commit OID resolution.",
})

oidResolutionCacheLookupDuration = prometheus.NewHistogram(prometheus.HistogramOpts{
Namespace: "src",
Subsystem: "graphql",
Name: "git_commit_oid_resolution_cache_lookup_duration_seconds",
Help: "Total time spent performing cache lookups for Git commit OID resolution.",
})
)

func init() {
prometheus.MustRegister(oidResolutionCache.cacheEntries)
prometheus.MustRegister(oidResolutionCounter)
prometheus.MustRegister(oidResolutionDuration)
prometheus.MustRegister(oidResolutionCacheLookupDuration)
}
Loading

0 comments on commit 6917d59

Please sign in to comment.