
[dbnode] Fix concurrency granularity of seekerManager.UpdateOpenLease #2790

Merged
linasm merged 6 commits into master on Oct 27, 2020

Conversation

@linasm (Collaborator) commented Oct 22, 2020

What this PR does / why we need it:
seekerManager.UpdateOpenLease(...) was preventing concurrent calls at global granularity (see the removed comment). This PR changes it to the smallest granularity possible (namespace + shard + block start) to allow concurrent operations.
This change prevents possible collisions between background and manually invoked runs of shard.AggregateTiles(), as well as shardColdFlush.Done().
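
As a rough illustration of the approach, here is a minimal, self-contained Go sketch (not the actual m3db code: the descriptor fields, error value, and helper names are assumptions) of tracking in-flight UpdateOpenLease calls per (namespace, shard, block start) key instead of with one global flag:

package main

import (
	"errors"
	"fmt"
	"sync"
	"time"
)

// leaseDescriptor is a hypothetical stand-in for block.LeaseDescriptor,
// identifying a lease at (namespace, shard, block start) granularity.
type leaseDescriptor struct {
	namespace  string
	shard      uint32
	blockStart time.Time
}

var errConcurrentUpdateOpenLease = errors.New(
	"another UpdateOpenLease is in flight for this descriptor")

type seekerManager struct {
	sync.Mutex
	// One entry per in-flight UpdateOpenLease call; this replaces a single
	// global isUpdatingLease flag, so updates for different descriptors
	// no longer reject each other.
	updateOpenLeasesInProgress map[leaseDescriptor]struct{}
}

// startUpdateOpenLease marks the descriptor as in progress, failing if an
// update for the same (namespace, shard, block start) is already running.
func (m *seekerManager) startUpdateOpenLease(d leaseDescriptor) error {
	m.Lock()
	defer m.Unlock()
	if _, ok := m.updateOpenLeasesInProgress[d]; ok {
		return errConcurrentUpdateOpenLease
	}
	m.updateOpenLeasesInProgress[d] = struct{}{}
	return nil
}

// endUpdateOpenLease clears the in-progress marker once the update is done.
func (m *seekerManager) endUpdateOpenLease(d leaseDescriptor) {
	m.Lock()
	defer m.Unlock()
	delete(m.updateOpenLeasesInProgress, d)
}

func main() {
	m := &seekerManager{
		updateOpenLeasesInProgress: make(map[leaseDescriptor]struct{}),
	}
	d := leaseDescriptor{
		namespace:  "metrics",
		shard:      7,
		blockStart: time.Now().Truncate(2 * time.Hour),
	}
	if err := m.startUpdateOpenLease(d); err != nil {
		fmt.Println("rejected:", err)
		return
	}
	defer m.endUpdateOpenLease(d)
	fmt.Println("updating lease for shard", d.shard)
}

Because the map key carries the full (namespace, shard, block start) identity, only two updates targeting exactly the same block collide; all other combinations proceed concurrently.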

Special notes for your reviewer:
Noticed this while investigating a flaky integration test.

Does this PR introduce a user-facing and/or backwards incompatible change?:
NONE

Does this PR require updating code package or user-facing documentation?:
NONE

@notbdu (Contributor) left a comment:

LGTM

isUpdatingLease            bool
status                     seekerManagerStatus
shardSet                   sharding.ShardSet
// In-flight UpdateOpenLease calls, keyed by (namespace, shard, block start).
updateOpenLeasesInProgress map[block.HashableLeaseDescriptor]struct{}
Contributor left a comment:

+1 on consistent naming w/ what's in block/lease.go

func (m *seekerManager) startUpdateOpenLease(
	namespace ident.ID,
	hashableDescriptor block.HashableLeaseDescriptor,
) (bool, error) {
	m.Lock()
	defer m.Unlock()
Contributor left a comment:

nit: would we want to release this lock and perhaps use a sync.Map like we do in block/lease.go for tracking open leases? Although tbh perf wise this prob wouldn't make a diff at all.

@linasm (Collaborator, Author) replied:

Using sync.Map (as I did in block/lease.go) was the initial approach that I tried, but then I realized that the lock is being held for the whole duration of startUpdateOpenLease, and there is this namespace check in between progress check / update, so I decided to go with a simple map under this lock (to better preserve the original semantics).
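
To make that trade-off concrete, here is a small, hypothetical sketch (illustrative names, not the actual m3db code) of the sync.Map variant: LoadOrStore makes the membership check and the insert atomic on their own, but the namespace check that sits between them in startUpdateOpenLease would fall outside that atomicity:

package main

import (
	"fmt"
	"sync"
)

// descriptor stands in for block.HashableLeaseDescriptor.
type descriptor struct{ key string }

// trySyncMapStart shows the sync.Map route: LoadOrStore atomically checks
// for and publishes the in-progress entry...
func trySyncMapStart(inProgress *sync.Map, d descriptor, namespaceOK func() bool) bool {
	if _, loaded := inProgress.LoadOrStore(d, struct{}{}); loaded {
		return false // already in flight for this descriptor
	}
	// ...but the namespace check now runs after the entry is already
	// published, so a failed check has to undo the insert, and any state
	// it reads can change concurrently. A plain map guarded by the
	// seekerManager lock keeps the progress check, the namespace check,
	// and the insert in one critical section, preserving the original
	// semantics.
	if !namespaceOK() {
		inProgress.Delete(d)
		return false
	}
	return true
}

func main() {
	var inProgress sync.Map
	d := descriptor{key: "ns|shard|blockStart"}
	fmt.Println("started:", trySyncMapStart(&inProgress, d, func() bool { return true }))
}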

@linasm linasm merged commit 0e2f31f into master Oct 27, 2020
@linasm linasm deleted the linasm/fix-seekerManager-UpdateOpenLease-granularity branch October 27, 2020 14:13
soundvibe added a commit that referenced this pull request Oct 29, 2020
* master:
  Support dynamic namespace resolution for embedded coordinators (#2815)
  [dbnode] Add index regexp DFA compilation cache to avoid allocating DFAs for same expressions (#2814)
  [dbnode] Set default cache on retrieve to false, prepare testing with cache none (#2813)
  [tools] Output annotations as base64 (#2743)
  Add Reset Transformation (#2794)
  [large-tiles] Fix for a reverse index when querying downsampled namespace (#2808)
  [dbnode] Evict info files cache before index bootstrap in peers bootstrapper (#2802)
  [dbnode] Bump default filesystem persist rate limit (#2806)
  [coordinator] Add metrics and configurability of downsample and ingest writer worker pool (#2797)
  [dbnode] Fix concurrency granularity of seekerManager.UpdateOpenLease (#2790)
  [dbnode] Use correct import path for atomic library (#2801)
  Regenerate carbon automapping rules on namespaces changes (#2793)
  Enforce matching retention and index block size (#2783)
  [query] Add ClusterNamespacesWatcher and use to generate downsample automapper rules (#2782)
  [query][dbnode] Wire up etcd-backed Clusters implementation in coordinator (#2758)
  [query] Fix quantile() argument not being passed through (#2780)

# Conflicts:
#	src/query/parser/promql/matchers.go
soundvibe added a commit that referenced this pull request Oct 29, 2020
* master:
  [query] Improve precision for variance and stddev of equal values (#2799)
  Support dynamic namespace resolution for embedded coordinators (#2815)
  [dbnode] Add index regexp DFA compilation cache to avoid allocating DFAs for same expressions (#2814)
  [dbnode] Set default cache on retrieve to false, prepare testing with cache none (#2813)
  [tools] Output annotations as base64 (#2743)
  Add Reset Transformation (#2794)
  [large-tiles] Fix for a reverse index when querying downsampled namespace (#2808)
  [dbnode] Evict info files cache before index bootstrap in peers bootstrapper (#2802)
  [dbnode] Bump default filesystem persist rate limit (#2806)
  [coordinator] Add metrics and configurability of downsample and ingest writer worker pool (#2797)
  [dbnode] Fix concurrency granularity of seekerManager.UpdateOpenLease (#2790)
  [dbnode] Use correct import path for atomic library (#2801)
  Regenerate carbon automapping rules on namespaces changes (#2793)
  Enforce matching retention and index block size (#2783)
  [query] Add ClusterNamespacesWatcher and use to generate downsample automapper rules (#2782)
  [query][dbnode] Wire up etcd-backed Clusters implementation in coordinator (#2758)
  [query] Fix quantile() argument not being passed through (#2780)
  [query] Add "median" aggregation to Graphite aggregate() function (#2774)
  [r2] Ensure KeepOriginal is propagated when adding/reviving rules (#2796)
  [dbnode] Update default db read limits  (#2784)