keys,storage: add lock table key space, EngineKey, and LockTableKey #55878

Merged 1 commit on Oct 24, 2020
26 changes: 23 additions & 3 deletions pkg/keys/constants.go
@@ -52,8 +52,9 @@ var (
// key suffixes.
localSuffixLength = 4

// There are four types of local key data enumerated below: replicated
// range-ID, unreplicated range-ID, range local, and store-local keys.
// There are five types of local key data enumerated below: replicated
// range-ID, unreplicated range-ID, range local, range lock, and
// store-local keys.

// 1. Replicated Range-ID keys
//
@@ -155,7 +156,26 @@ var (
// (storage/engine/rocksdb/db.cc).
LocalTransactionSuffix = roachpb.RKey("txn-")

// 4. Store local keys
// 4. Lock table keys
//
// LocalRangeLockTablePrefix specifies the key prefix for the lock
// table. It is immediately followed by the LockTableSingleKeyInfix,
// and then the key being locked.
//
// The lock strength and txn UUID are not in the part of the key that
// the keys package deals with. They are in the versioned part of the
// key (see EngineKey.Version). This permits the storage engine to use
// bloom filters when searching for all locks for a lockable key.
//
// Different lock strengths may use different value types. The exclusive
// lock strength uses MVCCMetadata as the value type, since it does
// double duty as a reference to a provisional MVCC value.
// TODO(sumeer): remember to adjust this comment when adding locks of
// other strengths, or range locks.
LocalRangeLockTablePrefix = roachpb.Key(makeKey(localPrefix, roachpb.RKey("l")))
LockTableSingleKeyInfix = []byte("k")

// 5. Store local keys
//
// localStorePrefix is the prefix identifying per-store data.
localStorePrefix = makeKey(localPrefix, roachpb.Key("s"))
27 changes: 20 additions & 7 deletions pkg/keys/doc.go
@@ -155,8 +155,9 @@ package keys
var _ = [...]interface{}{
MinKey,

// There are four types of local key data enumerated below: replicated
// range-ID, unreplicated range-ID, range local, and store-local keys.
// There are five types of local key data enumerated below: replicated
// range-ID, unreplicated range-ID, range local, range lock, and
// store-local keys.
// Local keys are constructed using a prefix, an optional infix, and a
// suffix. The prefix and infix are used to disambiguate between the five
// types of local keys listed above, and determine inter-group ordering.
@@ -167,12 +168,14 @@ var _ = [...]interface{}{
// - RangeID unreplicated keys all share `LocalRangeIDPrefix` and
// `localRangeIDUnreplicatedInfix`.
// - Range local keys all share `LocalRangePrefix`.
// - Range lock keys (which are also local keys) all share
// `LocalRangeLockTablePrefix`.
// - Store keys all share `localStorePrefix`.
//
// `LocalRangeIDPrefix`, `localRangePrefix` and `localStorePrefix` all in
// turn share `localPrefix`. `localPrefix` was chosen arbitrarily. Local
// keys would work just as well with a different prefix, like 0xff, or even
// with a suffix.
// `LocalRangeIDPrefix`, `LocalRangePrefix`, `LocalRangeLockTablePrefix`,
// and `localStorePrefix` all in turn share `localPrefix`. `localPrefix` was
// chosen arbitrarily. Local keys would work just as well with a different
// prefix, like 0xff, or even with a suffix.

// 1. Replicated range-ID local keys: These store metadata pertaining to a
// range as a whole. Though they are replicated, they are unaddressable.
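
The prefix scheme described in the doc comment can be sketched as a small classifier. This is only an illustration: the diff confirms that `"l"` and `"s"` are appended to `localPrefix`, while the `0x01` value of `localPrefix` and the `"i"` and `"k"` bytes for the range-ID and range local prefixes are assumptions about the rest of `constants.go`. The function name `classify` is hypothetical.

```go
package main

import (
	"bytes"
	"fmt"
)

// Assumed byte values; only the "l" and "s" suffixes appear in the diff.
var (
	localPrefix     = []byte("\x01")
	rangeIDPrefix   = []byte("\x01i")
	rangePrefix     = []byte("\x01k")
	lockTablePrefix = []byte("\x01l")
	storePrefix     = []byte("\x01s")
)

// classify reports which group of local keys a key belongs to, based only on
// its prefix. Replicated and unreplicated range-ID keys share a prefix and
// are told apart by an infix after the range ID, which this sketch ignores.
func classify(key []byte) string {
	switch {
	case bytes.HasPrefix(key, rangeIDPrefix):
		return "range-ID local"
	case bytes.HasPrefix(key, rangePrefix):
		return "range local"
	case bytes.HasPrefix(key, lockTablePrefix):
		return "range lock"
	case bytes.HasPrefix(key, storePrefix):
		return "store local"
	case bytes.HasPrefix(key, localPrefix):
		return "unknown local"
	default:
		return "global"
	}
}

func main() {
	for _, k := range []string{"\x01lkfoo", "\x01sident", "\x01krdsc", "user-key"} {
		fmt.Printf("%q: %s\n", k, classify([]byte(k)))
	}
}
```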
@@ -206,7 +209,17 @@ var _ = [...]interface{}{
RangeDescriptorKey, // "rdsc"
TransactionKey, // "txn-"

// 4. Store local keys: These contain metadata about an individual store.
// 4. Range lock keys for all replicated locks. All range locks share
// LocalRangeLockTablePrefix. Locks can be acquired on global keys and on
// range local keys. Currently, locks are only on single keys, i.e., not
// on a range of keys. Only exclusive locks are currently supported, and
// these additionally function as pointers to the provisional MVCC values.
// Single key locks use a byte, LockTableSingleKeyInfix, that follows
// the LocalRangeLockTablePrefix. This is to keep the single-key locks
// separate from (future) range locks.
LockTableSingleKey,

// 5. Store local keys: These contain metadata about an individual store.
// They are unreplicated and unaddressable. The typical example is the
// store 'ident' record. They all share `localStorePrefix`.
StoreSuggestedCompactionKey, // "comp"
43 changes: 43 additions & 0 deletions pkg/keys/keys.go
@@ -418,6 +418,49 @@ func QueueLastProcessedKey(key roachpb.RKey, queue string) roachpb.Key {
return MakeRangeKey(key, LocalQueueLastProcessedSuffix, roachpb.RKey(queue))
}

// LockTableSingleKey creates a key under which all single-key locks for the
// given key can be found. Note that there can be multiple locks for the given
// key, but those are distinguished using the "version" which is not in scope
// of the keys package.
// For a scan [start, end) the corresponding lock table scan is
// [LTSK(start), LTSK(end)), where LTSK is short for LockTableSingleKey.
func LockTableSingleKey(key roachpb.Key) roachpb.Key {
// Don't unwrap any local prefix on key using Addr(key). This allows for
// doubly-local lock table keys. For example, local range descriptor keys can
// be locked during split and merge transactions.
// The +3 accounts for the bytesMarker and terminator.
buf := make(roachpb.Key, 0,
len(LocalRangeLockTablePrefix)+len(LockTableSingleKeyInfix)+len(key)+3)
buf = append(buf, LocalRangeLockTablePrefix...)
buf = append(buf, LockTableSingleKeyInfix...)
buf = encoding.EncodeBytesAscending(buf, key)
return buf
}

// DecodeLockTableSingleKey decodes the single-key lock table key to return the key
// that was locked.
func DecodeLockTableSingleKey(key roachpb.Key) (lockedKey roachpb.Key, err error) {
if !bytes.HasPrefix(key, LocalRangeLockTablePrefix) {
return nil, errors.Errorf("key %q does not have %q prefix",
key, LocalRangeLockTablePrefix)
}
// Cut the prefix.
b := key[len(LocalRangeLockTablePrefix):]
if !bytes.HasPrefix(b, LockTableSingleKeyInfix) {
return nil, errors.Errorf("key %q is not for a single-key lock", key)
}
b = b[len(LockTableSingleKeyInfix):]
b, lockedKey, err = encoding.DecodeBytesAscending(b, nil)
if err != nil {
return nil, err
}
if len(b) != 0 {
return nil, errors.Errorf("key %q has left-over bytes %d after decoding",
key, len(b))
}
return lockedKey, err
}

// IsLocal performs a cheap check that returns true iff a range-local key is
// passed, that is, a key for which `Addr` would return a non-identical RKey
// (or a decoding error).
23 changes: 23 additions & 0 deletions pkg/keys/keys_test.go
@@ -724,3 +724,26 @@ func TestTenantPrefix(t *testing.T) {
})
}
}

func TestLockTableKeyEncodeDecode(t *testing.T) {
expectedPrefix := append([]byte(nil), LocalRangeLockTablePrefix...)
expectedPrefix = append(expectedPrefix, LockTableSingleKeyInfix...)
testCases := []struct {
key roachpb.Key
}{
{key: roachpb.Key("foo")},
{key: roachpb.Key("a")},
{key: roachpb.Key("")},
// Causes a doubly-local range local key.
{key: RangeDescriptorKey(roachpb.RKey("baz"))},
}
for _, test := range testCases {
t.Run("", func(t *testing.T) {
ltKey := LockTableSingleKey(test.key)
require.True(t, bytes.HasPrefix(ltKey, expectedPrefix))
k, err := DecodeLockTableSingleKey(ltKey)
require.NoError(t, err)
require.Equal(t, test.key, k)
})
}
}
38 changes: 38 additions & 0 deletions pkg/storage/engine.go
@@ -128,6 +128,44 @@ type Iterator interface {
SupportsPrev() bool
}

// EngineIterator is an iterator over key-value pairs where the key is
// an EngineKey.
//lint:ignore U1001 unused
type EngineIterator interface {
// Close frees up resources held by the iterator.
Close()
// SeekGE advances the iterator to the first key in the engine which
// is >= the provided key.
SeekGE(key EngineKey) (valid bool, err error)
// SeekLT positions the iterator at the last key in the engine which
// is < the provided key.
SeekLT(key EngineKey) (valid bool, err error)
// Next advances the iterator to the next key/value in the
// iteration. After this call, valid will be true if the
// iterator was not originally positioned at the last key.
Next() (valid bool, err error)
// Prev moves the iterator backward to the previous key/value
// in the iteration. After this call, valid will be true if the
// iterator was not originally positioned at the first key.
Prev() (valid bool, err error)
// UnsafeKey returns the same value as Key, but the memory is invalidated on
// the next call to {Next,Prev,SeekGE,SeekLT,Close}.
// REQUIRES: latest positioning function returned valid=true.
UnsafeKey() EngineKey
// UnsafeValue returns the same value as Value, but the memory is
// invalidated on the next call to {Next,Prev,SeekGE,SeekLT,Close}.
// REQUIRES: latest positioning function returned valid=true.
UnsafeValue() []byte
// Key returns the current key.
// REQUIRES: latest positioning function returned valid=true.
Key() EngineKey
// Value returns the current value as a byte slice.
// REQUIRES: latest positioning function returned valid=true.
Value() []byte
// SetUpperBound installs a new upper bound for this iterator.
SetUpperBound(roachpb.Key)
}

// MVCCIterator is an interface that extends Iterator and provides concrete
// implementations for MVCCGet and MVCCScan operations. It is used by instances
// of the interface backed by RocksDB iterators to avoid cgo hops.