Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

kvserver: check start-time above GC threshold #51139

Merged
merged 1 commit into from
Jul 12, 2020

Conversation

dt
Copy link
Member

@dt dt commented Jul 8, 2020

checkExecutionCanProceed typically checks that a request that is intending to
operate at a specific time will be doing so at or above the GC threshold.

It previously did this by comparing the ba.Timestamp to the (implied) GC
threshold, as this is the common field on all batches that indicate the
time at which they will operate. However a few special types of requests
operate on a span of time, rather than at a particular time. For
example, ExportRequest can ask to export all revisions between its
start and end times, or RevertRange can ask that all revisions between
its TargetTime and EndTime be destoryed. These commands require that
the GC threshold be not just below their end-time -- which they set
in ba.Timestamp, but also below their start-time, to ensure those
revisions they want to operate on are still there.

Previously they each checked this manually during evaluation. However
these checks were inconsistent with the common check that the kvserver
usually does on ba.Timestamp before even calling evaluation: they used
a different error type and, when the common check was extended to be
strict w.r.t. the TTL, they continued to only check the actual GC time.

This change instead updates that common check to read the ealier of the
batch timestamp or, if one of the requests in the batch is known to have
a start time, the earliest of those times. This then means that the
enforcement of 'this batch is operating at a timestamp above the GC TTL'
has the same semantics for batches thatt operate at a single time or on
a span of time.

Release note: none.

@dt dt requested review from ajwerner, nvanbenschoten, a team and pbardea and removed request for a team July 8, 2020 13:34
@cockroach-teamcity
Copy link
Member

This change is Reviewable

Copy link
Contributor

@ajwerner ajwerner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm cool with this. :lgtm:

I do wonder, what would happen if we set the ba.Header.Timestamp to the StartTime rather than the TargetTime?

Reviewed 6 of 6 files at r1.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @nvanbenschoten and @pbardea)

@dt
Copy link
Member Author

dt commented Jul 8, 2020

I think we set it to to the end time so it correctly populates the timestamp cache -- you don't want someone re-writing after an export (also means we could use it to decide to do follower reads I think).

@@ -1102,7 +1102,7 @@ func (r *Replica) checkExecutionCanProceed(
return err
} else if err := r.checkSpanInRangeRLocked(ctx, rSpan); err != nil {
return err
} else if err := r.checkTSAboveGCThresholdRLocked(ba.Timestamp, st, ba.IsAdmin()); err != nil {
} else if err := r.checkTSAboveGCThresholdRLocked(ba.EarliestTimestamp(), st, ba.IsAdmin()); err != nil {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure when checkExecutionCanProceedForRangeFeed is used but do I need to do this there too?

Copy link
Member

@nvanbenschoten nvanbenschoten left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

I think we set it to to the end time so it correctly populates the timestamp cache -- you don't want someone re-writing after an export (also means we could use it to decide to do follower reads I think).

👍

Reviewed 6 of 6 files at r1.
Reviewable status: :shipit: complete! 2 of 0 LGTMs obtained (waiting on @dt)


pkg/kv/kvserver/replica.go, line 1105 at r1 (raw file):

Previously, dt (David Taylor) wrote…

Not sure when checkExecutionCanProceedForRangeFeed is used but do I need to do this there too?

No, I don't think it's needed there.


pkg/roachpb/api.go, line 204 at r1 (raw file):

// operate, which is nominally ba.Timestamp but could be earlier if a request in
// the batch operates on a time span such as ExportRequest or RevertRangeRequest
// which both specify the start of that span in their arguments.

Mention that in those cases, ba.Timestamp indicates the upper bound that the request operates on.


pkg/roachpb/api.go, line 205 at r1 (raw file):

// the batch operates on a time span such as ExportRequest or RevertRangeRequest
// which both specify the start of that span in their arguments.
func (ba BatchRequest) EarliestTimestamp() hlc.Timestamp {

Move this method below SetActiveTimestamp.

Also, consider changing the name of this to EarliestActiveTimestamp to reflect the parallels between those two functions.


pkg/roachpb/api.go, line 208 at r1 (raw file):

	ts := ba.Timestamp

	for i := range ba.Requests {

nit: I'd turn this into a case statement and use Timestamp.Backward like:

	ts := ba.Timestamp
	for _, ru := range ba.Requests {
		switch t := ru.GetInner().(type) {
		case *ExportRequest:
			ts.Backward(t.StartTime)
		case *RevertRangeRequest:
			ts.Backward(t.TargetTime)
		}
	}
	return ts

@dt dt force-pushed the check-start-gc branch from 7564011 to b04641b Compare July 12, 2020 14:05
checkExecutionCanProceed typically checks that a request that is intending to
operate at a specific time will be doing so at or above the GC threshold.

It previously did this by comparing the ba.Timestamp to the (implied) GC
threshold, as this is the common field on all batches that indicate the
time at which they will operate. However a few special types of requests
operate on *a span* of time, rather than at a particular time. For
example, ExportRequest can ask to export all revisions between its
start and end times, or RevertRange can ask that all revisions between
its TargetTime and EndTime be destoryed. These commands require that
the GC threshold be not just below their end-time -- which they set
in ba.Timestamp, but also below their start-time, to ensure those
revisions they want to operate on are still there.

Previously they each checked this manually during evaluation. However
these checks were inconsistent with the common check that the kvserver
usually does on ba.Timestamp before even calling evaluation: they used
a different error type and, when the common check was extended to be
strict w.r.t. the TTL, they continued to only check the actual GC time.

This change instead updates that common check to read the ealier of the
batch timestamp or, if one of the requests in the batch is known to have
a start time, the earliest of those times. This then means that the
enforcement of 'this batch is operating at a timestamp above the GC TTL'
has the same semantics for batches thatt operate at a single time or on
a span of time.

Release note: none.
@dt dt force-pushed the check-start-gc branch from b04641b to 420395b Compare July 12, 2020 18:09
@dt
Copy link
Member Author

dt commented Jul 12, 2020

TFTRs!

bors r+

Copy link
Member Author

@dt dt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 2 stale) (waiting on @ajwerner and @nvanbenschoten)


pkg/roachpb/api.go, line 208 at r1 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

nit: I'd turn this into a case statement and use Timestamp.Backward like:

	ts := ba.Timestamp
	for _, ru := range ba.Requests {
		switch t := ru.GetInner().(type) {
		case *ExportRequest:
			ts.Backward(t.StartTime)
		case *RevertRangeRequest:
			ts.Backward(t.TargetTime)
		}
	}
	return ts

Ah, much cleaner, thanks!

@craig
Copy link
Contributor

craig bot commented Jul 12, 2020

Build succeeded

@craig craig bot merged commit 12b58af into cockroachdb:master Jul 12, 2020
@dt dt deleted the check-start-gc branch July 17, 2020 03:24
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants