jobs,sql: create jobs in a batch in a transaction #67991
Conversation
Force-pushed fc74a05 to 32be530
Force-pushed 32be530 to 0771caf
Force-pushed 0771caf to e6c6356
Although some tests are failing, I would appreciate some feedback. I might be pushing too hard in the failing tests. If not, they may lead to further changes or to adding bounds on the size of a batch.
Preliminary pass on the subway back from the dentist
pkg/jobs/jobs.go (Outdated)
@@ -76,6 +76,50 @@ type Record struct {
	CreatedBy *CreatedByInfo
}

// Specification specifies a job, which will be created in a batch with other jobs.
type Specification struct {
	jobID jobspb.JobID
I'm starting to think maybe we should just stick ID as a field in Record. It'd simplify the code and some function signatures. I don't see any good reason not to. The history here is that the jobs package used to allocate the id for the user. That turned out to be unfortunate so we changed the client to allocate the id. May as well just move the ID into the record.
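To make the suggestion concrete, here is a minimal, hypothetical sketch of a Record carrying its own client-allocated ID — the type and field names below are illustrative stand-ins, not the actual cockroachdb definitions:

```go
package main

import "fmt"

// JobID stands in for jobspb.JobID; the real type lives in the jobs package.
type JobID int64

// Record sketches the reviewer's suggestion: the client-allocated job ID
// lives directly on the record, so batch APIs can accept []*Record alone
// instead of parallel ID/record arguments.
type Record struct {
	JobID       JobID
	Description string
}

// idsOf shows the kind of simplified signature such a change enables.
func idsOf(records []*Record) []JobID {
	ids := make([]JobID, 0, len(records))
	for _, rec := range records {
		ids = append(ids, rec.JobID)
	}
	return ids
}

func main() {
	recs := []*Record{
		{JobID: 1, Description: "backup"},
		{JobID: 2, Description: "import"},
	}
	fmt.Println(idsOf(recs)) // [1 2]
}
```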
pkg/jobs/registry.go (Outdated)
// in a single insert to the jobs table.
// There must be at least one job to create, otherwise the function returns an error.
// If an error occurs while preparing the insert statement, no job is created.
// TODO(sajjad): To discuss: What batch size do we expect? Should we limit the
Yeah we'll want a maximum batch size. Start with just some constant. Maybe 100? We'll want to make sure we test using multiple batches.
pkg/jobs/registry.go (Outdated)
// column's value will be large due to large statement size. Are there any
// implications that can result in exceeding limits such as column size limits?
func (r *Registry) CreateJobsWithTxn(
	ctx context.Context, jobSpecs []*Specification, txn *kv.Txn,
nit: put the transaction before the jobs
pkg/sql/planner.go (Outdated)
@@ -93,7 +93,7 @@ type extendedEvalContext struct {
	Jobs *jobsCollection

	// SchemaChangeJobCache refers to schemaChangeJobsCache in extraTxnState.
-	SchemaChangeJobCache map[descpb.ID]*jobs.Job
+	SchemaChangeJobCache map[descpb.ID]*jobs.Specification
My guess is that we're going to want to nuance this data structure a bit. There are places where we delete from this cache. Those places are trying to be a bit clever and used to rely on the fact that we had already written the job. I'm thinking what we want is to create a slice of job records and a map from descriptor ID to the record. Then in the places where today we delete from the map we'd retain the record in the slice.
I suspect this is the cause of some of the failing tests.
pkg/sql/schema.go (Outdated)
}
log.Infof(ctx, "queued new schema change job %d for schema %d", newJob.ID(), desc.ID)
newSpec := jobs.NewSpecification(p.extendedEvalCtx.ExecCfg.JobRegistry.MakeJobID(), jobRecord)
// TODO(sajjad): To discuss: Previous code didn't cache the new job. I think
Can you look at what QueueJob does?
Thanks for the quick review.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner)
pkg/jobs/jobs.go, line 81 at r2 (raw file):
Previously, ajwerner wrote…
I'm starting to think maybe we should just stick ID as a field in Record. It'd simplify the code and some function signatures. I don't see any good reason not to. The history here is that the jobs package used to allocate the id for the user. That turned out to be unfortunate so we changed the client to allocate the id. May as well just move the ID into the record.
I have not built a sufficient context yet. I am following your judgement here.
pkg/jobs/registry.go, line 375 at r2 (raw file):
Previously, ajwerner wrote…
Yeah we'll want a maximum batch size. Start with just some constant. Maybe 100? We'll want to make sure we test using multiple batches.
I will set a max batch size. We may want to make it a cluster setting, but not for now.
I already have a test case that checks multiple batches. The test fails for larger batches.
pkg/jobs/registry.go, line 380 at r2 (raw file):
Previously, ajwerner wrote…
nit: put the transaction before the jobs
Done.
pkg/sql/planner.go, line 96 at r2 (raw file):
Previously, ajwerner wrote…
My guess is that we're going to want to nuance this data structure a bit. There are places where we delete from this cache. Those places are trying to be a bit clever and used to rely on the fact that we had already written the job. I'm thinking what we want is to create a slice of job records and a map from descriptor ID to the record. Then in the places where today we delete from the map we'd retain the record in the slice.
I suspect this is the cause of some of the failing tests.
OK. I guess that clever function is markTableMutationJobsSuccessful. We will end up deleting the slice only in resetExtraTxnState.
It might be the cause of the error! However, the test logs show messages related to throttling and ranges going down.
pkg/sql/schema.go, line 87 at r2 (raw file):
Previously, ajwerner wrote…
Can you look at what QueueJob does?
Yeah, it creates a new job and adds the job to the Jobs collection. So this is a place where we want the job to be only in the Jobs collection but not in the cache. I will have to preserve this.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @sajjadrizvi)
pkg/jobs/jobs.go, line 81 at r2 (raw file):
Previously, sajjadrizvi (Sajjad Rizvi) wrote…
I have not built a sufficient context yet. I am following your judgement here.
Yeah, my advice is that we add ID as a field to Record. It should simplify the interfaces.
pkg/jobs/registry.go, line 375 at r2 (raw file):
Previously, sajjadrizvi (Sajjad Rizvi) wrote…
I will set a max batch size. We may want to make it a cluster setting, but not for now.
I already have a test case that checks multiple batches. The test fails for larger batches.
It should not fail. Instead this function should have a loop to send multiple batches.
pkg/sql/conn_executor_exec.go, line 807 at r2 (raw file):
Quoted 10 lines of code…
// 1. I am not sure if I am returning the error correctly.
// 2. Is this the right place to create jobs? I have placed it outside of
//    the following code block because the jobs were being created outside of
//    it in the previous code.
// 3. If I have to add any ex.phaseTimes, which one should it be? Do I have to
//    create a new one?
if err := ex.createJobs(ctx); err != nil {
	return ex.makeErrEvent(err, ast)
}
This logic should be pulled into commitSQLTransactionInternal
pkg/sql/conn_executor_exec.go, line 824 at r2 (raw file):
}
var specList []*jobs.Specification
// TODO(sajjad): To discuss: Does the order of jobs creation matter?
No, it does not matter.
pkg/sql/planner.go, line 96 at r2 (raw file):
the test logs show messages related to throttling and ranges going down.
I have a feeling that that's just noise at the end of the test and is not the true failure.
pkg/sql/schema.go, line 87 at r2 (raw file):
Previously, sajjadrizvi (Sajjad Rizvi) wrote…
Yeah, it creates a new job and adds the job to Jobs collection. So this is a place where want the job to be only in the Jobs collection but not in the cache. I will have to preserve this.
My guess is that not putting the descriptor into the cache was an oversight.
pkg/sql/schema.go, line 90 at r2 (raw file):
// the newSpec should be cached.
p.extendedEvalCtx.SchemaChangeJobCache[desc.ID] = newSpec
log.Infof(ctx, "cached specification of new schema change job %d for schema %d", newSpec.ID(), desc.ID)
This is not a good log message. It would be extremely confusing for a reader. I think the old message was pretty reasonable. Your change of the implementation here doesn't really affect the behavior at all.
Force-pushed e6c6356 to 441032c
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner and @sajjadrizvi)
pkg/jobs/registry.go, line 375 at r2 (raw file):
Previously, ajwerner wrote…
It should not fail. Instead this function should have a loop to send multiple batches.
I meant the batch size set in the test function. Nevertheless, I have implemented it as suggested.
pkg/sql/conn_executor_exec.go, line 807 at r2 (raw file):
Previously, ajwerner wrote…
(code quoted above)
This logic should be pulled into commitSQLTransactionInternal.
Done.
pkg/sql/conn_executor_exec.go, line 824 at r2 (raw file):
Previously, ajwerner wrote…
No, it does not matter.
Done.
pkg/sql/schema.go, line 90 at r2 (raw file):
Previously, ajwerner wrote…
This is not a good log message. It would be extremely confusing for a read. I think the old message was pretty reasonable. Your change of the implementation here doesn't affect the behavior really at all.
Done.
Sorry I forgot to publish the comments before pushing the changes.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner and @sajjadrizvi)
Force-pushed 441032c to 704360c
This PR is now ready for another review.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner and @sajjadrizvi)
Force-pushed 704360c to 622cbc6
Nice, this is getting close.
Reviewed 5 of 14 files at r8.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @sajjadrizvi)
pkg/bench/rttanalysis/testdata/benchmark_expectations, line 80 at r8 (raw file):
Quoted 6 lines of code…
26,Truncate/truncate_1_column_0_rows
26,Truncate/truncate_1_column_1_row
26,Truncate/truncate_1_column_2_rows
26,Truncate/truncate_2_column_0_rows
26,Truncate/truncate_2_column_1_rows
26,Truncate/truncate_2_column_2_rows
I have a feeling we're going to want to retain the range of round-trips here (put the 26-27 back).
pkg/jobs/registry.go, line 375 at r8 (raw file):
no further jobs are created and the error is returned.
I find this language confusing. The caller shouldn't need to care about other jobs having been created. It makes it sound like it is not a transactional operation.
pkg/jobs/registry.go, line 378 at r8 (raw file):
func (r *Registry) CreateJobsWithTxn(
	ctx context.Context, txn *kv.Txn, records []*Record,
) ([]*Job, error) {

Do callers want the *Job or would they be happy with just a slice of IDs? I feel like IDs might be better.
pkg/jobs/registry.go, line 382 at r8 (raw file):
return nil, errors.Errorf("no jobs to create.")
}
can you refactor this a bit into a helper to write a batch and a helper to generate the statement and args. I think it should read something like:
created := make([]*Job, 0, len(records))
for toCreate := records; len(toCreate) > 0; {
	const maxBatchSize = 100
	batchSize := len(toCreate)
	if batchSize > maxBatchSize {
		batchSize = maxBatchSize
	}
	createdInBatch, err := r.createJobsWithTxnInBatch(ctx, txn, toCreate[:batchSize])
	if err != nil {
		return nil, err
	}
	created = append(created, createdInBatch...)
	toCreate = toCreate[batchSize:]
}
func (r *Registry) createJobsWithTxnInBatch(
	ctx context.Context, txn *kv.Txn, records []*Record,
) ([]*Job, error) {
	s, err := r.sqlInstance.Session(ctx)
	if err != nil {
		return nil, errors.Wrap(err, "error getting live session")
	}
	stmt, args := batchJobInsertStmt(s, records)
	if _, err := r.ex.Exec(ctx, "job-rows-batch-insert", txn, stmt, args...); err != nil {
		return nil, err
	}
	// Return the jobs etc. Maybe construct them when you construct the statement and args.
}
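The chunking loop in the sketch above can be exercised standalone; a minimal illustration (the helper name and the use of plain ints are assumptions for the example, not project code):

```go
package main

import "fmt"

// splitIntoBatches mirrors the loop structure suggested above: walk the
// slice, carving off at most maxBatchSize items per iteration.
func splitIntoBatches(ids []int, maxBatchSize int) [][]int {
	var batches [][]int
	for toCreate := ids; len(toCreate) > 0; {
		batchSize := len(toCreate)
		if batchSize > maxBatchSize {
			batchSize = maxBatchSize
		}
		batches = append(batches, toCreate[:batchSize])
		toCreate = toCreate[batchSize:]
	}
	return batches
}

func main() {
	ids := make([]int, 250)
	for i := range ids {
		ids[i] = i
	}
	batches := splitIntoBatches(ids, 100)
	fmt.Println(len(batches))    // 3 batches: 100 + 100 + 50
	fmt.Println(len(batches[2])) // the last batch holds the 50 leftovers
}
```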
pkg/jobs/registry.go, line 454 at r8 (raw file):
if _, err = r.ex.Exec(ctx, "job-rows-batch-insert", txn, stmtSB.String(), args..., ); err != nil {
nit: I find this wrapping to be bizarre. My preference is:
if _, err = r.ex.Exec(
ctx, "job-rows-batch-insert", txn, stmtSB.String(), args...,
); err != nil {
pkg/jobs/registry_test.go, line 302 at r8 (raw file):
}
skip.UnderStress(t)
Isn't this covered above?
pkg/jobs/registry_test.go, line 334 at r8 (raw file):
// Create jobs in a batch.
var jobs []*Job
require.NoError(t, kvDB.Txn(ctx, func(ctx context.Context, txn *kv.Txn) error {
I think you're going to have a bad time if the callback gets called more than once. You should reset the jobs slice in the callback.
pkg/jobs/registry_test.go, line 350 at r8 (raw file):
Quoted 4 lines of code…
// and description? I was expecting that the type will be "IMPORT" in
// the jobs table when the job is created, but it occurred to be NULL. Is
// that expected? Similarly, description is NULL.
When you say "type", what do you mean? Which table? The description is a field in the record so it should be what you set it to. Maybe set it to some expected values.
for _, mj := range tableDesc.MutationJobs {
	jobID := jobspb.JobID(mj.JobID)
	if record, exists := p.ExtendedEvalContext().SchemaChangeJobCache[tableDesc.ID]; exists {
This deserves some commentary for future readers as well as some commentary about this behavior on the function itself.
pkg/sql/table.go, line 119 at r8 (raw file):
Quoted 4 lines of code…
var record *jobs.Record
if cachedRecord, ok := p.extendedEvalCtx.SchemaChangeJobCache[tableDesc.ID]; ok {
	record = cachedRecord
}
nit: replace this with the following code to allow you to eliminate some code.
record, recordExists := p.extendedEvalCtx.SchemaChangeJobCache[tableDesc.ID]
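The comma-ok form works because a map lookup on a missing key yields the zero value (nil for a pointer), so the existence flag and a nil check carry the same information. A tiny standalone demonstration with made-up types:

```go
package main

import "fmt"

type record struct{ id int }

func main() {
	cache := map[int]*record{42: {id: 7}}

	// Verbose form being replaced in the review comment above.
	var rec *record
	if cached, ok := cache[42]; ok {
		rec = cached
	}

	// Suggested comma-ok form: rec2 is nil exactly when the key is absent.
	rec2, exists := cache[42]
	fmt.Println(rec == rec2, exists) // true true

	missing, exists := cache[99]
	fmt.Println(missing == nil, exists) // true false
}
```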
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner)
pkg/bench/rttanalysis/testdata/benchmark_expectations, line 80 at r8 (raw file):
Previously, ajwerner wrote…
26,Truncate/truncate_1_column_0_rows
26,Truncate/truncate_1_column_1_row
26,Truncate/truncate_1_column_2_rows
26,Truncate/truncate_2_column_0_rows
26,Truncate/truncate_2_column_1_rows
26,Truncate/truncate_2_column_2_rows
I have a feeling we're going to want to retain the range of round-trips here (put the 26-27 back).
Done.
pkg/jobs/jobs.go, line 81 at r2 (raw file):
Previously, ajwerner wrote…
Yeah, my advice is that we add ID as a field to Record. It should simplify the interfaces.
Done.
pkg/jobs/registry.go, line 375 at r8 (raw file):
Previously, ajwerner wrote…
no further jobs are created and the error is returned.
I find this language confusing. The caller shouldn't need to care about other jobs having been created. It makes it sound like it is not a transactional operation.
Done.
pkg/jobs/registry.go, line 378 at r8 (raw file):
Previously, ajwerner wrote…
Do callers want the *Job or would they be happy with just a slice of IDs? I feel like IDs might be better.
That's right. I have changed it to IDs.
pkg/jobs/registry.go, line 382 at r8 (raw file):
Previously, ajwerner wrote…
can you refactor this a bit into a helper to write a batch and a helper to generate the statement and args. I think it should read something like:
(refactoring sketch quoted above)
This is much better. Thanks for the direction. I have updated.
pkg/jobs/registry.go, line 454 at r8 (raw file):
Previously, ajwerner wrote…
if _, err = r.ex.Exec(ctx, "job-rows-batch-insert", txn, stmtSB.String(), args..., ); err != nil {
nit: I find this wrapping to be bizarre. My preference is:
if _, err = r.ex.Exec(
	ctx, "job-rows-batch-insert", txn, stmtSB.String(), args...,
); err != nil {
Thanks, I have updated.
pkg/jobs/registry_test.go, line 302 at r8 (raw file):
Previously, ajwerner wrote…
Isn't this covered above?
Good catch, that was a mistake.
pkg/jobs/registry_test.go, line 334 at r8 (raw file):
Previously, ajwerner wrote…
I think you're going to have a bad time if the callback gets called more than once. You should reset the jobs slice in the callback.
Done.
pkg/jobs/registry_test.go, line 350 at r8 (raw file):
Previously, ajwerner wrote…
(comment quoted above)
When you say "type", what do you mean? Which table? The description is a field in the record so it should be what you set it to. Maybe set it to some expected values.
It was job_type in the internal table as we discussed offline.
pkg/sql/drop_table.go, line 509 at r8 (raw file):
Previously, ajwerner wrote…
This deserves some commentary for future readers as well as some commentary about this behavior on the function itself.
OK. Done.
pkg/sql/planner.go, line 96 at r2 (raw file):
Previously, ajwerner wrote…
the test logs show messages related to throttling and ranges going down.
I have a feeling that that's just noise at the end of the test and is not the true failure.
I have updated the structure.
pkg/sql/table.go, line 119 at r8 (raw file):
Previously, ajwerner wrote…
(code quoted above)
nit: replace this with the following code to allow you to eliminate some code.
record, recordExists := p.extendedEvalCtx.SchemaChangeJobCache[tableDesc.ID]
Done.
622cbc6
to
29c3d17
Compare
Get the linter passing and fix the nits and I think this is good to go.
Reviewed 1 of 3 files at r1, 1 of 14 files at r10.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner and @sajjadrizvi)
pkg/jobs/registry_test.go, line 334 at r8 (raw file):
Previously, sajjadrizvi (Sajjad Rizvi) wrote…
Done.
Not sure what I was talking about here
pkg/sql/drop_table.go, line 515 at r10 (raw file):
) error {
	for _, mj := range tableDesc.MutationJobs {
		mutationID := jobspb.JobID(mj.JobID)
mutationID is confusing as the name of a variable storing a job ID. MutationID is a different concept which is related to jobs. jobID was a pretty good name, no?
pkg/sql/drop_table.go, line 521 at r10 (raw file):
if record, exists := p.ExtendedEvalContext().SchemaChangeJobCache[tableDesc.ID]; exists {
	if record.JobID == mutationID {
nit: inline the condition with &&
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner)
pkg/sql/drop_table.go, line 515 at r10 (raw file):
Previously, ajwerner wrote…
mutationID is confusing as the name of a variable storing a job ID. MutationID is a different concept which is related to jobs. jobID was a pretty good name, no?
Yeah, I didn't realize mutation ID is an existing term.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner and @sajjadrizvi)
pkg/bench/rttanalysis/testdata/benchmark_expectations, line 74 at r10 (raw file):
1,SystemDatabaseQueries/select_system.users_with_empty_database_Name
1,SystemDatabaseQueries/select_system.users_with_schema_Name
2,SystemDatabaseQueries/select_system.users_without_schema_Name
What is this about?
pkg/jobs/registry.go, line 416 at r10 (raw file):
sessionID sqlliveness.SessionID, txn *kv.Txn, records []*Record,
) (string, []interface{}, []jobspb.JobID, error) {
	const columnValues = " ($%d, $%d, $%d, $%d, $%d, $%d)"
I don't think we're going to need this with the refactor I'm about to propose.
pkg/jobs/registry.go, line 424 at r10 (raw file):
Quoted 5 lines of code…
start := timeutil.Now()
if txn != nil {
	start = txn.ReadTimestamp().GoTime()
}
modifiedMicros := timeutil.ToUnixMicros(start)
nit: lift this above this call and just pass in the timestamp
pkg/jobs/registry.go, line 427 at r10 (raw file):
// We have six columns per job.
args := make([]interface{}, 0, len(records)*6)
I'm going to make a nit-picky suggestion for how to make this feel a little cleaner:
instanceID := r.ID()
const numColumns = 6
columns := [numColumns]string{`id`, `status`, `payload`, `progress`, `claim_session_id`, `claim_instance_id`}
marshalPanic := func(m proto.Message) []byte {
	data, err := protoutil.Marshal(m)
	if err != nil {
		panic(err)
	}
	return data
}
valueFns := map[string]func(*Record) interface{}{
	`id`:                func(rec *Record) interface{} { return rec.ID },
	`status`:            func(rec *Record) interface{} { return StatusRunning },
	`payload`:           func(rec *Record) interface{} { return marshalPanic(rec.Payload) },
	`progress`:          func(rec *Record) interface{} { return marshalPanic(rec.Progress) },
	`claim_session_id`:  func(rec *Record) interface{} { return sessionID.UnsafeBytes() },
	`claim_instance_id`: func(rec *Record) interface{} { return instanceID },
}
appendValues := func(rec *Record, vals *[]interface{}) (err error) {
	defer func() {
		switch r := recover().(type) {
		case nil:
		case error:
			err = errors.CombineErrors(err, errors.Wrapf(r, "encoding job %d", rec.ID))
		default:
			panic(r)
		}
	}()
	for _, c := range columns {
		*vals = append(*vals, valueFns[c](rec))
	}
	return nil
}
args := make([]interface{}, 0, len(records)*numColumns)
jobIDs := make([]jobspb.JobID, 0, len(records))
var buf strings.Builder
buf.WriteString(`INSERT INTO system.jobs (`)
buf.WriteString(strings.Join(columns[:], ", "))
buf.WriteString(`) VALUES `)
argIdx := 1
for i, rec := range records {
	if i > 0 {
		buf.WriteString(", ")
	}
	buf.WriteString("(")
	for j := range columns {
		if j > 0 {
			buf.WriteString(", ")
		}
		buf.WriteString("$")
		buf.WriteString(strconv.Itoa(argIdx))
		argIdx++
	}
	buf.WriteString(")")
	rec.Progress.ModifiedMicros = modifiedMicros
	if err := appendValues(rec, &args); err != nil {
		return "", nil, nil, err
	}
	jobIDs = append(jobIDs, rec.ID)
}
return buf.String(), args, jobIDs, nil
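The statement-building portion of this sketch can be tried in isolation; a simplified, hypothetical version that only produces the SQL text with 1-based placeholders (no CockroachDB types involved; the function name is made up for the example):

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// batchInsertStmt builds a multi-row INSERT with 1-based placeholders,
// e.g. "INSERT INTO system.jobs (id, status) VALUES ($1, $2), ($3, $4)".
// Placeholder numbering runs continuously across all row tuples.
func batchInsertStmt(table string, columns []string, numRows int) string {
	var buf strings.Builder
	buf.WriteString("INSERT INTO ")
	buf.WriteString(table)
	buf.WriteString(" (")
	buf.WriteString(strings.Join(columns, ", "))
	buf.WriteString(") VALUES ")
	argIdx := 1
	for i := 0; i < numRows; i++ {
		if i > 0 {
			buf.WriteString(", ")
		}
		buf.WriteString("(")
		for j := range columns {
			if j > 0 {
				buf.WriteString(", ")
			}
			buf.WriteString("$" + strconv.Itoa(argIdx))
			argIdx++
		}
		buf.WriteString(")")
	}
	return buf.String()
}

func main() {
	fmt.Println(batchInsertStmt("system.jobs", []string{"id", "status"}, 2))
	// INSERT INTO system.jobs (id, status) VALUES ($1, $2), ($3, $4)
}
```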
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner)
pkg/bench/rttanalysis/testdata/benchmark_expectations, line 74 at r10 (raw file):
Previously, ajwerner wrote…
1,SystemDatabaseQueries/select_system.users_with_empty_database_Name
1,SystemDatabaseQueries/select_system.users_with_schema_Name
2,SystemDatabaseQueries/select_system.users_without_schema_Name
What is this about?
I am not sure how it happened. I have now updated only Grant test numbers.
pkg/jobs/registry.go, line 416 at r10 (raw file):
Previously, ajwerner wrote…
I don't think we're going to need this with the refactor I'm about to propose.
OK.
pkg/jobs/registry.go, line 424 at r10 (raw file):
Previously, ajwerner wrote…
start := timeutil.Now()
if txn != nil {
	start = txn.ReadTimestamp().GoTime()
}
modifiedMicros := timeutil.ToUnixMicros(start)
nit: lift this above this call and just pass in the timestamp
Done.
pkg/jobs/registry.go, line 427 at r10 (raw file):
Previously, ajwerner wrote…
I'm going to make a nit-picky suggestion for how to make this feel a little cleaner:
(refactoring sketch quoted above)
I have updated accordingly.
c48274c
to
4e6017e
Compare
Reviewed 6 of 14 files at r12.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @ajwerner and @sajjadrizvi)
pkg/bench/rttanalysis/testdata/benchmark_expectations, line 43 at r12 (raw file):
19,DropSequence/drop_1_sequence
27,DropSequence/drop_2_sequences
35,DropSequence/drop_3_sequences
seems like some of these have changed.
pkg/jobs/registry.go, line 462 at r12 (raw file):
appendValues := func(rec *Record, vals *[]interface{}) (err error) {
	defer func() {
		switch ret := recover(); ret.(type) {

ret is a confusing name for the recovered value. It's explicitly not something being returned.
pkg/sql/conn_executor.go, line 1092 at r12 (raw file):
jobs jobsCollection

// schemaChangeJobsCache is a map of descriptor IDs to job Records.
I wonder if this deserves a new name.
pkg/sql/conn_executor_exec.go, line 809 at r12 (raw file):
// createJobs creates jobs for the records cached in schemaChangeJobsCache
// during this transaction.
func (ex *connExecutor) createJobs(ctx context.Context) error {

nit: I'd put this below commitSQLTransactionInternal. I usually like to read functions which invoke helpers before the helpers.
pkg/sql/planner.go, line 96 at r12 (raw file):
// SchemaChangeJobCache refers to schemaChangeJobsCache in extraTxnState.
SchemaChangeJobCache map[descpb.ID]*jobs.Record

In parallel to the other question about names, I wonder if this deserves a new name. SchemaChangeJobRecords?
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @ajwerner and @sajjadrizvi)
pkg/sql/planner.go, line 96 at r12 (raw file):
Previously, ajwerner wrote…
In parallel to the other question about names, I wonder if this deserves a new name. SchemaChangeJobRecords?
Done, I have updated the name to SchemaChangeJobRecords.
Force-pushed 4e6017e to 392aab8
Reviewed 13 of 14 files at r14.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @ajwerner and @sajjadrizvi)
This commit adds a new function in Registry, 'CreateJobsWithTxn', that allows creating jobs in a batch using a single SQL statement. This function is a precursor to creating schema-change jobs in a batch, which will reduce the number of round-trips between nodes during schema changes.

Release note: None
This commit changes the way schema-change jobs are created in a transaction. Previously, jobs were created individually during the execution of a transaction. Each job creation incurs a round trip to insert an entry in the jobs table. This commit accumulates the jobs to be created during a transaction and creates them in a batch, which is expected to reduce the number of round-trips between distant nodes during transaction execution. This is a step toward improving the performance of GRANT/REVOKE queries in large databases.

Release note: None

Fixes: cockroachdb#64389
Force-pushed 392aab8 to 12de76f
TFTR! bors r=ajwerner
Build succeeded.
Commit cockroachdb#67991 introduced a test that turned out to be flaky. The test sometimes runs out of memory as it creates a very large batch of jobs. Nevertheless, the test doesn't contribute much value to the test suite because it only creates a large batch of jobs. The core functionality is already being tested with moderately large batch sizes. Therefore, the test is now removed in this commit.

Release note: None

Fixes: cockroachdb#68962
Commit cockroachdb#67991 introduced a test that turned out to be flaky. The test sometimes runs out of memory as it creates a very large batch of jobs. This fix disables job adoptions to avoid large memory use.

Release note: None

Fixes: cockroachdb#68962
68803: sql: add ON UPDATE syntax r=otan a=pawalt
Previously, we did not have parsing implemented for ON UPDATE expressions on columns. This PR adds parsing for ON UPDATE expressions. Release note: None

69013: opt: fix panic in findJoinFilterRange r=mgartner a=mgartner
This commit fixes a panic produced in `findJoinFilterRange` when attempting to access the 0-th constraint in an empty constraint set. Fixes #68975. There is no release note because the bug is not present in any releases. Release note: None

69014: jobs: fix test for batch jobs creation, marked as flaky r=ajwerner a=sajjadrizvi
Commit #67991 introduced a test that turned out to be flaky. The test runs out of memory sometimes as it creates a very large batch of jobs. This fix disables job adoptions to avoid large memory use. Release note: None Fixes: #68962

69022: sql: implement combined iterator for transaction statistics r=maryliag a=Azhng
Follow up to #68675. Release note: None

69031: ccl/backupccl: skip TestBackupRestoreSystemJobsProgress r=otan a=otan
Refs: #68571. Reason: flaky test (this is flaking regularly on PRs). Generated by bin/skip-test. Release justification: non-production code changes. Release note: None

69034: roachtest/tests/cdc: effectively disable exponential backoff in jobs r=ajwerner a=ajwerner
Fixes #68963. Release note: None

69045: bazel: swap order of interfaces in generated mocks in `pkg/roachpb` r=rail a=rickystewart
This makes the order match up to what we have in the checked-in generated code, quashing a diff when you `dev build --hoist-generated-code`. Release note: None

Co-authored-by: Peyton Walters <[email protected]>
Co-authored-by: Marcus Gartner <[email protected]>
Co-authored-by: Sajjad Rizvi <[email protected]>
Co-authored-by: Azhng <[email protected]>
Co-authored-by: Oliver Tan <[email protected]>
Co-authored-by: Andrew Werner <[email protected]>
Co-authored-by: Ricky Stewart <[email protected]>
This commit changes the way jobs are created for a transaction.
Previously, the jobs were created individually during the
execution of a transaction. A job creation incurs a round trip
to insert an entry in system.jobs table.
This commit introduces a new function in Registry, 'CreateJobsWithTxn',
that allows creating jobs in a batch using a single SQL statement. Using
this function, this commit creates jobs in a batch during a transaction, which
reduces the number of round-trips between distant nodes during a transaction
execution. This is a step forward to improve the performance of GRANT/REVOKE
queries in large databases.
Release note: None
Fixes: #64389