importccl: move stub stats logic outside publish txn #70447
Conversation
I don't have a clear RCA for the associated issue yet, so I'm hoping that this proposed solution might get someone to see something I missed. Regardless, I feel more comfortable keeping the table publishing closure as small as possible, especially since the stats stubbing is best effort, and swallowing the error seems to be tickling something. Another open question is whether this refactor means we should add a `details.StatsInserted` bool to prevent us from inserting multiple stubs across resumptions.
:/ I want to reach enlightenment on this.
Nevertheless, the change itself :lgtm:
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @dt and @michae2)
pkg/ccl/importccl/import_stmt.go, line 2371 at r1 (raw file):
```go
	statistic.NullCount = nullCount
}
err = stats.InsertNewStats(ctx, execCfg.InternalExecutor, nil /* txn */, statistics)
```
When I look at this, I feel sad about how sequential it is. Hope this isn't a global cluster with hecka columns.
wait, actually, there's probably a reason it was in the txn. We probably want the stats published at or before we make the tables public, no? If we're going to do it, maybe do it before?
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @dt and @michae2)
Thanks for looking at this. Original motivation for same txn was to write stub stats atomically with table publishing, but before publishing would also be fine. After publishing would leave a gap during which queries would be planned without stats. (Granted, it's best-effort anyway, so not a huge issue, but might as well not have a gap.)
Reviewed 1 of 1 files at r1, all commit messages.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @dt)
pkg/ccl/importccl/import_stmt.go, line 2371 at r1 (raw file):
Previously, ajwerner wrote…
When I look at this, I feel sad about how sequential it is. Hope this isn't a global cluster with hecka columns.
Fair point. I'm starting work on distinct and null counts. @adityamaru, if you add something like `TODO(michae2): parallelize` somewhere, I can take a shot at it.
This change pulls the logic related to inserting table statistic stubs for new tables created during the import out of the txn closure that sets the tables to `public` from `offline`.

This was motivated by the failure mode observed in cockroachdb#70410, where we failed to write the stats stub for a particular table and ended up with a descriptor published in the offline state.

Fixes: cockroachdb#70410

Release note: None
Force-pushed from e697db6 to 5fba5c5.
I don't have a clear RCA for the associated issue yet, so I'm hoping that this proposed solution might get someone to see something I missed. Regardless, I feel more comfortable keeping the table publishing closure as small as possible, especially since the stats stubbing is best effort, and swallowing the error seems to be tickling something.
I'm not sure exactly what `TransactionAbortedError(ABORT_REASON_CLIENT_REJECT): "unnamed"` means yet, but it seems bad in general to keep using the same txn after the abort. So this is definitely an improvement.
Another open question: does this refactor mean we should add a `details.StatsInserted` bool to prevent us from inserting multiple stubs across resumptions? Does it matter?
I don't think it matters for query planning (the optimizer will simply use the latest stats). But as @ajwerner points out, writing the statistics is currently serial, so if the import has many tables or columns, writing redundant stats will add some extra latency to the IMPORT. Adding the bool sounds nice, if it's not too much trouble.
Reviewable status: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @dt and @michae2)
Reviewable status: complete! 1 of 0 LGTMs obtained (and 1 stale) (waiting on @dt and @michae2)
TFTR! Bazel flake is unrelated.

bors r=ajwerner,michae2
This PR was included in a batch that was canceled; it will be automatically retried.
For posterity, the RCA from Nathan in the related issue is consistent with what we were observing. I feel better about this change now 🙂
Build succeeded: |