import: check readability earlier #74863
I don't think there are reasonable unit tests to add here - behavior should be the same as before, just faster. But I'm open to suggestions if you have any! (Conceivably I could break out the "pre-import" stanza into a separate, named function, and test that, but that doesn't seem super-consistent with how we organize/test these files right now anyway)
I'm ~fine without a test, but if we wanted to write one, I think one easy-ish one would be to add a "debug pause point" to the ingestion phase, then send an import for one good file URI plus one or more bad URIs.
In theory, without this change, the job would enter the ingestion phase for the good URI and then pause on the pause point, but with this change it'd fail rather than pause. Entering the ingestion phase at all is what we want to detect, and avoid, when we know the job is doomed due to the bad URI.
However, the fact that we import files in random (Go map iteration) order is a wrinkle in that, since the test could spuriously pass (that is, fail on the bad URI rather than pause in ingestion) even without this change if it happened to pick the bad file first, and it doesn't really seem worth putting lots of time into changing this just for a test. Maybe we test by hand once or twice and call it good?
Do we typically build unit tests in the way you describe? Seems odd to me, but if it's customary then I'll dive in! It sounds like (a) the test-fail condition would be hanging indefinitely until test timeout, and (b) we'd be making pretty unusual use of breakpoints and debuggers (unless you didn't mean that literally?). Maybe I misunderstand? But otherwise I don't love that solution; I think a manual test or two might be more fitting for this circumstance.
Yeah, sorry, I didn't mean that literally at all, that was really unclear: I added a little helper thing last month in this PR, #73890. Basically, you could add a line to the import job, right when it moves into ingestion, that looks like Anyway, this is all academic -- a manual test sounds great and much simpler.
de4241c
to
709780e
Compare
nit: IMPORT isn't an enterprise feature so release note should be
63fa010
to
67d6e66
Compare
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @dt and @stevendanna)
pkg/ccl/importccl/read_import_base.go, line 175 at r1 (raw file):
Previously, dt (David Taylor) wrote…
I'm not sure you pushed the amended commit yet, but sure, that sounds fine.
Re defer in a loop with/without an anon function: it still "works" just fine, it just changes when it'd run -- the deferred calls would all be queued up to run at the end of the outer function rather than after each loop iteration, but they'll still run. I'm fine with it either way though. Maybe add a comment to explain that the anon function is there to run the deferred close immediately.
Ah, got it. Thanks! Added a comment.
fixed, thanks!
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @dt and @stevendanna)
Sorry, I should have nitted this earlier, at the same time as the release note category, but it isn't technically "user permissions" that we're checking. That could reasonably be interpreted as the SQL user's permissions inside the DB to import into the table or database being written to -- those are already checked much earlier than this, before we create the IMPORT job. Rather, it is the accessibility of the provided input URIs that is being verified in this change. If you want to mention permissions, I might clarify that we mean the cloud provider's bucket permissions, so something like: Release note (sql change): IMPORT now verifies that it can open and read from all of the provided input URIs before it starts processing them to detect and report any URIs that are malformed or lack the required bucket permission grants sooner.
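As a rough illustration of the behavior that release note describes, the sketch below opens and reads from every input URI before any processing starts, and fails on the first one that is not readable. It is only a sketch under stated assumptions, not the implementation in `read_import_base.go`: the `opener` type, `checkReadable`, `fakeOpener`, and the `good://` and `bad://` schemes are all hypothetical, and reading a single byte is just one way to confirm readability.

```go
package main

import (
	"errors"
	"fmt"
	"io"
	"strings"
)

// opener is a hypothetical stand-in for whatever opens an input URI.
type opener func(uri string) (io.ReadCloser, error)

// checkReadable opens each URI and reads one byte from it before any real
// processing starts, returning an error for the first URI that fails.
func checkReadable(open opener, uris map[int]string) error {
	for id, uri := range uris {
		err := func() error {
			r, err := open(uri)
			if err != nil {
				return err
			}
			// The anonymous function makes this Close run at the end of
			// each iteration rather than when checkReadable returns.
			defer r.Close()
			var buf [1]byte
			if _, err := r.Read(buf[:]); err != nil && !errors.Is(err, io.EOF) {
				return err
			}
			return nil
		}()
		if err != nil {
			return fmt.Errorf("input %d (%s) is not readable: %w", id, uri, err)
		}
	}
	return nil
}

// fakeOpener is a hypothetical opener: URIs starting with "bad://" fail,
// everything else yields a small in-memory file.
func fakeOpener(uri string) (io.ReadCloser, error) {
	if strings.HasPrefix(uri, "bad://") {
		return nil, errors.New("access denied")
	}
	return io.NopCloser(strings.NewReader("a,b,c\n")), nil
}

func main() {
	uris := map[int]string{0: "good://bucket/x.csv", 1: "bad://bucket/y.csv"}
	if err := checkReadable(fakeOpener, uris); err != nil {
		fmt.Println("import would fail before ingestion:", err)
	}
}
```

Because the check covers every URI up front, the outcome no longer depends on which file Go's map iteration happens to visit first during ingestion.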
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @benbardin and @stevendanna)
pkg/ccl/importccl/read_import_base.go, line 175 at r1 (raw file):
Previously, benbardin (Ben Bardin) wrote…
Ah, got it. Thanks! Added a comment.
Thanks!
(But of course, no good deed goes unpunished... Our style guide says to wrap comments at 80. Not blocking but would be nice if you need to push again for something else)
67d6e66
to
913712c
Compare
Release note (sql change): Import now checks readability earlier for multiple files, to fail sooner if e.g. permissions are invalid.
913712c
to
16aecb2
Compare
No worries! Fixed.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @benbardin and @stevendanna)
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @dt and @stevendanna)
pkg/ccl/importccl/read_import_base.go, line 175 at r1 (raw file):
Previously, dt (David Taylor) wrote…
Thanks!
(But of course, no good deed goes unpunished... Our style guide says to wrap comments at 80. Not blocking but would be nice if you need to push again for something else)
done!
bors r+
Build succeeded: