sql/distsqlrun: refactor tableReader to implement RowSource #20584
I will reiterate that I believe we should first work on row batching before going down this path because 1) it is definitely necessary regardless of other improvements and 2) it could reduce the benefit of this kind of change. Overall the change itself seems fine but if we make more changes to remove a row channel here and there, things could easily become very messy. Review status: 0 of 2 files reviewed at latest revision, all discussions resolved, some commit checks failed. Comments from Reviewable |
+1 to batching first :) Review status: 0 of 2 files reviewed at latest revision, all discussions resolved, some commit checks failed. Comments from Reviewable |
And there's another thing - there's been a series of triparty talks between
Radu, Rafa and myself about supporting synchronous scheduling of co-located
processors. You'd probably end up with something similar to what you have
here, but hopefully done in a more general way. I was hoping that line of
thought would marinate as we work towards unifying the execution engines.
BTW, I applaud the development of benchmarks like the one that led to this
PR.
Note that I hacked this up to see what could be accomplished. I'm not in any way proposing this as the right approach. Passing batches of rows will lessen the need for improving inter-processor communication speed, but not eliminate it. And this PR gives us an indication of the possible speed-up from row batching (assuming we don't change processor internals at the same time). @andreimatei How would synchronous scheduling of co-located processors work? Would it be any different from pushing this PR to its conclusion and eliding |
Another thing to note is that making the underlying MVCCScan go much faster probably exacerbates the difference in performance between the two approaches, making the need for eliding Comments from Reviewable |
f509431 to f309fc9
This is prep work for #20550, without actually taking the step of eliding RowChannels. The plan is to follow-up with similar refactorings of the other processors. Note the first commit is all context plumbing work. PTAL. Review status: 0 of 26 files reviewed at latest revision, 1 unresolved discussion. pkg/sql/distsqlrun/tablereader.go, line 374 at r2 (raw file):
@RaduBerinde Any insight on what to do here? The various Comments from Reviewable |
f309fc9 to fac7e36
Review status: 0 of 26 files reviewed at latest revision, 2 unresolved discussions, all commit checks successful. pkg/sql/distsqlrun/tablereader.go, line 314 at r2 (raw file):
Currently the table reader calls pkg/sql/distsqlrun/tablereader.go, line 374 at r2 (raw file): Previously, petermattis (Peter Mattis) wrote…
Currently, even if a consumer signals that it is done, it still "drains" the input (e.g. to get any remaining metadata). If that carries over, the consumer will keep running Next, so the right thing to do would be to set a flag and have the next Comments from Reviewable |
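The flag-based draining described in this comment can be sketched in Go roughly as follows. This is only an illustration under assumptions, not the tableReader code: `sliceSource`, `Row`, `Metadata`, and `drain` are hypothetical stand-ins, and the real `RowSource` interface has a different shape. The point is that `ConsumerDone` merely records the request, while the consumer keeps calling `Next` until the remaining metadata has been returned.

```go
package main

import "fmt"

// Row and Metadata are simplified, hypothetical stand-ins for the real types.
type Row []int

type Metadata struct {
	Trace string
}

// sliceSource is a hypothetical pull-based producer backed by a slice.
type sliceSource struct {
	rows     []Row
	trailing []Metadata // emitted once rows are exhausted or the consumer is done
	done     bool       // set by ConsumerDone
}

// ConsumerDone only records the request; the consumer keeps calling Next.
func (s *sliceSource) ConsumerDone() { s.done = true }

// Next returns at most one of (row, metadata). Both nil means fully drained.
func (s *sliceSource) Next() (Row, *Metadata) {
	if !s.done && len(s.rows) > 0 {
		r := s.rows[0]
		s.rows = s.rows[1:]
		return r, nil
	}
	if len(s.trailing) > 0 {
		m := s.trailing[0]
		s.trailing = s.trailing[1:]
		return nil, &m
	}
	return nil, nil
}

// drain consumes rows, signals ConsumerDone after limit rows, and then keeps
// calling Next so that trailing metadata is still collected.
func drain(s *sliceSource, limit int) (rows int, meta []Metadata) {
	for {
		row, m := s.Next()
		if row == nil && m == nil {
			return rows, meta
		}
		if m != nil {
			meta = append(meta, *m)
			continue
		}
		rows++
		if rows == limit {
			s.ConsumerDone()
		}
	}
}

func main() {
	s := &sliceSource{
		rows:     []Row{{1}, {2}, {3}},
		trailing: []Metadata{{Trace: "span-1"}},
	}
	rows, meta := drain(s, 1)
	fmt.Println(rows, len(meta))
}
```

In this sketch the producer stops handing out rows as soon as the flag is set, but the trailing metadata still reaches the consumer.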
d315d45 to ec11190
Review status: 0 of 26 files reviewed at latest revision, 2 unresolved discussions. pkg/sql/distsqlrun/tablereader.go, line 314 at r2 (raw file): Previously, RaduBerinde wrote…
Ah, I missed that. Is the contract that we'll send the misplanned ranges metadata and tracedata even if there is an error? Is there a reason this extra metadata isn't bundled into the same PTAL at the logic both in Comments from Reviewable |
Review status: 0 of 26 files reviewed at latest revision, 2 unresolved discussions. pkg/sql/distsqlrun/tablereader.go, line 314 at r2 (raw file): Previously, petermattis (Peter Mattis) wrote…
I think they can probably be packaged in the same I think we want the tracing data even if there's an error. Note that finishing early (through Comments from Reviewable |
Review status: 0 of 26 files reviewed at latest revision, 2 unresolved discussions, some commit checks pending. pkg/sql/distsqlrun/tablereader.go, line 314 at r2 (raw file): Previously, RaduBerinde wrote…
Yeah, packaging everything into a single The changes here have gotten complex enough that I'm not at all comfortable with them without additional testing. I'd like to add some testing which verifies both the Comments from Reviewable |
Just a meta-comment: I can easily see this series of changes turning into a can of worms and potentially causing a lot of conflicts with other ongoing work (row batching). Consider putting things on a separate branch for now (you can still post PRs for changes that go on that branch). |
That's a good point. I was hoping to do this processor-by-processor to avoid some uber merge nightmare. I think the row batching could take a similar approach. |
0d41438 to 149c909
Ok, I think I have all of the semantics of At this point, it looks feasible to implement
Review status: 0 of 4 files reviewed at latest revision, 3 unresolved discussions. pkg/sql/distsqlrun/flow.go, line 57 at r3 (raw file):
This was moved from pkg/sql/distsqlrun/tablereader.go, line 374 at r2 (raw file): Previously, RaduBerinde wrote…
Done. Comments from Reviewable |
Ping. I'd like to get this reviewed and merged piecemeal rather than changing all processors at once. This PR changes 3 of the 15 existing processors to implement Review status: 0 of 8 files reviewed at latest revision, 3 unresolved discussions. Comments from Reviewable |
LGTM @andreimatei should take a look too. And folks who are working on row batching should at least take a look at the new interface. CC @arjunravinarayan @asubiotto Review status: 0 of 8 files reviewed at latest revision, 2 unresolved discussions, some commit checks failed. pkg/sql/distsqlrun/tablereader.go, line 326 at r5 (raw file):
How can Comments from Reviewable |
90aa016 to 8950cd5
Review status: 0 of 8 files reviewed at latest revision, 2 unresolved discussions. pkg/sql/distsqlrun/tablereader.go, line 326 at r5 (raw file): Previously, RaduBerinde wrote…
It can't. I think this was detritus from earlier versions of this change (before I added Comments from Reviewable |
Review status: 0 of 8 files reviewed at latest revision, 13 unresolved discussions, some commit checks failed. pkg/sql/distsqlrun/tablereader.go, line 57 at r6 (raw file):
please comment how pkg/sql/distsqlrun/tablereader.go, line 201 at r6 (raw file):
this message is OK in pkg/sql/distsqlrun/tablereader.go, line 204 at r6 (raw file):
please add a TODO somewhere around here to remove the pkg/sql/distsqlrun/tablereader.go, line 229 at r6 (raw file):
pls comment nils (some of them were commented before :) ) pkg/sql/distsqlrun/tablereader.go, line 312 at r6 (raw file):
pkg/sql/distsqlrun/tablereader.go, line 315 at r6 (raw file):
not sure resetting these is a good idea (it forces you to not use pkg/sql/distsqlrun/tablereader.go, line 324 at r6 (raw file):
pls comment that err can be nil pkg/sql/distsqlrun/tablereader.go, line 327 at r6 (raw file):
pkg/sql/distsqlrun/tablereader.go, line 339 at r6 (raw file):
how come the linter is not yelling at you? pkg/sql/distsqlrun/tablereader.go, line 343 at r6 (raw file):
With processors implementing the pkg/sql/distsqlrun/tablereader.go, line 418 at r6 (raw file):
does it not work if you use tr.ctx (even if its span has been closed already)? I seem to remember that we made logging to finished spans "work" (not panic). pkg/sql/distsqlrun/tablereader_test.go, line 233 at r6 (raw file):
nit: you've done a superfluous change to this line, but you missed the opportunity to comment the nil :P Comments from Reviewable |
Review status: 0 of 8 files reviewed at latest revision, 13 unresolved discussions, some commit checks failed. pkg/sql/distsqlrun/tablereader.go, line 57 at r6 (raw file): Previously, andreimatei (Andrei Matei) wrote…
Done. pkg/sql/distsqlrun/tablereader.go, line 201 at r6 (raw file): Previously, andreimatei (Andrei Matei) wrote…
Fixed the message. pkg/sql/distsqlrun/tablereader.go, line 204 at r6 (raw file): Previously, andreimatei (Andrei Matei) wrote…
I'll remove it in a follow-on PR. I pinky promise. pkg/sql/distsqlrun/tablereader.go, line 229 at r6 (raw file): Previously, andreimatei (Andrei Matei) wrote…
Done. pkg/sql/distsqlrun/tablereader.go, line 312 at r6 (raw file): Previously, andreimatei (Andrei Matei) wrote…
Done. pkg/sql/distsqlrun/tablereader.go, line 315 at r6 (raw file): Previously, andreimatei (Andrei Matei) wrote…
Well, we use pkg/sql/distsqlrun/tablereader.go, line 324 at r6 (raw file): Previously, andreimatei (Andrei Matei) wrote…
Done. pkg/sql/distsqlrun/tablereader.go, line 327 at r6 (raw file): Previously, andreimatei (Andrei Matei) wrote…
Hmm, let me investigate this again. I think I took a look while developing this PR and determined that the comment was out of date. pkg/sql/distsqlrun/tablereader.go, line 339 at r6 (raw file): Previously, andreimatei (Andrei Matei) wrote…
Done. The linter only cares about exported types. pkg/sql/distsqlrun/tablereader.go, line 343 at r6 (raw file): Previously, andreimatei (Andrei Matei) wrote…
I think I already updated the comment on pkg/sql/distsqlrun/tablereader.go, line 418 at r6 (raw file): Previously, andreimatei (Andrei Matei) wrote…
Well, if we set pkg/sql/distsqlrun/tablereader_test.go, line 233 at r6 (raw file): Previously, andreimatei (Andrei Matei) wrote…
Done. Comments from Reviewable |
8950cd5 to 325cbe3
Review status: 0 of 8 files reviewed at latest revision, 13 unresolved discussions, some commit checks pending. pkg/sql/distsqlrun/tablereader.go, line 327 at r6 (raw file): Previously, petermattis (Peter Mattis) wrote…
I was wrong. There are several places in the code which extract the error and drop the rest of the metadata. Comments from Reviewable |
Review status: 0 of 8 files reviewed at latest revision, 13 unresolved discussions, some commit checks pending. pkg/sql/distsqlrun/tablereader.go, line 327 at r6 (raw file): Previously, petermattis (Peter Mattis) wrote…
Note that the over-the-wire proto is Comments from Reviewable |
Review status: 0 of 8 files reviewed at latest revision, 13 unresolved discussions, some commit checks pending. pkg/sql/distsqlrun/tablereader.go, line 327 at r6 (raw file): Previously, RaduBerinde wrote…
I am inclined to believe that "one at a time, and nothing comes after an error" is an easy to work with convention, but I don't have strong feelings. Comments from Reviewable |
Review status: 0 of 8 files reviewed at latest revision, 13 unresolved discussions, some commit checks failed. pkg/sql/distsqlrun/tablereader.go, line 327 at r6 (raw file): Previously, andreimatei (Andrei Matei) wrote…
Ok, so if an error is returned we don't need to return anything else. And if no error is returned I have to return a separate Comments from Reviewable |
325cbe3 to 5bc19cb
Review status: 0 of 8 files reviewed at latest revision, 13 unresolved discussions. pkg/sql/distsqlrun/tablereader.go, line 327 at r6 (raw file): Previously, petermattis (Peter Mattis) wrote…
Ok, I took a crack at this. It mostly works, though causes Comments from Reviewable |
Review status: 0 of 8 files reviewed at latest revision, 8 unresolved discussions, some commit checks failed. pkg/sql/distsqlrun/base.go, line 139 at r7 (raw file):
s/the consumer it is done/the consumer, well, it is done pkg/sql/distsqlrun/tablereader.go, line 315 at r6 (raw file): Previously, petermattis (Peter Mattis) wrote…
Well, if pkg/sql/distsqlrun/tablereader.go, line 327 at r6 (raw file): Previously, petermattis (Peter Mattis) wrote…
I think I might have spoken nonsense before. A producer can push other metadata after an error, and so can a RowSource return metadata after an error. Sorry. And it's quite useful, e.g. for sending the trace after an error in this processor. pkg/sql/distsqlrun/tablereader.go, line 339 at r6 (raw file): Previously, petermattis (Peter Mattis) wrote…
ignore: I personally have moved to the "Types is part of the RowSource interface" phrasing for interfaces with more than one method. The current phrasing doesn't quite compute. pkg/sql/distsqlrun/tablereader.go, line 343 at r6 (raw file): Previously, petermattis (Peter Mattis) wrote…
Ah, you did change something indeed, that's good. pkg/sql/distsqlrun/tablereader.go, line 73 at r8 (raw file):
If this stays, please put some comment on this field. pkg/sql/distsqlrun/tablereader.go, line 292 at r8 (raw file):
Ummm seems like someone is abusing something here, and also breaking the Comments from Reviewable |
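The point made above for tablereader.go line 327, that metadata may still follow an error (for example the trace), can be illustrated with a small Go sketch. Everything here is an assumption made for illustration: `Meta`, `failingSource`, and `collect` are hypothetical names, not the distsqlrun types. The consumer remembers the first error but keeps draining, so later metadata is not dropped.

```go
package main

import (
	"errors"
	"fmt"
)

// Meta is a simplified, hypothetical metadata record: it carries either an
// error or trace lines.
type Meta struct {
	Err   error
	Trace []string
}

// failingSource is a hypothetical source whose scan fails immediately but
// which still reports its trace afterwards.
type failingSource struct {
	queue []Meta
}

func newFailingSource() *failingSource {
	return &failingSource{queue: []Meta{
		{Err: errors.New("scan failed")},
		{Trace: []string{"scan started", "scan failed"}},
	}}
}

// Next returns the next metadata record, or nil once the source is drained.
func (s *failingSource) Next() *Meta {
	if len(s.queue) == 0 {
		return nil
	}
	m := s.queue[0]
	s.queue = s.queue[1:]
	return &m
}

// collect drains the source. It remembers the first error but keeps reading,
// so metadata that arrives after the error is preserved.
func collect(s *failingSource) (trace []string, err error) {
	for m := s.Next(); m != nil; m = s.Next() {
		if m.Err != nil && err == nil {
			err = m.Err
		}
		trace = append(trace, m.Trace...)
	}
	return trace, err
}

func main() {
	trace, err := collect(newFailingSource())
	fmt.Println(err, len(trace))
}
```

A consumer that returned as soon as it saw `Err` would lose the trace record, which is the behavior the "extract the error and drop the rest" call sites mentioned earlier would need to avoid.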
Refactor tableReader to implement the RowSource interface. Refactor
tableReader.Run() to be implemented in terms of tableReader.Next()
(i.e. the RowSource interface).

Adjusted BenchmarkTableReader to avoid using a RowBuffer. This shows the
benefit that can be achieved by using TableReader as a RowSource ("old"
below is with the benchmark modified to use a RowChannel).

name           old time/op  new time/op  delta
TableReader-8  11.6ms ± 5%  9.4ms ± 3%   -18.81% (p=0.000 n=10+10)

See cockroachdb#20550

Release note: None
Implement `tableReader.Run()` using the generalized `Run()` function. Release note: None
Release note: None
Release note: None
5bc19cb to e224833
TFTR! PTAL. Review status: 0 of 9 files reviewed at latest revision, 8 unresolved discussions. pkg/sql/distsqlrun/base.go, line 139 at r7 (raw file): Previously, andreimatei (Andrei Matei) wrote…
Done, though I just removed the extraneous pkg/sql/distsqlrun/tablereader.go, line 315 at r6 (raw file): Previously, andreimatei (Andrei Matei) wrote…
I've added an explicit pkg/sql/distsqlrun/tablereader.go, line 327 at r6 (raw file): Previously, andreimatei (Andrei Matei) wrote…
I believe the test was failing because there were no misplanned ranges and thus we were sending an empty pkg/sql/distsqlrun/tablereader.go, line 339 at r6 (raw file): Previously, andreimatei (Andrei Matei) wrote…
Done. pkg/sql/distsqlrun/tablereader.go, line 343 at r6 (raw file): Previously, andreimatei (Andrei Matei) wrote…
I'm not sure if anything needs to be commented here. Clearly post-processing needs to be done. Failure to do so will break tests dramatically. pkg/sql/distsqlrun/tablereader.go, line 73 at r8 (raw file): Previously, andreimatei (Andrei Matei) wrote…
Done. pkg/sql/distsqlrun/tablereader.go, line 292 at r8 (raw file): Previously, andreimatei (Andrei Matei) wrote…
Yes, there is a bit of encapsulation breakage here, though I'm hoping it is temporary. Once all processors implement Comments from Reviewable |
Review status: 0 of 9 files reviewed at latest revision, 6 unresolved discussions, some commit checks failed. pkg/sql/distsqlrun/base.go, line 139 at r7 (raw file): Previously, petermattis (Peter Mattis) wrote…
I kid. pkg/sql/distsqlrun/tablereader.go, line 315 at r6 (raw file): Previously, petermattis (Peter Mattis) wrote…
Thanks. SGTM about Background(); I now think it might not work with a closed span if we have lightstep enabled. pkg/sql/distsqlrun/tablereader.go, line 73 at r8 (raw file): Previously, petermattis (Peter Mattis) wrote…
danke Comments from Reviewable |
@arjunravinarayan, @asubiotto I'm going to merge this now. Feel free to leave additional comments which I'll address in follow-on PRs. I don't think this fundamentally affects row batching, but if it does I'm happy to deal with the fallout. |
Reviewed 1 of 26 files at r4, 4 of 4 files at r9, 3 of 3 files at r10, 2 of 2 files at r11, 2 of 2 files at r12. Comments from Reviewable |
Refactor tableReader to implement the RowSource interface. Refactor
tableReader.Run() to be implemented in terms of
tableReader.Next() (i.e. the RowSource interface).
See #20550
Release note: None
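
As a rough illustration of what "Run() implemented in terms of Next()" can look like, here is a minimal Go sketch. It assumes pared-down, hypothetical `RowSource` and `RowReceiver` interfaces (the real distsqlrun interfaces also carry metadata and consumer status) and is not the code merged in this PR.

```go
package main

import "fmt"

type Row []int

// RowSource is a pared-down, hypothetical pull interface.
type RowSource interface {
	// Next returns the next row; ok is false once the source is exhausted.
	Next() (row Row, ok bool)
}

// RowReceiver is a pared-down, hypothetical push interface.
type RowReceiver interface {
	// Push delivers a row and reports whether more rows are wanted.
	Push(row Row) (more bool)
	Close()
}

// Run pumps rows from src into dst until either side is finished.
func Run(src RowSource, dst RowReceiver) {
	defer dst.Close()
	for {
		row, ok := src.Next()
		if !ok || !dst.Push(row) {
			return
		}
	}
}

// countingSource yields n single-column rows: {0}, {1}, ...
type countingSource struct{ next, n int }

func (c *countingSource) Next() (Row, bool) {
	if c.next >= c.n {
		return nil, false
	}
	r := Row{c.next}
	c.next++
	return r, true
}

// bufferReceiver stores rows, asks for no more once it holds limit rows,
// and records whether Close was called.
type bufferReceiver struct {
	rows   []Row
	limit  int
	closed bool
}

func (b *bufferReceiver) Push(row Row) bool {
	b.rows = append(b.rows, row)
	return len(b.rows) < b.limit
}

func (b *bufferReceiver) Close() { b.closed = true }

func main() {
	dst := &bufferReceiver{limit: 10}
	Run(&countingSource{n: 3}, dst)
	fmt.Println(len(dst.rows), dst.closed)
}
```

With this shape, a processor only has to implement `Next`; the same `Run` loop can serve every processor, and a co-located consumer can call `Next` directly without a channel in between.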