
sql: track bytes and rows read unconditionally for every statement #38002

Merged
1 commit merged on Jun 19, 2019

Conversation

yuzefovich
Member

In order to propagate metrics about bytes and rows read we use
the existing tracing infrastructure, but, instead of using recording
spans, we use recordable ones. All processors get a recordable span,
but only table readers propagate their stats when full tracing is
disabled.

Addresses: #37409.

In order to see the performance hit, I ran the kv workload with read-only queries (I guess it should have the largest hit) and saw a decrease of around 3%. Here is a representative output of 30 seconds of load:

```
With stats:
_elapsed___errors_____ops(total)___ops/sec(cum)__avg(ms)__p50(ms)__p95(ms)__p99(ms)_pMax(ms)__result
   30.0s        0         128368         4278.2      0.2      0.3      0.4      0.6     10.0

Without stats:
_elapsed___errors_____ops(total)___ops/sec(cum)__avg(ms)__p50(ms)__p95(ms)__p99(ms)_pMax(ms)__result
   30.0s        0         131946         4398.1      0.2      0.3      0.4      0.6     10.0
```

Release note: None

@yuzefovich yuzefovich requested review from ajwerner, asubiotto and a team June 4, 2019 14:10
@cockroach-teamcity
Member

This change is Reviewable

@yuzefovich
Member Author

At the moment, these metrics are only exposed by storing them in the crdb_internal.node_statement_statistics table. I think we should show them on the statement detail page, but that should probably be done in a separate PR.

I might not be fully understanding what I'm doing here, so please take a very critical look. Also cc @jordanlewis @andreimatei if you guys have time to take a look as well.

@yuzefovich yuzefovich force-pushed the metrics branch 2 times, most recently from 067aa07 to 0ba41e7 June 4, 2019 15:49
@ajwerner
Contributor

ajwerner commented Jun 4, 2019

3% is sort of a bummer; I'm interested in digging more into the perf implications.

Contributor

@ajwerner ajwerner left a comment

```
name             old ops/s   new ops/s   delta
KV99-throughput  42.1k ± 3%  40.3k ± 3%  -4.17%  (p=0.032 n=5+5)

name             old ms/s    new ms/s    delta
KV99-P50          8.10 ± 0%   8.60 ± 3%  +6.17%  (p=0.016 n=4+5)
KV99-Avg          2.64 ± 6%   2.74 ± 6%    ~     (p=0.333 n=5+5)
```

This is with longer runs on your later revision. We're going to need to look at the source of inefficiency. Hopefully we can pre-allocate or pool some objects to mitigate the cost.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner and @asubiotto)

Contributor

@asubiotto asubiotto left a comment

This looks good to me minus the performance cost. There are several things you could try out:

  1. In the tableReader, tags are set on the span. This is unnecessary and something we can avoid, although we don't go into getSpans, which would be the more expensive part.
  2. There is no need to create any spans; that is probably expensive as well. I would just create a RecordedSpan with the Stats field set when the tableReader returns metadata (see the sketch below).
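
For illustration, here is a minimal, self-contained Go sketch of the shape of suggestion 2. All type names below are local stand-ins for distsqlpb.ProducerMetadata, tracing.RecordedSpan, and the table reader stats proto, not the real definitions (in particular, the real Stats field holds a marshaled protobuf Any rather than a plain struct):

```go
package main

import "fmt"

// Local stand-ins for the real protobuf-generated types.
type tableReaderStats struct {
	BytesRead, RowsRead int64
}

type recordedSpan struct {
	Stats *tableReaderStats
}

type producerMetadata struct {
	TraceData []recordedSpan
}

// statsMeta builds a metadata record that carries the reader's counters
// directly, without ever starting (or finishing) a tracing span.
func statsMeta(bytesRead, rowsRead int64) *producerMetadata {
	return &producerMetadata{
		TraceData: []recordedSpan{
			{Stats: &tableReaderStats{BytesRead: bytesRead, RowsRead: rowsRead}},
		},
	}
}

func main() {
	meta := statsMeta(4096, 100)
	fmt.Printf("bytes=%d rows=%d\n",
		meta.TraceData[0].Stats.BytesRead, meta.TraceData[0].Stats.RowsRead)
}
```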

Reviewed 11 of 15 files at r1.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @asubiotto and @yuzefovich)


pkg/sql/distsql_running.go, line 542 at r1 (raw file):

						r.rowsRead += trs.InputStats.NumRows
					} else {
						// TODO(yuzefovich): what should we do here?

I think this is fine


pkg/util/tracing/tracer_span.go, line 250 at r1 (raw file):

		stats, err := types.MarshalAny(s.mu.stats)
		if err != nil {
			// TODO(yuzefovich): what should we do here?

getSpans panics as well.

@yuzefovich yuzefovich force-pushed the metrics branch 3 times, most recently from b17b0dc to 34974ef June 5, 2019 21:53
@yuzefovich yuzefovich requested a review from a team June 5, 2019 21:53
@yuzefovich yuzefovich force-pushed the metrics branch 11 times, most recently from 4de3b7a to ec8d4e2 June 6, 2019 22:25
Member Author

@yuzefovich yuzefovich left a comment

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @ajwerner and @asubiotto)


pkg/sql/distsql_running.go, line 542 at r1 (raw file):

Previously, asubiotto (Alfonso Subiotto Marqués) wrote…

I think this is fine

Removed the changes.


pkg/sql/distsqlpb/data.go, line 216 at r2 (raw file):

Previously, ajwerner wrote…

My preference would be that this pool not be exported and that we instead have strongly typed functions.

Done.
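
For reference, a minimal Go sketch of the "unexported pool plus strongly typed functions" pattern being asked for here. MetricsMeta and its fields are hypothetical names used only for this sketch, not necessarily what the PR ended up with:

```go
package distsqlpb

import "sync"

// MetricsMeta is a hypothetical pooled payload carrying the read counters.
type MetricsMeta struct {
	BytesRead int64
	RowsRead  int64
}

// metricsMetaPool is deliberately unexported: callers go through the strongly
// typed helpers below instead of type-asserting out of the pool themselves.
var metricsMetaPool = sync.Pool{
	New: func() interface{} { return &MetricsMeta{} },
}

// GetMetricsMeta returns a MetricsMeta from the pool. The caller is expected
// to assign every field before use, since released objects are not reset here.
func GetMetricsMeta() *MetricsMeta {
	return metricsMetaPool.Get().(*MetricsMeta)
}

// Release returns the object to the pool so that a later GetMetricsMeta can
// reuse the allocation.
func (m *MetricsMeta) Release() {
	metricsMetaPool.Put(m)
}
```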


pkg/sql/distsqlrun/tablereader.go, line 363 at r3 (raw file):

Previously, asubiotto (Alfonso Subiotto Marqués) wrote…

Is it necessary to add a Metrics field to metadata instead of just doing something like distsqlpb.ProducerMetadata{TraceData: []RecordedSpan{Stats: &TableReaderStats{BytesRead: bytes, RowsRead: rows}}}? We can use a RecordedSpan pool. In any case, it would be good to make sure that these bytes are also added to the EXPLAIN ANALYZE plan in case of tracing.

The reasoning for adding the Metrics field was an attempt at removing one marshaling/unmarshaling conversion, but the benchmarks did not show a significant difference from that, so I removed it as we discussed.

Bytes read do show up in the output of EXPLAIN ANALYZE since they are added to TableReaderStats in outputStatsToTrace.


pkg/sql/distsqlrun/tablereader.go, line 377 at r3 (raw file):

Previously, asubiotto (Alfonso Subiotto Marqués) wrote…

Why bother with a metricsCollector? Since this is unconditional, we could just have the tableReader keep track of the number of rows read.

Indeed, I feel so silly :/
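
To make the suggestion concrete, a tiny hypothetical sketch of "just keep track" - the type and method names are invented for illustration and are not the real tableReader code:

```go
package distsqlrun

// tableReaderSketch is a stripped-down stand-in for the real tableReader; it
// only shows where the unconditional bookkeeping would live.
type tableReaderSketch struct {
	bytesRead int64
	rowsRead  int64
}

// onRow stands in for the point in Next() where a fetched row is handed to
// the consumer; the counters are bumped unconditionally, with no tracing span
// or separate stats collector involved.
func (tr *tableReaderSketch) onRow(rowBytes int64) {
	tr.bytesRead += rowBytes
	tr.rowsRead++
}
```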


pkg/util/tracing/tracer_span.go, line 250 at r1 (raw file):

Previously, asubiotto (Alfonso Subiotto Marqués) wrote…

getSpans panics as well.

Removed the changes.

@yuzefovich yuzefovich added the do-not-merge label (bors won't merge a PR with this label) on Jun 17, 2019
@yuzefovich yuzefovich force-pushed the metrics branch 4 times, most recently from ca84892 to ca4728d June 17, 2019 23:32
Member Author

@yuzefovich yuzefovich left a comment

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @ajwerner and @asubiotto)


pkg/sql/distsqlrun/tablereader.go, line 363 at r3 (raw file):

Previously, yuzefovich wrote…

The reasoning for adding Metrics field was attempt at removing one marshaling/unmarshaling conversion, but the benchmarks do not show significant changes because of that, so I removed it as we discussed.

Bytes read do show up in the output of EXPLAIN ANALYZE since they are added to TableReaderStats in outputStatsToTrace.

My initial change was incomplete - the tests were failing due to duplicate spans. My understanding is that when we leave the SpanID field of RecordedSpan uninitialized, it has the value 0, which belongs to the root span, and that breaks some assumptions. I think I have now fixed the issue, but rerunning the benchmarks shows a slight performance decrease (I believe due to marshaling/unmarshaling, as I was thinking earlier). Here are the numbers for kv99 run 4 times for 4 minutes with a 10-second ramp ("old" is the version using the Metrics field, "new" is this PR at the moment):

gcp:
```
name             old ops/s   new ops/s   delta
KV99-throughput  17.8k ± 2%  17.0k ± 3%  -4.39%  (p=0.029 n=4+4)

name             old ms/s    new ms/s    delta
KV99-P50          19.7 ± 7%   20.7 ± 4%    ~     (p=0.257 n=4+4)
KV99-Avg          6.00 ± 0%   6.38 ± 4%  +6.25%  (p=0.029 n=4+4)
```
aws:
```
name             old ops/s   new ops/s   delta
KV99-throughput  29.8k ± 1%  29.1k ± 3%    ~     (p=0.200 n=4+4)

name             old ms/s    new ms/s    delta
KV99-P50          11.8 ± 3%   11.8 ± 7%    ~     (p=1.000 n=4+4)
KV99-Avg          3.70 ± 0%   3.80 ± 0%  +2.70%  (p=0.029 n=4+4)
```

The numbers look quite noisy, but it seems that there is a 2-3% performance hit (in addition to about a 1% performance hit for the Metrics binary), so now I'm leaning towards going back to using the Metrics field again.

Contributor

@asubiotto asubiotto left a comment

Reviewed 1 of 12 files at r4.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @ajwerner, @asubiotto, and @yuzefovich)


pkg/sql/distsql_running.go, line 621 at r4 (raw file):

		}
	}
	meta.Release()

Where is meta being acquired from the pool?


pkg/sql/distsqlrun/tablereader.go, line 363 at r3 (raw file):

Previously, yuzefovich wrote…

My initial change was incomplete - the tests were failing due to duplicate spans. My understanding is that when we leave the SpanID field of RecordedSpan uninitialized, it has the value 0, which belongs to the root span, and that breaks some assumptions. I think I have now fixed the issue, but rerunning the benchmarks shows a slight performance decrease (I believe due to marshaling/unmarshaling, as I was thinking earlier). Here are the numbers for kv99 run 4 times for 4 minutes with a 10-second ramp ("old" is the version using the Metrics field, "new" is this PR at the moment):

gcp:
```
name             old ops/s   new ops/s   delta
KV99-throughput  17.8k ± 2%  17.0k ± 3%  -4.39%  (p=0.029 n=4+4)

name             old ms/s    new ms/s    delta
KV99-P50          19.7 ± 7%   20.7 ± 4%    ~     (p=0.257 n=4+4)
KV99-Avg          6.00 ± 0%   6.38 ± 4%  +6.25%  (p=0.029 n=4+4)
```
aws:
```
name             old ops/s   new ops/s   delta
KV99-throughput  29.8k ± 1%  29.1k ± 3%    ~     (p=0.200 n=4+4)

name             old ms/s    new ms/s    delta
KV99-P50          11.8 ± 3%   11.8 ± 7%    ~     (p=1.000 n=4+4)
KV99-Avg          3.70 ± 0%   3.80 ± 0%  +2.70%  (p=0.029 n=4+4)
```

The numbers look quite noisy, but it seems that there is a 2-3% performance hit (in addition to about a 1% performance hit for the Metrics binary), so now I'm leaning towards going back to using the Metrics field again.

Going back to using the Metrics field sounds like a good call

Member Author

@yuzefovich yuzefovich left a comment

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @ajwerner and @asubiotto)


pkg/sql/distsql_running.go, line 621 at r4 (raw file):

Previously, asubiotto (Alfonso Subiotto Marqués) wrote…

Where is meta being acquired from the pool?

Within generateMeta of tableReader.


pkg/sql/distsqlrun/tablereader.go, line 363 at r3 (raw file):

Previously, asubiotto (Alfonso Subiotto Marqués) wrote…

Going back to using the Metrics field sounds like a good call

Ok, I restored that version. Please take a - hopefully last - look.

@yuzefovich yuzefovich removed the do-not-merge label (bors won't merge a PR with this label) on Jun 18, 2019
Contributor

@asubiotto asubiotto left a comment

Reviewed 3 of 12 files at r4, 9 of 9 files at r5.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @asubiotto and @yuzefovich)


pkg/sql/distsql_running.go, line 621 at r4 (raw file):

Previously, yuzefovich wrote…

Within generateMeta of tableReader.

But then you'd also want to acquire and release it in inbound streams and outboxes in the case of distributed flows, right?


pkg/sql/distsqlpb/data.go, line 226 at r5 (raw file):

// Release is part of Releasable interface. Note that we do not explicitly
// reset the values of the fields since those will be correctly assigned during

I think it's convention to reset them though (e.g. table reader specs)
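
Following that convention, the Release from the hypothetical pool sketch earlier in this thread would zero the struct before putting it back, along these lines:

```go
// Release resets the object to its zero value before returning it to the
// pool (matching the convention used for table reader specs), so stale field
// values can never leak into the next user.
func (m *MetricsMeta) Release() {
	*m = MetricsMeta{}
	metricsMetaPool.Put(m)
}
```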


pkg/sql/distsqlrun/tablereader.go, line 303 at r5 (raw file):

		}

		tr.rowsRead++

I would add a comment that rowsRead are counted twice when tracing (I think). Once in the InputStats, then again here.

@yuzefovich yuzefovich force-pushed the metrics branch 2 times, most recently from 3a0a19c to 69e47c3 June 18, 2019 19:58
Member Author

@yuzefovich yuzefovich left a comment

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @ajwerner and @asubiotto)


pkg/sql/distsql_running.go, line 621 at r4 (raw file):

Previously, asubiotto (Alfonso Subiotto Marqués) wrote…

But then you'd also want to acquire and release it in inbound streams and outboxes in the case of distributed flows, right?

Hm, nice catch. You're right - I think, as it was before, I would only release the meta to the pool on the gateway node, but I'd be getting it from the pool on all nodes. I added another acquisition/release in stream encoder/decoder which should be (?) sufficient.

The change seems a little dirty in RemoteProducerMetaToLocalMeta since we want to allocate a new ProducerMetadata object in all cases except for the Metrics one. Let me know if you see a cleaner way to go about it (I could extract a separate type check for metrics to put before the type switch, but that seems inefficient and no cleaner).

Let's see what the build says about that change.


pkg/sql/distsqlpb/data.go, line 226 at r5 (raw file):

Previously, asubiotto (Alfonso Subiotto Marqués) wrote…

I think it's convention to reset them though (e.g. table reader specs)

Ok, done.


pkg/sql/distsqlrun/tablereader.go, line 303 at r5 (raw file):

Previously, asubiotto (Alfonso Subiotto Marqués) wrote…

I would add a comment that rowsRead are counted twice when tracing (I think). Once in the InputStats, then again here.

Done.

@yuzefovich yuzefovich force-pushed the metrics branch 2 times, most recently from bac349d to d374e87 June 19, 2019 01:54
Contributor

@asubiotto asubiotto left a comment

:lgtm: minus consolidation of metrics release into meta release.

Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (and 1 stale) (waiting on @ajwerner, @asubiotto, and @yuzefovich)


pkg/sql/distsql_running.go, line 529 at r6 (raw file):

			r.bytesRead += meta.Metrics.BytesRead
			r.rowsRead += meta.Metrics.RowsRead
			meta.Metrics.Release()

It might make sense to do something similar to ReleaseSetupFlowRequest, where that takes care of checking and releasing fields instead of having to remember to do so.
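
For comparison, a rough sketch of that consolidation in the spirit of ReleaseSetupFlowRequest, reusing the hypothetical MetricsMeta from the earlier sketches; ProducerMetadataSketch and releaseMetadata are invented names, not the real API:

```go
// ProducerMetadataSketch is a hypothetical stand-in for ProducerMetadata with
// just the one pooled field relevant here.
type ProducerMetadataSketch struct {
	Metrics *MetricsMeta
}

// releaseMetadata checks and releases any pooled fields so that callers don't
// have to remember to release each field individually.
func releaseMetadata(meta *ProducerMetadataSketch) {
	if meta == nil {
		return
	}
	if meta.Metrics != nil {
		meta.Metrics.Release()
		meta.Metrics = nil
	}
	// Other pooled fields, if any, would be checked and released here as well.
}
```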

We propagate metrics about bytes and rows read unconditionally by
emitting appropriate metadata when table readers are drained.

Release note: None
@yuzefovich
Member Author

As discussed with Alfonso, the two objects have different lifetimes, so it is easier to keep the two Release methods separate.

I kicked off another testing run, and the numbers look good (by mistake, both clusters were on GCP):

```
name             old ops/s   new ops/s   delta
KV99-throughput  19.7k ± 2%  19.5k ± 2%   ~     (p=0.886 n=4+4)

name             old ms/s    new ms/s    delta
KV99-P50          17.6 ± 8%   17.6 ± 4%   ~     (p=1.000 n=4+4)
KV99-Avg          5.57 ± 4%   5.57 ± 4%   ~     (p=1.000 n=4+4)
```
```
name             old ops/s   new ops/s   delta
KV99-throughput  19.0k ± 2%  19.5k ± 2%   ~     (p=0.114 n=4+4)

name             old ms/s    new ms/s    delta
KV99-P50          19.6 ± 4%   18.1 ± 5%   ~     (p=0.057 n=4+4)
KV99-Avg          5.35 ± 3%   5.50 ± 0%   ~     (p=0.429 n=4+4)
```

It seems like there is no noticeable performance hit (my explanation is that this commit improves things by pooling some allocations and incurs a hit from the extra bookkeeping and metadata propagation, and the two about cancel each other out).

Member Author

@yuzefovich yuzefovich left a comment

Thanks for the reviews!

bors r+

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 2 stale) (waiting on @ajwerner and @asubiotto)


pkg/sql/distsql_running.go, line 529 at r6 (raw file):

Previously, asubiotto (Alfonso Subiotto Marqués) wrote…

It might make sense to do something similar to ReleaseSetupFlowRequest, where that takes care of checking and releasing fields instead of having to remember to do so.

It seems to me that because of the different lifetimes, it'd be more complicated to do something similar. I'll look into doing some refactoring around the usage of ProducerMetadata this Friday since it seems like we could benefit from pooling those, so I'll merge this as is for now.

craig bot pushed a commit that referenced this pull request Jun 19, 2019
38002: sql: track bytes and rows read unconditionally for every statement r=yuzefovich a=yuzefovich

In order to propagate metrics about bytes and rows read we use
the existing tracing infrastructure, but, instead of using recording
spans, we use recordable ones. All processors get a recordable span,
but only table readers propagate their stats when full tracing is
disabled.

Addresses: #37409.

In order to see the performance hit, I ran `kv` workload with read only queries (I guess it should have the largest hit) and saw a decrease of around 3%. Here is a representative output of 30 seconds of load:
```
With stats:
_elapsed___errors_____ops(total)___ops/sec(cum)__avg(ms)__p50(ms)__p95(ms)__p99(ms)_pMax(ms)__result
   30.0s        0         128368         4278.2      0.2      0.3      0.4      0.6     10.0

Without stats:
_elapsed___errors_____ops(total)___ops/sec(cum)__avg(ms)__p50(ms)__p95(ms)__p99(ms)_pMax(ms)__result
   30.0s        0         131946         4398.1      0.2      0.3      0.4      0.6     10.0 
```

Release note: None

Co-authored-by: Yahor Yuzefovich <[email protected]>
@craig
Contributor

craig bot commented Jun 19, 2019

Build succeeded
