Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

*: Address issues with tracking disk usage as capacity minus availability #17733

Merged
merged 2 commits into from
Aug 18, 2017

Conversation

a-robinson
Copy link
Contributor

See commit descriptions for details. The first commit is important to improving stats-based rebalancing. The second commit isn't necessary, but I got halfway done with it while thinking through the first one so included it anyways. I'm happy to separate it out for later if desired.

Addresses #17691 and #13548. I'll regenerate embedded.go before submitting.

@a-robinson a-robinson requested review from a team August 17, 2017 17:42
@cockroach-teamcity
Copy link
Member

This change is Reviewable

var totalWritesPerSecond float64
bytesPerReplica := make([]float64, 0, capacity.RangeCount)
writesPerReplica := make([]float64, 0, capacity.RangeCount)
newStoreReplicaVisitor(s).Visit(func(r *Replica) bool {
if r.ownsValidLease(now) {
leaseCount++
}
bytesPerReplica = append(bytesPerReplica, float64(r.GetMVCCStats().LiveBytes))
mvccStats := r.GetMVCCStats()
// TODO(DONOTSUBMIT): Which MVCCStats fields should we use for this?
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cc @petermattis for his opinion on what's most accurate here.

@BramGruneir
Copy link
Member

:lgtm:

With one nit and obviously waiting on Peter's response.


Reviewed 9 of 10 files at r1.
Review status: 4 of 21 files reviewed at latest revision, 2 unresolved discussions, some commit checks failed.


pkg/roachpb/metadata.proto, line 115 at r1 (raw file):

  // to another actually removes its bytes from the source store even though
  // RocksDB may not actually reclaim the physical disk space for a while.
  optional int64 logical_bytes = 9 [(gogoproto.nullable) = false];

Is it common practice to not have the index in order in the proto file??


Comments from Reviewable

@BramGruneir
Copy link
Member

Oh, there's a new commit. I only looked at the first one.


Review status: 4 of 21 files reviewed at latest revision, 2 unresolved discussions, some commit checks failed.


Comments from Reviewable

@BramGruneir
Copy link
Member

Still LGTM.


Reviewed 16 of 17 files at r2.
Review status: 19 of 21 files reviewed at latest revision, 2 unresolved discussions.


Comments from Reviewable

@a-robinson
Copy link
Contributor Author

Review status: 19 of 21 files reviewed at latest revision, 2 unresolved discussions, some commit checks pending.


pkg/roachpb/metadata.proto, line 115 at r1 (raw file):

Previously, BramGruneir (Bram Gruneir) wrote…

Is it common practice to not have the index in order in the proto file??

It wasn't uncommon at Google, and while I only checked a couple of our proto files we've done it before for roachpb.Transaction and storage.Header.


Comments from Reviewable

@petermattis
Copy link
Collaborator

:lgtm:


Review status: 19 of 21 files reviewed at latest revision, 2 unresolved discussions, some commit checks failed.


pkg/roachpb/metadata.proto, line 115 at r1 (raw file):

Previously, a-robinson (Alex Robinson) wrote…

It wasn't uncommon at Google, and while I only checked a couple of our proto files we've done it before for roachpb.Transaction and storage.Header.

The proto fields should be grouped in the natural order, ignoring tag numbers. Forcing the fields into tag number order reduces readability. For the most part, the tag numbers aren't something you care about.


pkg/storage/store.go, line 2280 at r2 (raw file):

Previously, a-robinson (Alex Robinson) wrote…

cc @petermattis for his opinion on what's most accurate here.

Depends on the purpose? LiveBytes+SysBytes captures the long term size of data in the range after GC has been performed. KeyBytes+ValBytes+SysBytes captures the instantaneous size of data in the range (ie. it includes historical data that will eventually be GC'd). Note that IntentBytes is included in LiveBytes.

My guess is that KeyBytes+ValBytes+SysBytes is the better choice.


Comments from Reviewable

@tbg
Copy link
Member

tbg commented Aug 17, 2017

:lgtm: and nice, this was an ugly wart!


Reviewed 10 of 10 files at r1, 16 of 17 files at r2, 1 of 1 files at r3.
Review status: all files reviewed at latest revision, 5 unresolved discussions, some commit checks failed.


pkg/roachpb/metadata.go, line 275 at r3 (raw file):

// FractionUsed computes the fraction of storage capacity that is in use.
func (sc StoreCapacity) FractionUsed() float64 {

Where are we still using FractionUsed()? For anything that matters?


pkg/roachpb/metadata.proto, line 115 at r1 (raw file):

Previously, petermattis (Peter Mattis) wrote…

The proto fields should be grouped in the natural order, ignoring tag numbers. Forcing the fields into tag number order reduces readability. For the most part, the tag numbers aren't something you care about.

I'm not aware of a fixed precedent, and while I like to keep them in order, I definitely see how it makes sense the way it is in this diff.


pkg/storage/allocator_scorer.go, line 523 at r1 (raw file):

		if log.V(2) {
			log.Infof(ctx,
				"s%d: should-rebalance(bad-fit): - balanceScore=%s, capacity=(%v), rangeInfo=%+v, "+

What's that lonely - for? Do you mean -(a + b + c)? I also don't understand bad-fit. Similarly the other messages.


pkg/storage/allocator_scorer.go, line 567 at r1 (raw file):

	if log.V(3) {
		log.Infof(ctx,
			"s%d: should-not-rebalance: - balanceScore=%s, capacity=(%v), rangeInfo=%+v, "+

What does capacity print like with %v? It should print like meanDiskUsage, right?


pkg/storage/store.go, line 2280 at r2 (raw file):

Previously, a-robinson (Alex Robinson) wrote…

cc @petermattis for his opinion on what's most accurate here.

I think KeyBytes+ValBytes+SysBytes is the right one, the others ignore historical versions and I don't see a reason for that.
One could argue about SysBytes being included, but it should typically be small, and if it isn't better to have rebalances happening that the user doesn't understand than to go out of balance as a result.


Comments from Reviewable

@petermattis
Copy link
Collaborator

Review status: all files reviewed at latest revision, 5 unresolved discussions, some commit checks failed.


pkg/storage/store.go, line 2280 at r2 (raw file):

Previously, tschottdorf (Tobias Schottdorf) wrote…

I think KeyBytes+ValBytes+SysBytes is the right one, the others ignore historical versions and I don't see a reason for that.
One could argue about SysBytes being included, but it should typically be small, and if it isn't better to have rebalances happening that the user doesn't understand than to go out of balance as a result.

I would definitely include SysBytes. It should be small, but sometimes it isn't.


Comments from Reviewable

@a-robinson
Copy link
Contributor Author

a-robinson commented Aug 17, 2017

TFTRs!


Review status: 2 of 22 files reviewed at latest revision, 5 unresolved discussions.


pkg/roachpb/metadata.go, line 275 at r3 (raw file):

Previously, tschottdorf (Tobias Schottdorf) wrote…

Where are we still using FractionUsed()? For anything that matters?

The max-capacity check that attempts to avoid filling a node past 95% - https://github.com/cockroachdb/cockroach/blob/master/pkg/storage/allocator_scorer.go#L966


pkg/storage/allocator_scorer.go, line 523 at r1 (raw file):

Previously, tschottdorf (Tobias Schottdorf) wrote…

What's that lonely - for? Do you mean -(a + b + c)? I also don't understand bad-fit. Similarly the other messages.

It's not meant to be there, must have either been a mistake or a leftover from some prior version of the format string.

disk-full/bad-fit/etc. are indicating the reason why we should-rebalance.


pkg/storage/allocator_scorer.go, line 567 at r1 (raw file):

Previously, tschottdorf (Tobias Schottdorf) wrote…

What does capacity print like with %v? It should print like meanDiskUsage, right?

It prints as defined by (StoreCapacity).String() in roachpb.metadata.go. I've conventionally used %v for structs for a String method, but can switch to %s if that's our standard.


pkg/storage/store.go, line 2280 at r2 (raw file):

Previously, petermattis (Peter Mattis) wrote…

I would definitely include SysBytes. It should be small, but sometimes it isn't.

Hm, I was going to include SysBytes, but it'd nice be to reuse an existing method and it seems a bit odd that MVCCStats.Total() doesn't include SysBytes. It may just be awkward naming, but adding an additional total-bytes method that returns more than Total would definitely be unintuitive.

It seems to me like either SysBytes should be included in Total (and thus used everywhere) or not used here either. I've switched to just using Total, but let me know if I should do otherwise.


Comments from Reviewable

@tbg
Copy link
Member

tbg commented Aug 17, 2017

Reviewed 20 of 20 files at r4, 17 of 17 files at r5.
Review status: all files reviewed at latest revision, 4 unresolved discussions, some commit checks failed.


pkg/storage/allocator_scorer.go, line 523 at r1 (raw file):

Previously, a-robinson (Alex Robinson) wrote…

It's not meant to be there, must have either been a mistake or a leftover from some prior version of the format string.

disk-full/bad-fit/etc. are indicating the reason why we should-rebalance.

And the + later in the line? :trollface:


pkg/storage/allocator_scorer.go, line 567 at r1 (raw file):

Previously, a-robinson (Alex Robinson) wrote…

It prints as defined by (StoreCapacity).String() in roachpb.metadata.go. I've conventionally used %v for structs for a String method, but can switch to %s if that's our standard.

Sorry, I'm fine with %v, was just curious whether that boils down to humanizeutil.IBytes(...) as well. (not important).


pkg/storage/store.go, line 2280 at r2 (raw file):

Previously, a-robinson (Alex Robinson) wrote…

Hm, I was going to include SysBytes, but it'd be to reuse an existing method and it seems a bit odd that [MVCCStats.Total()'](https://github.com/cockroachdb/cockroach/blob/master/pkg/storage/engine/enginepb/mvcc.go#L24) doesn't include SysBytes. It may just be awkward naming, but adding an additional total-bytes method that returns more than Total` would definitely be unintuitive.

It seems to me like either SysBytes should be included in Total (and thus used everywhere) or not used here either. I've switched to just using Total, but let me know if I should do otherwise.

Hmm, I'm not totally sure, but I'm afraid we're doing something weird in SysBytes. I think we use it for all keys that have the localPrefix, but that in turn contains both data that is local and replicated (various store-local things, but also, for example, the RangeDescriptor and everything else in LocalRangePrefix = roachpb.Key(makeKey(localPrefix, roachpb.RKey("k")))). Some comments in that area suggest that we're intentionally mixing unreplicated and replicated keys for, I guess, performance, so perhaps we're doing everything right -- not obvious to me, though. I suppose the consistency checker would find violations here, so perhaps it's all right.

Either way, ISTM that MVCCStats should really only contain the replicated parts and so we should include SysBytes in total, though there's no pressing need to do so right now. I'll file something to investigate: #17737


Comments from Reviewable

@a-robinson
Copy link
Contributor Author

Review status: all files reviewed at latest revision, 4 unresolved discussions, some commit checks failed.


pkg/storage/allocator_scorer.go, line 567 at r1 (raw file):

Previously, tschottdorf (Tobias Schottdorf) wrote…

Sorry, I'm fine with %v, was just curious whether that boils down to humanizeutil.IBytes(...) as well. (not important).

No, it boils down to (StoreCapacity).String(), which includes much more.


pkg/storage/store.go, line 2280 at r2 (raw file):

Previously, tschottdorf (Tobias Schottdorf) wrote…

Hmm, I'm not totally sure, but I'm afraid we're doing something weird in SysBytes. I think we use it for all keys that have the localPrefix, but that in turn contains both data that is local and replicated (various store-local things, but also, for example, the RangeDescriptor and everything else in LocalRangePrefix = roachpb.Key(makeKey(localPrefix, roachpb.RKey("k")))). Some comments in that area suggest that we're intentionally mixing unreplicated and replicated keys for, I guess, performance, so perhaps we're doing everything right -- not obvious to me, though. I suppose the consistency checker would find violations here, so perhaps it's all right.

Either way, ISTM that MVCCStats should really only contain the replicated parts and so we should include SysBytes in total, though there's no pressing need to do so right now. I'll file something to investigate: #17737

Thanks!


Comments from Reviewable

@a-robinson
Copy link
Contributor Author

So I think this is good to go, but now we don't have a graph in the admin UI that directly corresponds to what we're using to rebalance. We export metrics for KeyBytes and ValBytes, but not for TotalBytes, which is what would be ideal. I don't imagine we have the ability to add two metrics together into a single line in our UI, do we, @mrtracy? Adding a new metric would be redundant (and confusing to anyone trying to interpret the difference between all these fields), but is worth it if we can't graph it otherwise.

@petermattis
Copy link
Collaborator

Feels like the "Live Bytes per Store" graph should be renamed to "Logical Bytes per Store". Seems useful to graph the metric the rebalancer is using.

@a-robinson
Copy link
Contributor Author

Right, but it's not a matter of renaming, it's a matter of having the right data to graph.

@petermattis
Copy link
Collaborator

Ok. Let's get this change merged and file a follow-on issue to change the graph.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants