Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

storage: allow multiple concurrent snapshots #7307

Merged
merged 1 commit into from
Dec 3, 2016
Merged

storage: allow multiple concurrent snapshots #7307

merged 1 commit into from
Dec 3, 2016

Conversation

tamird
Copy link
Contributor

@tamird tamird commented Jun 17, 2016

@cuongdo this is on top of #7299 - I bet the degradation we saw with this change originally was the result of that issue. Can you test it?


This change is Reviewable

@tamird tamird self-assigned this Jun 17, 2016
@petermattis
Copy link
Collaborator

Review status: 0 of 7 files reviewed at latest revision, 1 unresolved discussion, all commit checks successful.


storage/store.go, line 673 [r2] (raw file):

  raftElectionTimeout := time.Duration(sc.RaftElectionTimeoutTicks) * sc.RaftTickInterval
  sc.concurrentSnapshotLimit = 5

Per my earlier comment, let's default this to 1 and perhaps allow an env var to override. I'd like to see concrete numbers showing there is a benefit to raising the number of concurrent snapshots. I worry about allowing multiple concurrent snapshots while we have the existing memory issues with snapshots.


Comments from Reviewable

@tamird
Copy link
Contributor Author

tamird commented Jun 17, 2016

Review status: 0 of 7 files reviewed at latest revision, 1 unresolved discussion, all commit checks successful.


storage/store.go, line 673 [r2] (raw file):

Previously, petermattis (Peter Mattis) wrote…

Per my earlier comment, let's default this to 1 and perhaps allow an env var to override. I'd like to see concrete numbers showing there is a benefit to raising the number of concurrent snapshots. I worry about allowing multiple concurrent snapshots while we have the existing memory issues with snapshots.

Will do.

Comments from Reviewable

@bdarnell
Copy link
Contributor

I'd rather hold off on this change until we demonstrate stability with the current version, especially since I agree with Peter that it is unlikely that allowing more snapshots will make sense until we can move to streaming snapshots.

@cuongdo
Copy link
Contributor

cuongdo commented Jun 17, 2016

Agreed. Let's let the dust settle from the recent Raft transport and
reservations changes first.

On Fri, Jun 17, 2016 at 12:22 PM Ben Darnell [email protected]
wrote:

I'd rather hold off on this change until we demonstrate stability with the
current version, especially since I agree with Peter that it is unlikely
that allowing more snapshots will make sense until we can move to streaming
snapshots.


You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
#7307 (comment),
or mute the thread
https://github.com/notifications/unsubscribe/ABffphosEw8loyw4Eh_17eFbuaOm3uwOks5qMsnXgaJpZM4I4Tmk
.

@tamird
Copy link
Contributor Author

tamird commented Jun 17, 2016

Sure. I've rebased this, so it should be good to go whenever we're ready to start experimenting with it again.


Review status: 0 of 5 files reviewed at latest revision, 1 unresolved discussion, some commit checks pending.


Comments from Reviewable

@tamird
Copy link
Contributor Author

tamird commented Jun 17, 2016

Review status: 0 of 5 files reviewed at latest revision, 1 unresolved discussion, some commit checks pending.


storage/store.go, line 673 [r2] (raw file):

Previously, tamird (Tamir Duberstein) wrote…

Will do.

Done.

Comments from Reviewable

@tamird tamird assigned cuongdo and unassigned tamird Jun 17, 2016
@tamird tamird changed the title Revert "Revert "storage: allow multiple in-flight snapshots"" storage: allow multiple concurrent snapshots Sep 26, 2016
@tamird
Copy link
Contributor Author

tamird commented Sep 26, 2016

Rebased on top of streaming snapshots, much simpler now.

@jordanlewis
Copy link
Member

Is there any way that we can test this change with more than 1 concurrent snapshot before merging? I see that the default is 1, but I'm nervous that we'll see problems with snapshots when we adjust that configuration. I think it would be nice to try to catch potential problems with concurrent snapshots with a test before we merge this.

The code :lgtm:, though.


Reviewed 7 of 7 files at r3.
Review status: all files reviewed at latest revision, 1 unresolved discussion, some commit checks pending.


Comments from Reviewable

@petermattis
Copy link
Collaborator

I like getting rid of the use of atomic here. It would be nice to have evidence that we'll ever set this value larger than 1.


Review status: all files reviewed at latest revision, 1 unresolved discussion, some commit checks pending.


Comments from Reviewable

@tamird
Copy link
Contributor Author

tamird commented Sep 26, 2016

@jordanlewis updated the default to 2 -- I guess it doesn't make much sense to merge this before the default is changed. The tests still pass (though maybe you had some specific tests in mind).

@petermattis indeed -- hard to say if or when we will. Perhaps the chaos monkeys will tell. On the other hand, it may be good to expose concurrency bugs while the streaming snapshots change is fresh in everyone's minds.


Review status: all files reviewed at latest revision, 1 unresolved discussion, all commit checks successful.


Comments from Reviewable

@petermattis
Copy link
Collaborator

Can't we show via some local workload that a value larger than 1 is worthwhile?


Review status: 0 of 2 files reviewed at latest revision, 1 unresolved discussion, all commit checks successful.


Comments from Reviewable

@bdarnell
Copy link
Contributor

The code change LGTM, but I'd rather keep the default at 1 until we can show that this is both safe and beneficial.

@tamird
Copy link
Contributor Author

tamird commented Dec 3, 2016

OK, updated the default to 1.

@petermattis
Copy link
Collaborator

:lgtm:


Review status: 0 of 2 files reviewed at latest revision, 2 unresolved discussions, all commit checks successful.


pkg/storage/store.go, line 778 at r4 (raw file):

	}
	if sc.concurrentSnapshotLimit == 0 {
		sc.concurrentSnapshotLimit = envutil.EnvOrDefaultInt("COCKROACH_CONCURRENT_SNAPSHOT_LIMIT", 1)

Can you mind adding a small comment here: NB: be careful about setting this value higher than 1 as it will degrade client throughput.


Comments from Reviewable

Set the default to 1 for the moment.
@tamird
Copy link
Contributor Author

tamird commented Dec 3, 2016

Review status: 0 of 2 files reviewed at latest revision, 2 unresolved discussions.


pkg/storage/store.go, line 778 at r4 (raw file):

Previously, petermattis (Peter Mattis) wrote…

Can you mind adding a small comment here: NB: be careful about setting this value higher than 1 as it will degrade client throughput.

Done.


Comments from Reviewable

@tamird tamird merged commit 7c959bd into cockroachdb:master Dec 3, 2016
@tamird tamird deleted the raft-multiple-snapshots branch December 3, 2016 22:36
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants