jobs: bump default progress log time to 30s #25791

Merged
merged 1 commit into cockroachdb:master from loop-time on May 31, 2018

Conversation

@maddyblue (Contributor) commented May 22, 2018

The previous code allowed progress updates to be performed every 1s, which could
cause the MVCC row to become very large, causing problems with splits. We can
update much more slowly by default. In the case of a small backup job, the 5%
fraction threshold still allows a speedier update rate.

Remove a note that is no longer useful, since the referenced function can now
only be used in the described safe way.

See #25770. Although this change didn't fix that bug, we still think
it's a good idea.

Release note: None
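
To make the throttle concrete, here is a minimal Go sketch of the rule described above, not the actual jobs-package code: a progress write is allowed only once the time threshold (now 30s, previously 1s) has elapsed, or the completed fraction has advanced by at least 5%. The names `timeThreshold`, `fractionThreshold`, and `shouldUpdateProgress` are hypothetical.

```go
package jobs

import "time"

// Hypothetical thresholds mirroring the values discussed in this PR.
const (
	timeThreshold     = 30 * time.Second // bumped from 1s by this change
	fractionThreshold = 0.05             // the 5% fraction threshold
)

// shouldUpdateProgress sketches the throttle: persist a progress update only
// when enough wall time has passed since the last write, or the completed
// fraction has advanced enough to be worth recording.
func shouldUpdateProgress(lastWrite time.Time, lastFrac, newFrac float32, now time.Time) bool {
	if now.Sub(lastWrite) >= timeThreshold {
		return true
	}
	return newFrac-lastFrac >= fractionThreshold
}
```

Under this rule a small, fast backup job still reports frequently, because each chunk tends to push the fraction past the 5% mark well before the 30s timer fires.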

@maddyblue requested review from dt, danhhz, and a team on May 22, 2018 02:21
@cockroach-teamcity (Member)

This change is Reviewable

@nvanbenschoten (Member)

:lgtm:

I'm assuming this effectively fixes the `workload fixtures make` failure from the referenced issue. Did you confirm that it does? If you haven't tried it already, I'd recommend setting --warehouses=15000 to avoid colliding with the existing 10k fixture.


Review status: 0 of 1 files reviewed at latest revision, all discussions resolved.


pkg/sql/jobs/progress.go, line 26 at r1 (raw file):

// progressFractionThreshold.
var (
	progressTimeThreshold             = time.Second * 30

nit: 30 * time.Second
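
The quoted snippet is cut off mid-block. Purely as a hedged reconstruction, the declarations with the nit applied might read as below; the 0.05 value is assumed from the "5% fraction threshold" in the PR description, not taken from the file.

```go
package jobs

import "time"

var (
	// How often a job's progress row may be rewritten (bumped to 30s here).
	progressTimeThreshold = 30 * time.Second
	// Minimum advance in completed fraction that triggers an earlier write;
	// 0.05 is assumed from the description's "5% fraction threshold".
	progressFractionThreshold = 0.05
)
```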


Comments from Reviewable

@maddyblue (Contributor, Author)

@nvanbenschoten How long does it take for that workload command to fail? It isn't failing quickly in my tests so far, where I'm verifying that master fails.

@nvanbenschoten (Member)

It took around 20 minutes to fail before, but I'd let the command finish successfully (~3 hours) before concluding that this fixes the issue completely.

@maddyblue (Contributor, Author)

This didn't appear to fix the problem. I'm going to work on a test that can reproduce this faster.

@nvanbenschoten (Member)

You could try dropping the range size so that the row doesn't need to grow as large to trigger the error.
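
For anyone following along, a sketch of what that suggestion might look like (illustrative only, not a command from this thread): shrink the range size for the table holding the progress row so a split is attempted after far less growth. The connection string, the choice of `system.jobs`, and the `CONFIGURE ZONE USING` syntax (which postdates this 2018 discussion; zone configs were set differently back then) are all assumptions.

```go
package main

import (
	"database/sql"
	"log"

	_ "github.com/lib/pq" // Postgres-wire driver; CockroachDB speaks pgwire
)

func main() {
	// Hypothetical local test cluster; adjust the URL for your setup.
	db, err := sql.Open("postgres", "postgresql://root@localhost:26257/?sslmode=disable")
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()

	// Lower the max range size so the growing jobs progress row forces a
	// split (and hence the failure mode) much sooner than with defaults.
	const stmt = `ALTER TABLE system.jobs CONFIGURE ZONE USING
		range_min_bytes = 1048576, range_max_bytes = 8388608`
	if _, err := db.Exec(stmt); err != nil {
		log.Fatal(err)
	}
	log.Println("system.jobs range size capped at 8 MiB")
}
```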

@maddyblue (Contributor, Author)

I've removed the Fixes line so that this PR no longer closes the original bug, but I still think it's worth merging.

@maddyblue (Contributor, Author)

bors r+

craig bot pushed a commit that referenced this pull request on May 31, 2018
25014: storage: queue requests to push txn / resolve intents on single keys r=spencerkimball a=spencerkimball

Previously, high contention on a single key would cause every thread to
push the same conflicting transaction then resolve the same intent in
parallel. This is inefficient as only one pusher needs to succeed, and
only one resolver needs to resolve the intent, and then only one writer
should proceed while the other readers/writers should in turn wait on
the previous writer by pushing its transaction. This effectively
serializes the conflicting reader/writers.
    
One complication is that all pushers which may have a valid, writing
transaction (i.e., `Transaction.Key != nil`), must push either the
conflicting transaction or another transaction already pushing that
transaction. This allows dependency cycles to be discovered.

Fixes #20448 

25791: jobs: bump default progress log time to 30s r=mjibson a=mjibson

The previous code allowed updates to be performed every 1s, which could
cause the MVCC row to be very large causing problems with splits. We
can update much more slowly by default. In the case of a small backup
job, the 5% fraction threshold will allow a speedier update rate.

Remove a note that's not useful anymore since the referred function
can now only be used in the described safe way.

See #25770. Although this change didn't fix that bug, we still think
it's a good idea.

Release note: None

26293: opt: enable a few distsql logictests r=RaduBerinde a=RaduBerinde

 - `distsql_indexjoin`: this is only a planning test. Modifying the
   split points and queries a bit to make the condition more
   restrictive and make the optimizer choose index joins. There was a
   single plan that was different, and the difference was minor (the
   old planner is emitting an unnecessary column).

 - `distsql_expr`: logic-only test, enabling for opt.

 - `distsql_scrub`: planning test; opt version commented out for now.

Release note: None

Co-authored-by: Spencer Kimball <[email protected]>
Co-authored-by: Matt Jibson <[email protected]>
Co-authored-by: Radu Berinde <[email protected]>
@craig (craig bot, Contributor) commented May 31, 2018

Build succeeded

@craig (bot) merged commit 1faebfa into cockroachdb:master on May 31, 2018
@maddyblue deleted the loop-time branch on May 31, 2018 21:02