Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

roachprod: change scp flags to avoid copy via local host #99283

Merged
merged 1 commit into from
Mar 23, 2023

Conversation

aliher1911
Copy link
Contributor

Previously default scp option made it to copy files via local host. This was causing unnecessary delays and network usage because files were first copied to one node, then distributed from that node using scp between remote locations. Unfortunately the default scp behaviour is to fetch from src and put to dst to avoid authenticating from src to dst.
This commit adds scp options to do direct copy.

Fixes: #99103

Release note: None

Previously default scp option made it to copy files via local host.
This was causing unnecessary delays and network usage because files
were first copied to one node, then distributed from that node using
scp between remote locations. Unfortunately the default scp behaviour
is to fetch from src and put to dst to avoid authenticating from src
to dst.
This commit adds scp options to do direct copy.

Fixes: cockroachdb#99103

Release note: None
@aliher1911 aliher1911 requested a review from tbg March 22, 2023 20:52
@aliher1911 aliher1911 requested a review from a team as a code owner March 22, 2023 20:52
@aliher1911 aliher1911 requested review from herkolategan and smg260 and removed request for a team March 22, 2023 20:52
@cockroach-teamcity
Copy link
Member

This change is Reviewable

@aliher1911
Copy link
Contributor Author

Without fix put to 6 nodes takes 4:30:

10:34:12 cluster_synced.go:1690: 1679394722-01-n6cpu4: putting (dist) /Users/.../go/src/github.com/cockroachdb/cockroach/artifacts/02a019a415c/cockroach ./cockroach-default on nodes [1 2 3 4 5 6]
10:38:45 cluster_synced.go:1888:
10:38:45 cluster_synced.go:1891:    1: done
10:38:45 cluster_synced.go:1891:    2: done
10:38:45 cluster_synced.go:1891:    3: done
10:38:45 cluster_synced.go:1891:    4: done
10:38:45 cluster_synced.go:1891:    5: done
10:38:45 cluster_synced.go:1891:    6: done

With this fix put takes 0:58:

20:42:54 cluster_synced.go:1690: oleg-1679517640-01-n6cpu4: putting (dist) /Users/ali/go/src/github.com/cockroachdb/cockroach/artifacts/2f821d4d752-dirty/cockroach ./cockroach-default on nodes [1 2 3 4 5 6]
20:43:52 cluster_synced.go:1888:
20:43:52 cluster_synced.go:1891:    1: done
20:43:52 cluster_synced.go:1891:    2: done
20:43:52 cluster_synced.go:1891:    3: done
20:43:52 cluster_synced.go:1891:    4: done
20:43:52 cluster_synced.go:1891:    5: done
20:43:52 cluster_synced.go:1891:    6: done

@aliher1911 aliher1911 self-assigned this Mar 22, 2023
@aliher1911
Copy link
Contributor Author

bors r=tbg

@craig
Copy link
Contributor

craig bot commented Mar 23, 2023

Build succeeded:

@craig craig bot merged commit b1f889e into cockroachdb:master Mar 23, 2023
@aliher1911 aliher1911 deleted the fix_scp_traversal branch March 23, 2023 10:42
@renatolabs
Copy link
Contributor

@aliher1911 seems like -R is only available on BSD scp; this is failing on TC agents 😞

We can conditionally add that flag only on OSX/BSD; that should help you with your development flow while keeping behavior on TC as it was before.

@aliher1911
Copy link
Contributor Author

Argh I was suspicious that its not universal and too trigger happy!
Can we rollback?

@renatolabs
Copy link
Contributor

Revert PR: #99419.

AlexTalks added a commit to AlexTalks/cockroach that referenced this pull request Mar 23, 2023
In cockroachdb#99283 the flags `-R -A` were added to the `scp` command invocation,
in order to ensure that copies between two remote hosts are done without
transferring to the localhost. These flags are only supported on MacOS
(`"darwin"`) and not on Linux, as this behavior is already the default
on Linux. This change fixes that issue by only using these flags on
MacOS, allowing the `scp` invocation (and by extension, roachprod) to
work on Linux.

Epic: none

Release note: None
AlexTalks added a commit to AlexTalks/cockroach that referenced this pull request Apr 3, 2023
In cockroachdb#99283 the flags `-R -A` were added to the `scp` command invocation,
in order to ensure that copies between two remote hosts are done without
transferring to the localhost. These flags are only supported on MacOS
(`"darwin"`) and not on Linux, as this behavior is already the default
on Linux. This change fixes that issue by only using these flags on
MacOS, allowing the `scp` invocation (and by extension, roachprod) to
work on Linux.

Epic: none

Release note: None
craig bot pushed a commit that referenced this pull request Apr 3, 2023
99430: roachprod: ensure scp uses supported flags r=AlexTalks a=AlexTalks

In #99283 the flags `-R -A` were added to the `scp` command invocation, in order to ensure that copies between two remote hosts are done without transferring to the localhost. These flags are only supported on MacOS (`"darwin"`) and not on Linux, as this behavior is already the default on Linux. This change fixes that issue by only using these flags on MacOS, allowing the `scp` invocation (and by extension, roachprod) to work on Linux.

Epic: none

Release note: None

99982: clusterversion: clarify where to add new version gates during stability r=rail a=celiala

The previous direction said to always put append version gates to the end
of the list. However, during the stability period, before we mint the
release branch, there are two places to add new gates:

- new gates should go _before_ the "mint" version, if it is being backported
  to the release branch and we have not yet merged the "mint" version to
  the release branch.
- new gates for the upcoming development, should be appended to the end
  of the list.

This direction should be reverted/simplified (to direct to always append) once the commit to "mint" 23.1 has been backported and merged to release-23.1. Tracking issue for this is: https://cockroachlabs.atlassian.net/browse/REL-311

Release note: None
Epic: REL-283

100378: kvserver: fix merge queue test failure due to race r=AlexTalks a=AlexTalks

Previously, we saw `TestMergeQueueDoesNotInterruptReplicationChange` fail on arm64 machines (#99349), which, after testing, was determined to be due to timing issues in the test. As such, this change modifies the test so as to not assume that the snapshot will start within the 100ms time frame used as a buffer, and will instead wait for the snapshot to start before attempting the merge.

Fixes: #99349

Release note: None

Co-authored-by: Alex Sarkesian <[email protected]>
Co-authored-by: Celia La <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

roachprod: don't scp files via local host when doing treedist
4 participants