Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Stop ReadFromWithConcurrency sending more data than it needs to #537

Merged
merged 1 commit into from
Feb 13, 2023

Conversation

ncw
Copy link
Contributor

@ncw ncw commented Feb 13, 2023

It was discovered that the ReadFrom method for uploading files in
pkg/sftp was sending more data than it needed to.

This was tracked down to the ReadFromWithConcurrency method forgetting
to truncate the packets it was sending to the size Read.

This was giving the remote server more work to do as it was writing
and re-writing parts of a file.

See: rclone/rclone#6763

It was discovered that the ReadFrom method for uploading files in
pkg/sftp was sending more data than it needed to.

This was tracked down to the ReadFromWithConcurrency method forgetting
to truncate the packets it was sending to the size Read.

This was giving the remote server more work to do as it was writing
and re-writing parts of a file.

See: rclone/rclone#6763
@drakkan
Copy link
Collaborator

drakkan commented Feb 13, 2023

Thanks!

@drakkan drakkan merged commit 971c283 into pkg:master Feb 13, 2023
@puellanivis
Copy link
Collaborator

Huh… good catch!

@ncw
Copy link
Contributor Author

ncw commented Feb 14, 2023

Thanks for merging :-)

On a philosophical point - the current implementation directly couples the block size read from the io.Reader to the block size sent over the SFTP packet.

io.Readers are quite entitled not to fill the buffer if they don't have data at that moment (common with network based io.Readers).

From https://pkg.go.dev/io#Reader :

If some data is available but not len(p) bytes, Read conventionally returns what is available instead of waiting for more.

Since SFTP isn't latency sensitive, I'd suggest reading from the io.Reader until the outgoing packet is full. This will decrease the number of packets sent and increase the throughput slightly.

What do you think? Happy to send a PR if you like the idea.

@ncw ncw deleted the fix-readfrom branch February 14, 2023 11:08
ncw added a commit to rclone/rclone that referenced this pull request Feb 14, 2023
The block size for crypt is 64k + a few bytes. The default block size
for sftp is 32k. This means that the blocks for crypt get split over 3
sftp packets two of 32k and one of a few bytes.

However due to a bug in pkg/sftp it was sending 32k instead of just a
few bytes, leading to the 65% slowdown.

This was fixed in the upstream library.

This bug probably affected transfers from over the network sources
also.

Fixes #6763
See: pkg/sftp#537
ncw added a commit to rclone/rclone that referenced this pull request Feb 14, 2023
The block size for crypt is 64k + a few bytes. The default block size
for sftp is 32k. This means that the blocks for crypt get split over 3
sftp packets two of 32k and one of a few bytes.

However due to a bug in pkg/sftp it was sending 32k instead of just a
few bytes, leading to the 65% slowdown.

This was fixed in the upstream library.

This bug probably affected transfers from over the network sources
also.

Fixes #6763
See: pkg/sftp#537
@puellanivis
Copy link
Collaborator

Yeah, I was thinking of this as an alternative to the code here, basically, use io.ReadFull instead. I figure the change wouldn’t be so big at all, so, if you want to spin up a PR, and at least get some benchmark information about if it’s better or not…

Skylion007 pushed a commit to Skylion007/rclone that referenced this pull request Feb 23, 2023
The block size for crypt is 64k + a few bytes. The default block size
for sftp is 32k. This means that the blocks for crypt get split over 3
sftp packets two of 32k and one of a few bytes.

However due to a bug in pkg/sftp it was sending 32k instead of just a
few bytes, leading to the 65% slowdown.

This was fixed in the upstream library.

This bug probably affected transfers from over the network sources
also.

Fixes rclone#6763
See: pkg/sftp#537
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants