PUT of large files via oc-client inefficient with standard chunksize in fast networks #4354
cc @dragotin
Comment by @DeepDiver1975:
Thank you for providing these performance test results.
A config value will not help: with the client running on a laptop that moves between different networks, it would be a nightmare to change the setting over and over again. This has to be an adaptive value that depends on the network quality.
Perhaps the best way to calculate what the chunk size should be is to start at the default 5MB and measure how long it takes to transfer each chunk and how long it takes before the next chunk can start. When that ratio gets close to 1, it indicates that the existing chunk size could probably be bigger, so increment a step, observe, [in|de]crement a step, observe, and so on. I'd guess the observation period would be based on the number of chunks, not an amount of time. Perhaps watch the first 6 chunks to get a starting average, then step up by X MB based on the ratio (a closer ratio indicates capacity for a bigger step), watch the next 6, and so on (a rough sketch of this idea follows below). How would the server handle chunks of variable size? Would it even have to worry?
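A rough sketch of the windowed adjustment described above, in C++. The names (`ChunkSizer`, `observe`), the 6-chunk window, the 5MB step, and the 50MB ceiling are illustrative assumptions, not the client's actual API:

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: fed one measurement per finished chunk, it adjusts the
// chunk size once per observation window of six chunks.
class ChunkSizer {
public:
    // transferMs: time the chunk spent on the wire; gapMs: idle time before
    // the next chunk could start (per-chunk request/processing overhead).
    void observe(double transferMs, double gapMs) {
        ratios_.push_back(gapMs / std::max(transferMs, 1.0));
        if (ratios_.size() < kWindow)
            return;
        double avg = 0;
        for (double r : ratios_)
            avg += r;
        avg /= ratios_.size();
        ratios_.clear();
        // Overhead comparable to the transfer time suggests bigger chunks would
        // help; otherwise step back down towards the default.
        if (avg > 0.5)
            size_ = std::min<std::int64_t>(size_ + kStep, kCeiling);
        else
            size_ = std::max<std::int64_t>(size_ - kStep, kFloor);
    }

    std::int64_t currentSize() const { return size_; }

private:
    static constexpr std::size_t kWindow = 6;                   // chunks per observation
    static constexpr std::int64_t kStep = 5 * 1024 * 1024;      // 5MB per adjustment
    static constexpr std::int64_t kFloor = 5 * 1024 * 1024;     // default chunk size
    static constexpr std::int64_t kCeiling = 50 * 1024 * 1024;  // cap, see the comments below
    std::int64_t size_ = kFloor;
    std::vector<double> ratios_;
};
```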
The server can handle dynamic chunk sizes.
I did this process manually for syncing from a VM to our servers with a 10Gbit link between them. One thing to note: I found that after a certain point there was increased processing time server side, so that increasing the chunk size resulted in an ever increasing gap before the process is complete and the client can start sending the next chunk. This is likely due to our storage being an SMB mount and the time it takes to do all the transfers. We ended up settling on 50MB chunks. It would probably be wise to have a ceiling on the variable chunk size to avoid a runaway condition in the calculation.
@DeepDiver1975 does the NFS storage backend used here have any influence?
I did suggest something similar to @DeepDiver1975, but probably in the wrong thread: owncloud/core#9832 (comment)
Would a general usage of bigger chunks have any disadvantages? @scolebrook That's interesting. 50MB was also my best-performing chunk size.
Chunks are stored within the data folder; in a clustered setup the data folder has to be a shared resource among the web servers. So, yes, the shared file system has an influence on chunked upload, but it should not matter to any great extent. One thing to remark about the current chunked implementation on the server side: chunks are fully read into memory before they are written to disk. These issues have been addressed with the new chunking implementation.
@DeepDiver1975 "Chunks are fully read into memory before they are written to disk": really? Not in my case, as it seems. I can watch the chunks grow live in my PHP upload temp directory until they reach the full chunk size, then they are moved away to the user's cache folder, I think. (Observed with `watch -n 0.1 ls -l` on that directory.)
Yes, and that move is the operation which reads them fully into RAM.
I don't know the internals, but in my setup I can verify this behavior: 5MB chunks -> throughput ~13MB/s -> server CPU load ~50% -> server RAM consumption ~470MB
You are right, it's only on assembly that the full chunk is loaded into RAM 🙈
The high server load is possibly a server bug. Not sure why it would occur like that, maybe too much per-chunk overhead?
Yep.
We do not want to increase the default chunk size, because that is very bad for slow networks, and not that many people have such high bandwidth. A good solution would be to adapt the chunk size to the current network speed. As the chunk size has to be constant during the chunked upload of one file, we can only change it for the next file.

So what we suggest: during the upload of the chunks of a certain file we measure the time that takes. If a good percentage of the chunks for that file upload faster than a certain time span, we increase the base value for the chunk size. For the upload of the next file, we use that adapted chunk size (see the sketch below).

In addition to that, we make the chunk size a value that can be set in the config file. That way, deployments can adapt the base value to their site network.

@felixboehm fyi
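A minimal sketch of the per-file adjustment suggested above. The function name, the 10-second target span, the 75%/25% thresholds, and the 50MB ceiling are all assumptions for illustration, not the sync engine's actual logic:

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Given the upload durations (in ms) of all chunks of the file that just
// finished, decide the base chunk size to use for the next file.
std::int64_t nextChunkSize(const std::vector<double>& chunkDurationsMs,
                           std::int64_t currentSize)
{
    constexpr double targetMs = 10000;                   // the "certain time span"
    constexpr std::int64_t minSize = 5 * 1024 * 1024;    // 5MB default
    constexpr std::int64_t maxSize = 50 * 1024 * 1024;   // ceiling against runaway growth
    if (chunkDurationsMs.empty())
        return currentSize;

    // Count the chunks that finished faster than the target time span.
    std::size_t fast = 0;
    for (double ms : chunkDurationsMs)
        if (ms < targetMs)
            ++fast;
    const double fastRatio = static_cast<double>(fast) / chunkDurationsMs.size();

    // "A good percentage" was fast: raise the base value; mostly slow: lower it.
    if (fastRatio > 0.75)
        return std::min(currentSize * 2, maxSize);
    if (fastRatio < 0.25)
        return std::max(currentSize / 2, minSize);
    return currentSize;
}
```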
I'll temporarily change this to 2.1.1: we want the config setting in 2.1.1 and the dynamic scaling in 2.2.
@DeepDiver1975 it's unfortunate that the new chunking also cannot handle changing chunk sizes during the upload. We should rethink that for the new chunking.
@felixboehm @MorrisJobke would it be possible (I think I asked before) to get an access_log file from this specific installation, with transaction time enabled, for an example upload with both chunk sizes? That would be helpful to better understand where the time is lost.
Given that each chunk needs to be read into RAM on the server during assembly, this has implications for the server. I think there needs to be a config.php value so an admin can set a maximum chunk size, which the client can get via the capabilities system. If this value is set on the server, the client then uses it as its ceiling. My thinking is that if you end up with a bunch of clients on a high speed LAN connection all deciding that they want to send 50MB chunks, that could result in the worker processes taking well over double the regular RAM during assembly. On a busy server that could cause a problem. Best to give admins control, as every single deployment will have a different optimal maximum.
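A minimal sketch of the clamping this comment proposes, assuming a hypothetical server-advertised maximum obtained through the capabilities endpoint (neither the capability name nor the config.php key exists in this thread; both are illustrative):

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical: clamp whatever chunk size the client has decided on to the
// ceiling an admin configured server-side and exposed via capabilities.
// A value of 0 or less means the server did not advertise a limit.
std::int64_t effectiveChunkSize(std::int64_t clientChoice, std::int64_t serverMaxChunkSize)
{
    if (serverMaxChunkSize <= 0)
        return clientChoice;
    return std::min(clientChoice, serverMaxChunkSize);
}
```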
@scolebrook I think this is fixed as part of "These issues have been addressed with the new chunking implementation."
@DeepDiver1975 wow, that sounds promising 👍
Added a new "chunkSize" entry in the General group of owncloud.cfg which can be set to the size, in bytes, of the chunks. This allows users with huge bandwidth to select a more optimal chunk size. Issue #4354
In owncloud client 2.1.1 it will be possible to configure the chunk size with the chunkSize entry in owncloud.cfg.
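For example, a 50MB chunk size (the value that performed best in the measurements above) would look like this in owncloud.cfg; the value is in bytes and purely illustrative:

```ini
[General]
chunkSize=52428800
```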
that's cool. Is this a step towards anything, or just a tuning option?
@MTRichards that should resolve the blue ticket for 2.1.1. If the admin does deployments of the ownCloud client on the machines in the network, he/she can adjust that config value. The automatic adjustment is planned for 2.2.
OK, got it. @felixboehm this seems a good answer to quickly get that feature resolved.
Yes, good solution.
@ogoffart will I have an option to set the number of max_parallel uploads in owncloud.cfg too? That would also be nice, I think.
@guruz, @dragotin: will you remember the measured best chunk size, and for how long? You can assume that if the network has not changed, then most likely the network properties have not changed either (at least over a short time); you can adjust it anyway on the next upload. Also, would it not make sense to sort the uploaded files by size and start with the files whose size most closely matches the current chunk size (or the smallest size first)? Third: once this is in place, we can actually measure the impact of this feature using our existing monitoring infrastructure against various ownCloud endpoints (the one presented at CS3).
Thanks to the logs from @felixboehm, here is the conclusion: in the 50MB chunk case, we do 1.66 PUTs per second (at 50MB per chunk that is roughly 83MB/s). But what's more worrying is that it starts at 6 PUTs per second and ends at only 3 PUTs per second, showing that the processing time increases with the number of chunks. (server issue:
Server side PR owncloud/core#22602. That should fix the chunks getting slower over time...
As discussed with Klaas, this seems to be a better compromise. 10MB * 3 parallel jobs = 30MB in memory, and to retry in case of disconnection, which is still reasonable. And it might make the upload almost twice as fast on fast networks where the number of chunks is the bottleneck (because of more server processing). Relates to issue #4354
Changed the default chunk size to 10MB. We will not do the adaptive chunk sizing in 2.2.
Closing this issue.
We will switch to automatic scaling of chunk size with #4019.
Where is the "discussion with Klaas" mentioned in the commit (b685f6b) that changed the default chunk size? I'd be interested to see it and the reasoning for this change. All I see in this issue are comments saying "do not increase the chunk size because of memory issues, and most people don't have fast networks", yet it was still changed. To me this seems like a very small edge case that a few people have been very vocal about.
Why accommodate a very small edge case and make uploading files more problematic for the vast majority? There is already the user-configurable setting for these few users, so why change the default?
@pjrobertson the discussion of this goes back through issue #4354. We were discussing solutions back and forth, and the best solution would be to dynamically adjust the chunk size based on the bandwidth, but that is not so easy. The quality of people's bandwidth and computing power seems to be very different all over the place; that is why it is hard to decide on a default. However, 10MB was the default quite some time ago, then we changed it to 5MB, and now back to 10MB, because we thought that even with limited resources this is OK. Note that this will only affect the upload of files > 10MB; smaller files are not chunked at all but uploaded in one request. Also note that in 2.2.0 we enhanced the upload of small files by dynamically increasing the number of parallel requests, so the experience should be better.
@pjrobertson The article you cite as the source for your speed numbers is from Ookla (aka Speedtest), so it only factors in (A) people who have broadband-type internet and (B) people who are nerdy enough to want to test how fast it is. It is far from a measure of global average internet speeds. But this isn't about internet speeds; it's about network speeds. Very different. The average LAN is 1Gbps for wired connections these days, and server to server is typically 10Gbps. So there's your 100MB/s of upload bandwidth. Even WiFi is several hundred Mbps. What you describe as a very small edge case is in fact problematic for a very large number of people. Keep in mind that one corporate installation of ownCloud has several thousand users behind it. Potentially hundreds of thousands (or more; only ownCloud Inc has the numbers to determine that) of end users are affected by this. The chunk size used to be 10MB; 5MB was a performance regression. Going back is a temporary fix until the client is made smart enough to determine the best chunk size.
Yeah, I was aware of this when citing that source. I assumed that those nerdy enough to test their internet probably have fast internet, so this was an over-estimate of the world's average upload speed. It's probably much lower.
So you're saying that the majority of ownCloud installs run on local networks rather than being accessed over the internet (e.g. corporate companies and large organisations with their own infrastructure). If that is the case then fair enough; my assumption was incorrect.
Fair enough, it seems very hard to determine. The open source nature of ownCloud also makes it hard to determine who installs it and where, so even ownCloud Inc. probably doesn't have enough reliable information on this. If they did, it would be interesting to do an actual analysis of typical infrastructure and installs to estimate client speed. But then again, it's probably more effort than it's worth at this point if variable chunk sizes are coming. The other argument to bear in mind is what has a greater negative effect on user experience: longer upload times for users with fast internet connections, or more upload failures (and timeouts) for users with slow internet connections?
@pjrobertson Don't think about the number of ownCloud installs and what kind forms the majority; think about the number of users. One install of ownCloud may serve 1 user, another may serve 10,000. Should the needs of the 1 person be dismissed? Absolutely not. Should the needs of each of the 10,000 be considered only 1/10000th as important as the needs of the 1? Absolutely not. And that 1 person may be connecting to an instance running on a web server in their basement, in which case, for this issue, they're in the same boat as the 10,000.

The dynamics that are coming are needed because there is no "sweet spot" that will work well for everyone under all circumstances. I want high speed when I'm in the office, but I don't want to have to manually change my settings when I'm connecting to WiFi at an airport or from home so that I get the best performance I can in those locations. Let the computer figure out the right size based on the changing circumstances. You're not in a unique situation being in China and connecting to a server outside of China: about 85% of my company's staff are not on the same continent as the servers providing our ownCloud service.

It would be interesting to determine where the timeouts are occurring. One potential cause is that packets are being dropped. This could be because of QoS rules favouring traffic like VoIP, or perhaps a particular link is overwhelmed with traffic. A traceroute and some ping testing could help determine where the problem lies. Once you know that, you may have options.
Expected behaviour
ownCloud desktop client performs well while uploading large files in a fast network. (~100MB/s)
Actual behaviour
ownCloud desktop client performs badly and produces a high server load with standard parameters
(~10-20MB/s)
If you set the chunk size to a higher value, the transfer rate rises while the server load drops.
(~90-100MB/s)
Steps to reproduce
Server configuration
Operating system:
openSUSE 13.1 (Bottle) (x86_64)
Web server:
Apache/2.4.16
Database:
PostgreSQL 9.4.5
PHP version:
5.6.9
ownCloud version:
8.2.2
Storage backend:
NFS
Client configuration
Client version:
2.1.0 (build 2944)
Operating system:
OS X 10.10.5
OS language:
German
Installation path of client:
/Applications/owncloud.app