In the mixed file size scenario bandwidth is very underutilized - enterprise setup #5391
Comments
First questions I would love to ask:
Regarding rsync:
Explained in simple words, what you do is: sync big and small files alternated. Right?
Does Bundling work folder by folder, or did you extend bundling to work across folders?
This is the trade-off, yes. (The math is in the "Assumption" section of the issue description below: with 1s of bookkeeping per request and a 5MB/s link, the current approach needs around 57s for the example folder, the optimized one around 50s, and the gap grows with the number of files.)
Look above; I tried to find a definition of a small file.
Bundling works folder-wise. So it will take all files under 10MB (the chunk size) and try to bundle them within the specific folder. However, bundling can run in 3 parallel flows. So you will not leave the folder until you have finished all files there. It still needs a mechanism to allow bigger files to be pumped into one of the flows, to ensure we cover the bandwidth nicely.
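(For readers unfamiliar with the bundling prototype, here is a rough sketch of the folder-wise packing described above, purely for illustration; the names, the exact packing rule, and the 10MB limit as a hard-coded default are assumptions, not the prototype's actual code.)

```cpp
#include <string>
#include <vector>

struct FileEntry {
    std::string name;
    long long size = 0;  // bytes
};

// Illustrative folder-wise packing: files below the chunk size are packed into
// bundles of at most 'bundleLimit' bytes; bigger files get their own request.
std::vector<std::vector<FileEntry>> bundleFolder(const std::vector<FileEntry> &folderFiles,
                                                 long long bundleLimit = 10LL * 1000 * 1000) {
    std::vector<std::vector<FileEntry>> bundles;
    std::vector<FileEntry> current;
    long long currentSize = 0;

    for (const FileEntry &f : folderFiles) {
        if (f.size >= bundleLimit) {
            bundles.push_back({f});  // big file: its own (possibly chunked) upload
            continue;
        }
        if (!current.empty() && currentSize + f.size > bundleLimit) {
            bundles.push_back(current);  // close the current bundle
            current.clear();
            currentSize = 0;
        }
        current.push_back(f);
        currentSize += f.size;
    }
    if (!current.empty())
        bundles.push_back(current);
    return bundles;
}
```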
Using Bundling, I assume the exact same sync times in your scenario.
Please don't mix things up here; this is not about parallel flows, keep them out of it.
@felixboehm Not sure what you mean; bundling still needs bookkeeping. For 100 files it will still be 30s parallelised in 3 flows, and the files will be on the server in 2s, waiting the next 30s for the response. You just lost 30s doing nothing, in terms of bandwidth. Bundling will help with latency on enterprise scale, though.
Bundling would make the server think once (1s) and then send a bundle of 10MB including 100 files in one request. That's what I thought. I would assume that bundling will lead to most requests being of size CHUNK_SIZE, and thus a dynamic chunk size will be able to perfectly utilize the network.
@felixboehm Yes, you just found a case in which Bundling will work like that. This is a very idealised case where all files are inside one folder, not distributed across 10 folders, and where our bundle fits all 100 files. Woboq suggests starting bundling with only 10 files. Furthermore, this is why I said that combined with Bundling it will be even more powerful.
Not even more powerful: in my opinion, with bundling you don't need to alternate file sizes ... Bundling with only 10 files? Why? I expect to always bundle up to the chunk size, also with dynamic chunk sizes.
Files and path names are an illusion; only inodes and blocks exist. Whether you call them chunks or bundles does not matter.
@felixboehm @jnweiger I was just giving an alternative. I am just afraid of very long-running requests with bundling. Each implementation has its pros and cons. With bundling alone, however, you would not achieve full utilization in some cases. Combining the two, you are nearly sure to.
@felixboehm as far as I remember, the idea about alternating file sizes was to avoid starvation of small files while one big video fills all the chunks. What exact strategy is best, only time will tell. The important part is that we implement an API where we can apply tuning parameters, e.g. allow prioritization based on X.Y.Z. Maybe it turns out that a shortest-job-first strategy wins, maybe other users prefer randomized queuing. Maybe it is youngest file first. We should not finalize the strategy now. Same for dynamic chunk sizes. Maybe there is a clever algorithm so that client and server negotiate the perfect chunk size; maybe the client simply ramps up the size until the server chokes, then stays clearly below that limit, with occasional tests to see if smaller or bigger sizes result in improvement.
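(Just to make the "ramp up until the server chokes" idea concrete, here is a minimal sketch of such a controller; the thresholds, names, and the doubling/halving rule are illustrative assumptions, not an agreed design.)

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical chunk-size controller: grow the chunk while uploads stay fast,
// back off when one chunk takes too long (the server "chokes").
class ChunkSizeController {
public:
    int64_t currentSize() const { return _size; }

    void onChunkUploaded(double seconds) {
        if (seconds < _targetSeconds) {
            _size = std::min(_size * 2, _maxSize);  // ramp up
        } else {
            _size = std::max(_size / 2, _minSize);  // stay clearly below the limit
        }
    }

private:
    int64_t _size = 5 * 1000 * 1000;       // start at 5MB (arbitrary)
    int64_t _minSize = 1 * 1000 * 1000;    // never below 1MB
    int64_t _maxSize = 100 * 1000 * 1000;  // never above 100MB
    double _targetSeconds = 60.0;          // "chokes" = one chunk slower than this
};
```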
Well, I think it is the natural expectation that file sync works directory-wise and not in a way that is perceived as somehow arbitrary. Anyway, in most cases the user probably won't even notice or care. What I really like is what @jnweiger said. It would be great to have 'buttons' to play with and run performance tests to see which parameter combinations deliver the best results. This could in the end lead to an intelligent algorithm that adjusts parameters to always deliver the best possible performance for specific scenarios. Anyway, I think real performance tests are needed for this.
Ok, so I consider this and the related PR #5349 an alternative strategy and not needed. Agreed? We need to focus on the existing strategy of bundling (also across folder borders) and dynamic chunk sizes for network utilization, ...
@felixboehm Why are this and #5349 an alternative strategy to Bundling? These are completely separate concepts which could live side by side. I see a conceptual difference between prioritizing items, trying to fill the bandwidth using other files, and bundling, which reduces the influence of latency. I said I was giving an alternative, but bundling does not solve all the problems, and it introduces new ones. It's a trade-off everywhere. Bundling can indeed sometimes fill the bandwidth, but very often it will not, because of bookkeeping. It is a lottery.
BTW, complexity in the code is increased because it is now monolithic around the Propagator and Directory classes. I can propose a new PR which will separate syncing jobs from data transfer jobs. I will create 2 black boxes and then let's discuss.
BTW, do you want a performance/load test showing the context of this issue? Please ask @davidjericho for an account, sync all your private files, and observe. And let's get back to the topic here.
The question here seems to be whether the time needed for bundled transfers is still server-processing dominated (like individual small files are) or whether their runtime is dominated by data-transfer time. If it still is dominated by server processing (and @mrow4a seems to assume it is), having a large-file operation running at the same time could reduce total sync time. Do we have data about this? How long do 100x 1-byte file uploads take compared to a bundled upload of 100 1-byte files?
@ckamm we've been experimenting with this on AARNet for some time, and we're very aware of it due to the nature of our continental network and our spread-out user base. Eliminating any file over 1MB (as we do have a large number of ADSL users in Australia), as of this instant the mean is 2.6s for any given file, assuming time to transfer is not the issue. In further experiments, using our own bundling with disassembly of the bundle on the server in our own PHP code path, we're averaging 30-40 MByte/s of 10kB files over a gigabit link using 100MB bundles, and then calling ownCloud's files scan function to update the filecache table after the files have been placed on disk. We had a huge ingest from a user about an hour ago, which bumped the mean time for any given file to 5.2 seconds, not including transfer-over-the-wire time. In contrast, I routinely upload large files into the system at over 3.2Gbps, which is the present per-thread TLS capability of our layer 7 offload servers. Add to that the cost of TLS round trips over continental-scale latencies, and as far as I'm concerned, awareness of latency, server response time, and anything that can be done to make the service respond quicker to these sorts of queries is incredibly important. It's enough that I have an employee researching this full time so we can figure out how we address it ourselves.
ANY performance test you do will never be precise; it can only give engineers an understanding of how things work. If you start claiming that it improves something in general by showing fancy graphs, you are crazy. If you however test it on someone's setup and show-case it, then you clearly win. However, please mind that every performance test is bound to its test setup! The graph below shows how it behaves on a home-styled service without enterprise improvements, at 1ms latency. In the server PR #5319 I have shown what you can squeeze out there when thinking about server resources. I did a test for the ownCloud conference about that, testing upload.
Number of files per second is meaningless on its own. (Because files are just an illusion...) What really matters is bytes/sec over the wire. The above graph has some relevance, as it was done with a constant file size of 100 bytes per file. 60 files/sec equals 6000 bytes/sec in this case. Do we know the average file size (plus standard deviation of file size) in different user scenarios?
I think we are now off-topic from the issue in the parent post.
@mrow4a Please add details on your test results.
Don't sell me anything without testing! Actually, you proposed at the conference to base design decisions on testing, right?
I did not study the initial setup and scenarios in full. If that counts as off-topic, please split the issue. I don't think that is needed, but it may help to reduce complexity here. It is an abstract case anyway, and everybody wants to make it more concrete, it seems.
I think this discussion belongs in the discussion here: owncloud/core#25760.
This is right! I have further details about bundling in presentations which I can provide, and I can also show-case it on our local setup internally.
As above. However, the topic of this issue is to interleave, in between requests with no payload, requests which have a relatively bigger payload, in order to leverage the bandwidth fully. The feature called bundling is only partly related here. This is why I mentioned that to reproduce the issue above, you should just take any enterprise setup with millions of files/shares and many, many users, and sync a series of small files (100B-5kB) and some bigger files in other folders, or even within the same folder. You should see in any network analyser what I tried to explain here. @ckamm gave a good short version of the problem.
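(To make the reproduction concrete, a tiny throwaway generator along these lines could populate a synced folder; the file counts, the 10MB big-file size and the paths below are arbitrary assumptions, only the 100B-5kB small-file range comes from the comment above.)

```cpp
#include <filesystem>
#include <fstream>
#include <random>
#include <string>

// Throwaway generator for a mixed-size test tree: many small files (100B-5kB,
// as mentioned above) plus a few bigger ones in a separate folder.
int main() {
    namespace fs = std::filesystem;
    std::mt19937 rng(42);
    std::uniform_int_distribution<int> smallSize(100, 5000);

    fs::create_directories("testdata/small");
    fs::create_directories("testdata/big");

    for (int i = 0; i < 1000; ++i) {
        std::ofstream f("testdata/small/file" + std::to_string(i) + ".bin", std::ios::binary);
        f << std::string(smallSize(rng), 'x');
    }
    for (int i = 0; i < 10; ++i) {
        std::ofstream f("testdata/big/file" + std::to_string(i) + ".bin", std::ios::binary);
        f << std::string(10 * 1000 * 1000, 'x');  // ~10MB each
    }
    return 0;
}
```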
@felixboehm I have been looking into the code, also doing #5406, and I think it is possible that using Bundling we can achieve a similar effect to the one discussed in the first post. Bundling just cannot, by default, try to parallelise itself to cover all available flows.
Bundling would already help make the PHP/WebDAV overhead smaller. What I don't understand is why we accept that 5.2 seconds is a normal request processing delay for one file, basically just to write metadata into a database, while Google can search the whole Internet in 0.5 seconds. Is there really nothing we can do there in the server configuration or code? It sounds like writing the metadata into a JSON file per user would be faster than the database at that point.
This is totally true, but this is how the database is currently designed. This is also why EOS is blazing fast: it is namespace-based and stores metadata in a key-value datastore. BTW: for many files/shares/external storages per user, your JSON will take ages.
I don't see anything near 5.2s per file on the server. This is clearly not an assumption to calculate with or optimize for... I strongly dislike alternating file sizes.
My above example is for 1s per file.
Ok! It will probably go in together with HTTP/2.
But I very much like your work, ideas and prototypes!!
Let's get bundling into the client first and then see if alternating file sizes has a performance impact that can be proven. Then scope the possibilities for intelligent algorithms to dynamically set parameters for chunk sizes and/or alternating file sizes.
Setup information
The target server is an enterprise-scale setup, meaning the so-called "1 Byte PUT duration" takes relatively long compared to a home-scale server. It is also a server on which the "100MB PUT duration" for a high-bandwidth client is relatively fast compared to a home-scale server.
Scenario 1
Imagine you are just doing your sync after a longer period of inactivity on one of your devices. You used your web interface, mobile and other devices - e.g. your work computer - to add your files. You have many small files below 1MB to be uploaded/downloaded, many between 1MB and 5MB, and some over 5MB.
Your folder tree is usually structured such that each folder contains files of similar sizes, with a few exceptions. The number of folders containing changes is small. Generalizing: a typical user case.
Scenario 2
Imagine you are just doing your initial sync on one of the sync clients. Imagine that your initial sync folder contains 55000 files and 50GB of data. You have around 20000 files under 1MB and 35000 files over 1MB. This is contained in around 500 folders.
Actual behaviour
If the sync client finds a folder with many >1MB files, everything is fine. However, when the sync client visits a folder with many <1MB files, each small-file PUT takes relatively long and the bandwidth is underutilized, since in the same time a "raw bytes" data transfer of other, bigger files could be performed.
Solution
The solution is a continuation of Request Scheduler PRs on the client: Folder items scheduler - evaluation attribute sorted tree
In the implementation, the Request Scheduler will construct 2 (or 3) separate queues (robins) in each of the folders to synchronise:
Propagator, which is the class responsible for scheduling items, will contain a new value: queueRobin.
Directory, which is the class responsible for dispatching items within a folder that contains changes, will contain a new value: minCurrentFilesSize.
The Request Scheduler will perform a round robin over the queues, visiting them sequentially within one Directory class, as in the sketch below:
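(A minimal sketch of what such a size-class round robin inside one directory could look like. Only Propagator, Directory, queueRobin and the robin idea come from this proposal; the class below, its two-queue split and the 1MB threshold are illustrative assumptions.)

```cpp
#include <deque>
#include <vector>

struct SyncItem {
    long long size = 0;  // file size in bytes; real items also carry path, instruction, etc.
};

// Hypothetical sketch: one queue per size class inside a directory, visited
// round-robin so that small-file bookkeeping and big-file payload can be
// in flight at the same time.
class DirectoryRobin {
public:
    explicit DirectoryRobin(long long smallLimit = 1000 * 1000)  // assumed 1MB threshold
        : _smallLimit(smallLimit) {}

    void enqueue(const SyncItem &item) {
        (item.size < _smallLimit ? _small : _big).push_back(item);
    }

    // Fills 'next' with the next item to dispatch, alternating between the
    // queues; returns false when the directory has nothing left to schedule.
    bool nextItem(SyncItem *next) {
        std::vector<std::deque<SyncItem> *> queues = {&_small, &_big};
        for (size_t i = 0; i < queues.size(); ++i) {
            std::deque<SyncItem> *q = queues[(_robin + i) % queues.size()];
            if (!q->empty()) {
                *next = q->front();
                q->pop_front();
                _robin = (_robin + i + 1) % queues.size();  // advance the robin
                return true;
            }
        }
        return false;
    }

private:
    long long _smallLimit;
    std::deque<SyncItem> _small;  // latency/bookkeeping-bound items
    std::deque<SyncItem> _big;    // bandwidth-bound items
    size_t _robin = 0;            // which queue gets the next slot
};
```

With 3 parallel flows, the Propagator would then keep pulling from such robins, so that at least one in-flight request usually carries real payload.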
Advantages
There would be no difference syncing 1 or 10 files at once.
There would be no difference syncing a significant number of files of the same file size.
Bandwidth will be nicely utilized and sync time reduced, because during the "server bookkeeping time" some bytes of another file will be transferred (in-flight), if the number of files is significant and their file sizes are mixed.
The situation described under "Actual behaviour" above will no longer happen, and other, bigger files will be transferred (if there are any that satisfy the robin, of course).
We can close the issue Broken sync time estimation - enterprise setup #5390, because 1MB files will already give a nice approximation whether they are available or not (if they are available, the transfer of 1MB will show the bandwidth of the client plus bookkeeping; if they are not, it will show bookkeeping only, since 1MB takes under 1s for most desktop link bandwidths out there).
This can be very powerful combined with the Bundling Prototype: Requests bundling feature implementation #5319
This can be very powerful combined with the Dynamic Chunking Prototype: dynamic chunkingNG #5368
This can be very powerful combined with the Priority Scheduling Prototype: Folder items scheduler - evaluation attribute sorted tree #5349
Disadvantages
Assumption
Let's do some maths. This will be our folder to sync:
100 files - average 100kB file size -> 10MB in total to be transferred
10 files - average 10MB file size -> 100MB in total to be transferred
Assume that your network is 5MB/s and one "1 Byte PUT" takes 1s (we are not in the ideal home scenario with an empty server now, guys; it is not that easy, and it could take much, much more).
The request "latency" consists of 2 components: the time it takes for bookkeeping on the server and the time it takes for the data transfer.
Current case
If you do 100 small requests in a row, your data transfer is negligible (~0s) and the bookkeeping time is 100s. Parallelising that, maybe you can achieve 33s with 3 parallel flows.
If you do 10 bigger-file requests in a row, your bookkeeping is 10s. Parallelised, say 4s if you are lucky with 3 parallel flows. However, you cannot omit the 20s coming from the transfer of 100MB; it does not matter whether you have 1000 requests in parallel or 1. Your 5MB/s office network bounds you.
In total you need around 33s plus 24s for the big files, giving 57s.
Optimized case
If you do 100 small requests in a row, your data transfer is negligible (~0s) and the bookkeeping time is 100s. Parallelising that, maybe you can achieve 50s reserving 2 flow slots. In these 50s, you used negligible bandwidth. If you use the 3rd flow to pump in the 30s coming from the big files, you have just synced your files in 50s, since those 30s are done in parallel, filling the bandwidth.
57s vs 50s is roughly a 13% reduction of the original time. In this example it is only around 7s; 100 files is not a big deal, but it shows the bigger picture.
The more big and small files there are, the bigger the percentage difference. Do the math for 55000 files and 55GB, where you have 40000 small files in 10GB (avg. size 250kB) and 15000 files in the remaining 45GB (avg. size 3MB). I think we won't even be talking about minutes there :>
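(A minimal sketch that replays the arithmetic above, using the stated assumptions of 1s bookkeeping per PUT, a 5MB/s link and 3 parallel flows; the even split of requests across flows is a simplification.)

```cpp
#include <algorithm>
#include <iostream>

int main() {
    // Assumptions from the example above: 1s bookkeeping per PUT, 5MB/s link,
    // 3 parallel flows, 100 small files (~0s payload) and 10 big files (100MB total).
    const double bookkeeping = 1.0;  // seconds per request
    const double bandwidth = 5.0;    // MB/s
    const double flows = 3.0;

    // Current case: all flows do the small files first, then all flows do the big files.
    double smallPhase = 100 * bookkeeping / flows;               // ~33s
    double bigPhase = 10 * bookkeeping / flows                   // a few seconds of bookkeeping
                    + 100.0 / bandwidth;                         // + 20s of payload
    double current = smallPhase + bigPhase;                      // ~57s

    // Optimized case: 2 flows do small-file bookkeeping, 1 flow pumps the big files.
    double smallOnTwoFlows = 100 * bookkeeping / 2.0;            // 50s
    double bigOnOneFlow = 10 * bookkeeping + 100.0 / bandwidth;  // 30s, fully overlapped
    double optimized = std::max(smallOnTwoFlows, bigOnOneFlow);  // 50s

    std::cout << "current ~" << current << "s, optimized ~" << optimized << "s\n";
    return 0;
}
```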
Look above; I tried to find a definition of a small file.
@hodyroff @ogoffart @jturcotte @guruz @butonic @felixboehm @DeepDiver1975 @cdamken