
Group requests to ensure balance between bandwidth utilization and bookkeeping #5391 #5390 #4498 #1633 #4454 #5440

Closed
mrow4a wants to merge 2 commits from the group_schedule branch

Conversation

mrow4a
Contributor

@mrow4a mrow4a commented Jan 8, 2017

This fixes issues:
#5391 In the mixed file size scenario bandwidth is very underutilized - enterprise setup
#5390 Broken sync time estimation - enterprise setup
#4498 sync files from smallest first to biggest last
#1633 Propagator: Balance concurrent up/down-load
#4454 Investigate HTTP Pipelining

This includes:
#5400 so if this is merged, the other can be closed

This is basic for:
#5319 Bundling
HTTP2

This can be enhanced by:
#5368 Dynamic Chunking
#5349 Sorting folders by modification time

This requires the following capability on the server side, otherwise it will sync the old way:
[image: selection_018]

Idea

The idea behind the implemented algorithm applies to the scenario in which we have a mix of items within the same or different folders - big files, files below the chunking size (e.g. 100 kB or 5 MB), files to download, moves, deletes, directory creations, etc.:

  • Separate transfer jobs from jobs for which server database interaction is the major cost factor

  • Unconditionally reserve 4-6 flows for "db sensitive" jobs such as small file uploads, and 2 flows for upload/download transfers of files >1 MB

  • Sync normally if no other kinds of files remain

  • Work cross-folder, meaning it will look for "db sensitive" and "data transfer" jobs in all folders.

The above ensures a balance between bandwidth utilization and bookkeeping, which in turn gives faster synchronisation (for details, see below).
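
A minimal sketch of that dispatch rule; JobKind, classify and canStart are hypothetical illustrations, not the actual names introduced in this PR:

    #include <cstdint>

    // Classify a sync item by which resource dominates its cost.
    enum class JobKind { DbSensitive, DataTransfer };

    JobKind classify(std::uint64_t fileSize, std::uint64_t smallFileSize)
    {
        // Small items: server bookkeeping dominates.
        // Big items: raw bandwidth dominates.
        return fileSize <= smallFileSize ? JobKind::DbSensitive
                                         : JobKind::DataTransfer;
    }

    // Keep both lanes busy at once, so neither kind starves the other.
    bool canStart(JobKind kind, int runningDb, int runningTransfer)
    {
        const int dbSlots = 4;       // reserved for small, bookkeeping-heavy requests
        const int transferSlots = 2; // reserved for >1 MB uploads/downloads
        return kind == JobKind::DbSensitive ? runningDb < dbSlots
                                            : runningTransfer < transferSlots;
    }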

Used algorithm:
[image: selection_019]

Checkpoints

  • Cross-folder: look for "db sensitive" and "data transfer" jobs in all folders.
  • Test manually to ensure faster sync - proof of concept passed
  • [x] Test on enterprise setup to ensure faster sync
  • Write unit tests to cover the whole code
  • Test with integration tests

Details

Let's do some math. This will be our folder to sync:
100 files - average 100 kB file size -> total 10 MB to be transferred
10 files - average 10 MB file size -> total 100 MB to be transferred

Assume your network is 5 MB/s and one "1-byte PUT" takes 1 s (we are not in the ideal home scenario with an empty server; it is not that easy, and it could take much more).

The request "latency" consists of 2 components: the time it takes for bookkeeping on the server and the time it takes for the data transfer.

Current case

If you do 100 small requests in a row, your data transfer is negligible (~0 s) and bookkeeping time is 100 s. Parallelising that with 3 parallel flows, you can maybe achieve 33 s.

If you do 10 bigger file requests in a row, your bookkeeping is 10 s - parallelised, say 4 s if you are lucky with 3 parallel flows. However, you cannot avoid the 20 s coming from the transfer of 100 MB; it does not matter whether you have 1000 requests in parallel or 1 - your 5 MB/s office network bounds you.

In total you need around 33 s plus 24 s from the big files, giving 57 s.

Optimized case

If you do 100 small requests in a row, your data transfer is negligible (~0 s) and bookkeeping time is 100 s. Parallelising that over 2 reserved flow slots, you can maybe achieve 50 s. In those 50 s you used negligible bandwidth. If you use the 3rd flow to pump through the 30 s coming from the big files (10 s bookkeeping plus 20 s transfer on a single flow), you just synced your files in 50 s, since the 30 s run in parallel, filling the bandwidth.

57 s vs 50 s is about 13% of the original time - in this example around 7 s. 100 files is not a big deal, but it shows the bigger picture.
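
As a sanity check, this model can be written down directly - a sketch where the constants are just the assumptions of this example, not measured values:

    #include <algorithm>
    #include <cstdio>

    int main()
    {
        const double book = 1.0;      // seconds of bookkeeping per request
        const double bandwidth = 5.0; // MB/s
        const int nSmall = 100, nBig = 10;
        const double mbBig = 100.0;   // small-file transfer time is negligible

        // Current scheduler: both kinds of requests share the same 3 flows,
        // so the phases effectively run one after the other.
        double current = nSmall * book / 3    // ~33 s small bookkeeping
                       + nBig * book / 3      // ~4 s big bookkeeping
                       + mbBig / bandwidth;   // 20 s transfer at full link speed

        // Grouped scheduler: 2 flows reserved for small "db sensitive" requests,
        // the remaining flow pumps big transfers in parallel.
        double grouped = std::max(nSmall * book / 2,                // 50 s
                                  nBig * book + mbBig / bandwidth); // 30 s

        std::printf("current ~%.0f s, grouped ~%.0f s\n", current, grouped);
        return 0;
    }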

The more big files and small files you have, the bigger the percentage difference. Do the math for 55000 files and 55 GB, where you have 40000 small files in 10 GB (avg. size 250 kB) and 15000 files in the remaining 45 GB (avg. size 3 MB). I think we won't be talking about mere minutes there :>

@felixboehm @DeepDiver1975 @davidjericho @cdamken @jnweiger

@mention-bot

@mrow4a, thanks for your PR! By analyzing the history of the files in this pull request, we identified @ogoffart, @ckamm and @dragotin to be potential reviewers.

@mrow4a mrow4a changed the title Group requests to ensure balance between bandwidth utilization and bookkeeping Group requests to ensure balance between bandwidth utilization and bookkeeping #5391 #5390 #4498 #1633 #4454 Jan 8, 2017
/**
* @brief The PropagateUploadBundle class is a container class for upload jobs under chunking size.
*
* It will also ensure proper bandwidth utilization vs bookkeeping balance, and that in case no other items then under chunk uploads are available,
Contributor

I don't understand the detailed meaning of:
"and that in case no other items then under chunk uploads are available,"

@phil-davis
Contributor

I made some minor edit suggestions for comments in #5441
The principle of this seems very useful - for example, I am a long way (in round-trip ping time...) from my server and my upload speed is not so good (1 to 3 Mbps range). I often have cases where there are a few big files and a lot of small files. That often results in sub-optimal total transfer time, because some big files might chug away for a while (3 in parallel), then it gets to a heap of small files and those then have to slowly get created on the server... If we make sure that the upload of lots of small files runs in one or more streams in parallel with big files in other streams, then the things that need server time and the things that need raw transfer time can happen concurrently, and the total elapsed time should be minimized.
👍 for the concept. I am not familiar enough with the overall client design to give a good opinion on the detailed implementation.

@mrow4a mrow4a force-pushed the group_schedule branch 2 times, most recently from 262e800 to 3701299 Compare January 9, 2017 11:07
@mrow4a
Contributor Author

mrow4a commented Jan 9, 2017

OK, it works awesomely; I need to check it at enterprise scale. Remember, if someone wants to test it, one needs the server capability or an environment variable on the client - otherwise it will sync the old way:

    static const auto bundling = qgetenv("OWNCLOUD_BUNDLING");
    if (bundling == "0") return false;
    if (bundling == "1") return true;

    return _capabilities["dav"].toMap()["bundling"].toByteArray() >= "1.0";

@phil-davis Could you rebase and check again? I made a change to be sure that it parallelises correctly for ALL items. It did not previously, so 3 big downloads would block everything if placed in alphabetical order.

Contributor

@ckamm ckamm left a comment

Here's a detailed review.

From a high level point of view I think this kind of request prioritization is a good idea and while I'm not certain about all details, tweaking the parameters later looks relatively easy.

I'm worried about all the code duplication in PropagateFiles relative to PropagateDirectory. Is there really a compelling reason for having a second level of job container, instead of folding this into PropagateDirectory? If there is, maybe a common base class between the two would help reduce the duplication that's going on.

Apart from that, see the detail comments.

if (bundling == "0") return false;
if (bundling == "1") return true;

return _capabilities["dav"].toMap()["bundling"].toByteArray() >= "1.0";
Contributor

Like this you get the typical "2.1" > "10.1" problem. Could this just be an integer?

Contributor Author

I just reproduced what @ogoffart did with chunking - it is nearly a copy-paste from ChunkingNG.

Contributor

Okay, I didn't realize. That doesn't make the comparison any more correct though, even if it does set a precedent for having "N.M" versions in the capabilities (which I wasn't aware of). Then I suggest we merge as-is and fix both in a follow-up.
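
For that follow-up, a numeric comparison would avoid the lexicographic trap; something like this sketch (versionAtLeast is a hypothetical helper, not existing client API):

    #include <QByteArray>
    #include <QList>

    // Compare "major.minor" capability strings numerically, so that
    // "10.1" correctly counts as newer than "2.1".
    static bool versionAtLeast(const QByteArray &value, int wantMajor, int wantMinor)
    {
        const QList<QByteArray> parts = value.split('.');
        if (parts.size() < 2)
            return false;
        bool okMajor = false, okMinor = false;
        const int major = parts.at(0).toInt(&okMajor);
        const int minor = parts.at(1).toInt(&okMinor);
        if (!okMajor || !okMinor)
            return false;
        return major > wantMajor || (major == wantMajor && minor >= wantMinor);
    }

    // usage:
    // return versionAtLeast(_capabilities["dav"].toMap()["bundling"].toByteArray(), 1, 0);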

QScopedPointer<PropagateItemJob> _firstJob;

// e.g: create class which will handle bundled uploads and bandwidth utilization vs bookkeeping balance
QScopedPointer<PropagatorJob> _filesJob;
Contributor

It looks like this could be typed QScopedPointer<PropagateFiles> and you could avoid a bunch of casts.
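
i.e., a sketch of the typed declaration:

    // Concrete type instead of the PropagatorJob base class,
    // so use sites need no casts.
    QScopedPointer<PropagateFiles> _filesJob;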

// Ensure that only new files are inserted into PropagateFiles
if (enableBundledRequests && item->_instruction == CSYNC_INSTRUCTION_NEW) {
// Get PropagateFiles container job
PropagateFiles* filesJob = 0;
Contributor

I'd write this as

auto& filesJob = directories.top().second->_filesJob;
if (!filesJob) {
    filesJob.reset(new PropagateFiles(this));
}

{
// A small filesize item is a file whose transfer time
// typically will be lower than its bookkeeping time.
static uint smallFileSize;
Contributor

I'd prefer explicit = 0 here

Contributor Author

Again, copy from ChunkingNG

if (subJobsIterator.value()->_state == Finished) {
// If this item is finished, remove it from _subJobs as it is not needed anymore
// Note that in this case remove() from QVector will just perform memmove of pointer array items.
PropagatorJob * job = subJobsIterator.value();
Contributor

Please do PropagatorJob* job = subJobsIterator.next(); at the very beginning to clean up all the subJobsIterator.value() calls.
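
i.e., something like this sketch, assuming a Java-style QMutableVectorIterator over _subJobs:

    QMutableVectorIterator<PropagatorJob *> subJobsIterator(_subJobs);
    while (subJobsIterator.hasNext()) {
        PropagatorJob *job = subJobsIterator.next();
        if (job->_state == Finished) {
            // remove() from QVector just memmoves the remaining pointers
            subJobsIterator.remove();
            continue;
        }
        // ... work with job instead of repeated subJobsIterator.value() calls ...
    }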

} else {
// There are no remaining or pending standard jobs in the whole sync
// This also means that _standardJobs is empty
Q_ASSERT(!scheduleNextJobRoutine(_standardJobs));
Contributor

This assert looks dangerous because it can have side effects.
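
The usual fix is to keep the call unconditional and only assert on the result, e.g. (sketch):

    // Q_ASSERT bodies are compiled out in release builds, so calling
    // scheduleNextJobRoutine() inside one would change scheduling
    // behavior between debug and release. Run it unconditionally:
    const bool scheduled = scheduleNextJobRoutine(_standardJobs);
    Q_ASSERT(!scheduled);
    Q_UNUSED(scheduled);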

// This also means that _standardJobs is empty
Q_ASSERT(!scheduleNextJobRoutine(_standardJobs));

// Parallelise itself into more flows flows
Contributor

"flows flows"

Q_ASSERT(job);

// Reduce the global counter of db or standard jobs
if (job->_item->_size <= _propagator->smallFileSize()){
Contributor

This decision of whether something is standard or db is reiterated in a bunch of places. Definitely make a function for it so the conditions don't go out of sync.
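
e.g. a single predicate on the propagator - a hypothetical sketch, the name is not from this PR:

    // One place for the "bookkeeping-bound vs. bandwidth-bound" decision.
    bool OwncloudPropagator::isDbSensitiveJob(const SyncFileItemPtr &item) const
    {
        return item->_size <= smallFileSize();
    }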

return true;
} else {
// This container does not contain any remaining dbJobs
if(_runningNow > 1){
Contributor

minor: A bunch of times in this function spacing is missing: "if(" and "){"

directories.top().second->append(current);
} else {
// Ensure that only new files are inserted into PropagateFiles
if (enableBundledRequests && item->_instruction == CSYNC_INSTRUCTION_NEW) {
Contributor

As far as I can tell this prioritization code has nothing to do with bundling. Why is it gated by the bundling flag?

Contributor Author

It is the basis for dispatching bundles in the future - that is why. You can call this feature bundling 1.0; I can change the name, however. I also provide a flag because this is now cross-folder and some people might not like that, so I give them the chance to sync folder by folder just by changing one capability on the server.

@mrow4a
Contributor Author

mrow4a commented Jan 10, 2017

@ckamm Because PropagateDirectory is much more complicated in terms of the cases it has to handle, I did not want to introduce even more complexity - we would get lost.

Also I wanted to make this "pluggable" so you can easily switch it on and off.

Mind that PropagateFiles is only for new files without any "special" cases - just a normal upload/download, chunked or not. No _firstJob, delete cases, WaitInDirectory, etc.

@mrow4a
Contributor Author

mrow4a commented Jan 23, 2017

Hello,

I changed the structure of this a little and have been testing it on an enterprise setup; details can be found in this document, which is part of my CS3 presentation (@ogoffart @guruz @jturcotte @felixboehm):
SYNC OPTIMIZATIONS - SCHEDULING
sync-optimizations-mrowczynski-scheduling.pdf

Features:

  • Ensured balance between jobs
  • Entirely pluggable, cross-folder, and separate from the PropagateDirectory main logic
  • Only new and updated files are propagated (data transfers)
  • Jobs are created lazily

finalize();
return true;
}
_subJobs.reserve(_totalItems);
Contributor Author

I think I need to get rid of that, since I am wasting memory here, don't I?

bool PropagateFiles::scheduleNewJob(QVector<SyncFileItemPtr> &syncJobs){
// This function is used to schedule new job and lazily create job from sync items
Q_ASSERT(!syncJobs.isEmpty());
const SyncFileItemPtr &item = syncJobs.takeFirst();
Contributor Author

@mrow4a mrow4a Jan 24, 2017

We are planning to use only Qt 5.8, aren't we? Or should I still keep compatibility with Qt 4? (My build is failing because this is only available in Qt 5.)

Member

No, the plan is to use 5.6 everywhere we can, and still use Qt 4 on Linux platforms where we don't ship Qt (unless we manage to bundle Qt). Qt 5.6 is LTS and will receive patches until 2019, while 5.8 only until 2018.

@guruz guruz added this to the 2.4.0 milestone Jan 25, 2017
@mrow4a
Contributor Author

mrow4a commented Jan 25, 2017

OK, I checked with unit tests and they cover the code; everything is passing. Please also mind that this line has to be adjusted before merging - there is a TODO: https://github.com/owncloud/client/pull/5440/files#diff-7e5082f89a138020f2b1d37fc97d17dbR319

@@ -388,6 +395,9 @@ void OwncloudPropagator::start(const SyncFileItemVector& items)
currentDirJob->append(dir);
}
directories.push(qMakePair(item->destination() + "/" , dir));
} else if (enableScheduledRequests
&& (item->_instruction == CSYNC_INSTRUCTION_NEW || item->_instruction == CSYNC_INSTRUCTION_SYNC)) {
filesJob->append(item);
Member

This changes one purpose of the PropagateDirectory structure: update the directory's etag in the database only once child files have been synced properly.

I can't wrap my head around the conditions where this could be an issue, but @ogoffart should know if this could cause concrete problems.

See e.g.

// For new directories we always want to update the etag once

Contributor Author

Thanks for pointing that out; I did not know about this. I don't think it is difficult to resolve, but let @ogoffart take a look.

Contributor

@mrow4a @jturcotte @ogoffart If I'm not mistaken the following is a situation in which that dependency structure could lead to an abort-related problem:

Initial state in db and server:
/  - "etag/1"
/A - "etagA1"
/A/F - "etagF1"

now when someone touches /A/F on the server, the server state becomes

Server state, after touching /A/F
/  - "etag/2"
/A - "etagA2"
/A/F - "etagF2"

but if I understand correctly the propagation dependency graph is

PropagateDirectory (/)
 |- PropagateDirectory (/A)
 |- PropagateFiles
    |- PropagateItem (/A/F)

meaning that /A and /A/F run independently of each other. So it would be possible to completely finish propagating /A before the file transfer /A/F is done. Then aborting the sync run could lead to the db tree

/  - "etag/1"
/A - "etagA2"
/A/F - "etagF1"

This means a follow-up sync would probably not pick up on /A/F being out of date.

Maybe this hints at a second dependency problem: currently local and remote MkDir are FullParallelism - so isn't there a chance that with this change one could run into a case where the client wants to download or upload into a non-existent directory?

Possibly I'm missing something, feel free to point out incorrectness.

Contributor Author

No, it won't upload into a non-existent directory, because the directory structure / conflict resolution etc. are handled before any transfers, and directory deletions are done after: https://github.com/owncloud/client/pull/5440/files#diff-c9731f430e8a29b13deaaf73bc4a4e22R56
https://github.com/owncloud/client/pull/5440/files#diff-20b960bb10cf3c0781a80fb3e5775241R493
https://github.com/owncloud/client/pull/5440/files#diff-7e5082f89a138020f2b1d37fc97d17dbR420

About the etag propagation: I did not look into the problem yet, because we don't yet have smashbox automation for PRs (it is nearly done) and I am busy on the server side before the release.

Contributor

@ckamm ckamm Mar 28, 2017

@mrow4a Okay, I didn't see that. (btw, _runningNow is gone, so this branch doesn't compile - probably was rebased at some point?) In that case the second problem can indeed not happen.

@mrow4a
Contributor Author

mrow4a commented Mar 11, 2017

For LAN sync - https://central.owncloud.org/t/gsoc-2017-fast-lan-sync/6271 - we need this PR. Sync clients need to be "on the same page" and be finished with any surrounding metadata logic so that they can "safely" transfer the data, as in this PR. LAN sync has to operate only on raw PUT/GETs, and this logic has to be abstracted from the "directory sync logic" (basically what this PR is doing). @hodyroff @DeepDiver1975

@guruz
Contributor

guruz commented Mar 23, 2017

@mrow4a: make sure to try interrupting syncs in the middle and then resuming them (also multiple times), and check whether this really still leads to correctly synchronized directories and not to deleted files or files that are not synced.
@mrow4a: please make sure smashbox reliability sync tests pass
@mrow4a: please make sure S3 syncs faster than without the patch

@ogoffart ogoffart removed this from the 2.4.0 milestone Apr 28, 2017
@ogoffart
Contributor

I removed the milestone because this would still need more work, and I don't think it is worth it at all. But I'll leave it open for the sake of discussion.

@ogoffart
Contributor

Closing outdated pull request

@ogoffart ogoffart closed this Oct 30, 2019
@TheOneRing TheOneRing deleted the group_schedule branch December 2, 2019 16:43