Delete SyncFileItems after propagation #5488

Merged: 5 commits into owncloud:2.3 on Jan 25, 2017

Conversation

jturcotte
Member

This increases the number of files that can be synced in one shot.

Issue #5237
This depends on #5400 to work properly

- We unconditionally add the file on startup, but don't check for errors. During sync we check for errors, but only try loading the file if it exists.
- This simulates a ~50k-file sync that can be used to measure memory usage without having to wait for a server.
- This was to catch duplicate emissions for PropagateDirectory, but we don't emit this signal from there anymore.
- This fixes a warning about PropagatorJob not being a registered metatype.
- This reverts commit fe42c1a.
- This will allow us to keep a reference on the items in connected slots.
@jturcotte jturcotte added this to the 2.3.0 milestone Jan 25, 2017
@jturcotte jturcotte requested review from ckamm, ogoffart and guruz January 25, 2017 16:11
@mention-bot

@jturcotte, thanks for your PR! By analyzing the history of the files in this pull request, we identified @ckamm, @dragotin and @ogoffart to be potential reviewers.

void reset();

void appendErrorString( const QString& );
void appendErrorStrings( const QStringList& );
This function is probably no longer in use and should be removed.

{
// FIXME: Why are error counted as warnings? Test.

We only need it to know if we should use the "Problem" status in Folder.

We now delete subjobs as their propagation is complete. This allows us
to also release the item by making sure that nothing else is holding a
reference to it.

Remove the stored SyncFileItemVector from SyncEngine and SyncResult
and instead gather the needed info progressively as each itemCompleted
signal is emitted.

This frees holes on the heap as propagation progresses, allowing many
memory allocations to be served without requesting more virtual memory
from the OS, and preventing memory usage from growing continuously.
@jturcotte jturcotte force-pushed the delete-syncfileitems branch from dfc5d24 to cf5e43b on January 25, 2017 18:20
@jturcotte
Member Author

Done, diff on last commit after force-push:

diff --git a/src/gui/folder.cpp b/src/gui/folder.cpp
index 096eed568..6aa0fe2fb 100644
--- a/src/gui/folder.cpp
+++ b/src/gui/folder.cpp
@@ -709,7 +709,7 @@ void Folder::slotSyncFinished(bool success)
 
     bool syncError = !_syncResult.errorStrings().isEmpty();
     if( syncError ) {
-        qDebug() << "-> SyncEngine finished with ERROR, warn count is" << _syncResult.warnCount();
+        qDebug() << "-> SyncEngine finished with ERROR";
     } else {
         qDebug() << "-> SyncEngine finished without problem.";
     }
@@ -724,8 +724,7 @@ void Folder::slotSyncFinished(bool success)
     } else if (_csyncUnavail) {
         _syncResult.setStatus(SyncResult::Error);
         qDebug() << "  ** csync not available.";
-    } else if( _syncResult.warnCount() > 0 ) {
-        // there have been warnings on the way.
+    } else if( _syncResult.foundFilesNotSynced() ) {
         _syncResult.setStatus(SyncResult::Problem);
     } else if( _definition.paused ) {
         // Maybe the sync was terminated because the user paused the folder
diff --git a/src/libsync/syncresult.cpp b/src/libsync/syncresult.cpp
index 558c03fc5..625ebae2c 100644
--- a/src/libsync/syncresult.cpp
+++ b/src/libsync/syncresult.cpp
@@ -20,7 +20,7 @@ namespace OCC
 
 SyncResult::SyncResult()
     : _status( Undefined )
-    , _warnCount(0)
+    , _foundFilesNotSynced(false)
     , _folderStructureWasChanged(false)
     , _numNewItems(0)
     , _numRemovedItems(0)
@@ -93,16 +93,6 @@ QDateTime SyncResult::syncTime() const
     return _syncTime;
 }
 
-int SyncResult::warnCount() const
-{
-    return _warnCount;
-}
-
-void SyncResult::appendErrorStrings( const QStringList& list )
-{
-    _errors.append(list);
-}
-
 QStringList SyncResult::errorStrings() const
 {
     return _errors;
@@ -136,10 +126,9 @@ QString SyncResult::folder() const
 
 void SyncResult::processCompletedItem(const SyncFileItemPtr &item)
 {
-    // FIXME: Why are error counted as warnings? Test.
     if (Progress::isWarningKind(item->_status)) {
-        // Count all error conditions.
-        _warnCount++;
+        // Count any error conditions, error strings will have priority anyway.
+        _foundFilesNotSynced = true;
     }
 
     if (item->_isDirectory && (item->_instruction == CSYNC_INSTRUCTION_NEW
@@ -191,9 +180,9 @@ void SyncResult::processCompletedItem(const SyncFileItemPtr &item)
                 // nothing.
                 break;
             }
-        } else if( item->_direction == SyncFileItem::None ) { // ignored files counting.
+        } else if( item->_direction == SyncFileItem::None ) {
             if( item->_instruction == CSYNC_INSTRUCTION_IGNORE ) {
-                _warnCount++;
+                _foundFilesNotSynced = true;
             }
         }
     }
diff --git a/src/libsync/syncresult.h b/src/libsync/syncresult.h
index 7d694c141..0c630f737 100644
--- a/src/libsync/syncresult.h
+++ b/src/libsync/syncresult.h
@@ -50,10 +50,8 @@ public:
     void reset();
 
     void    appendErrorString( const QString& );
-    void    appendErrorStrings( const QStringList& );
     QString errorString() const;
     QStringList errorStrings() const;
-    int     warnCount() const;
     void    clearErrors();
 
     void setStatus( Status );
@@ -63,6 +61,7 @@ public:
     void setFolder(const QString& folder);
     QString folder() const;
 
+    bool foundFilesNotSynced() const { return _foundFilesNotSynced; }
     bool folderStructureWasChanged() const { return _folderStructureWasChanged; }
 
     int numNewItems() const { return _numNewItems; }
@@ -90,7 +89,7 @@ private:
      * when the sync tool support this...
      */
     QStringList        _errors;
-    int                _warnCount;
+    bool _foundFilesNotSynced;
     bool _folderStructureWasChanged;
 
     // count new, removed and updated items

@dragotin
Contributor

Do you really want to merge that into 2.3?

@mrow4a
Contributor

mrow4a commented Jan 25, 2017

@jturcotte About deleting jobs: this needs to be done carefully and checked against real upload latencies.

The reason is simple: remove() and delete operations are blocking, so they widen the idle gap between finishing one file and scheduling the next. That transition should be blazingly fast, since this moment is performance critical. Even 50 ms per file means 1000 files over 6 connections become roughly 167 sequential rounds of blocking, which could add up to seconds lost! And that is what the PR #5400 testing with CERN showed. I won't even get into HTTP/2, where we can increase the number of concurrent requests and end up removing from the vector at random positions.

@jturcotte
Member Author

@dragotin Only the last commit is meaty, and the risk is pretty low (mostly bubbleUpResults has its logic changed). It's mostly about making sure that we don't keep SyncFileItemPtrs around anywhere except in the job. The merge request looks scary mainly because of all the harmless SyncFileItem& -> SyncFileItemPtr& changes.

@mrow4a Those operations are negligible next to the disk and sqlite operations. Updating a vector of 10k elements is nothing compared to even a single disk access. I ran this through the profiler and can't see even a single sample spent in the job deletion code.

@mrow4a
Contributor

mrow4a commented Jan 25, 2017

@jturcotte On localhost they come back nearly in order, so you remove from the beginning of the vector; try testing with them arriving out of order. Just saying that this did not work well with HTTP/2. But there is definitely a benefit in memory and in avoiding unnecessary loops.

@jturcotte
Member Author

jturcotte commented Jan 25, 2017

@mrow4a Have a look at this, and note that accessing a byte on disk is going to be roughly 10,000 times slower than accessing a vector in memory:
https://gist.github.com/jboner/2841832

The first access of each cache line also loads the following bytes into the CPU cache, making subsequent accesses even faster. Don't worry about this: even if the vector has 10,000 subjobs, it will still be far faster than accessing your disk (and we do lots of that). We can give you an introduction to linux perf next time you pass by the office; then you can start thinking about more than loops.

@ogoffart
Contributor

+1

@jturcotte jturcotte merged commit ee211d7 into owncloud:2.3 Jan 25, 2017
@jturcotte jturcotte deleted the delete-syncfileitems branch January 25, 2017 22:26