Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

1.7.0-beta3 nightly: serious inconsistency and lost file when remote folder renamed while files uploaded to it #2296

Closed
moscicki opened this issue Oct 10, 2014 · 55 comments
Assignees
Labels
p2-high Escalation, on top of current planning, release blocker Server Involved type:bug

Comments

@moscicki
Copy link
Contributor

This is for the latest 1.7.0-beta3 on Redhat6 nightly but I believe it is version and platform independent and possibly server-independent.

I consider this use-case to be very important, especially for the folders shared between multiple users.

  1. I have normal folder V on the server (7.0.1)
    • folder V has initially 100 files vvv1...vvv100
    • (I also repeated the test with the empty folder initially on the server (see at the end) with similar results)
  2. I started uploading (new versions) of files vvv1...vvv100 into this folder via the sync client.
  3. While the upload is ongoing I rename the folder V -> V2.

What I expected is: all my files uploaded to V2. What I get is a rather messy final state:

  • V2 is empty both on the server and on the client
  • V contains some files only, namely:
    • the files which already got uploaded to V (before rename) eventually got deleted
    • the files which did not get uploaded to V yet and got errors during the first sync, got uploaded to V

So out of 100 files, I was left with 82 files in V, 0 files in V2.

Here is the log of all the actions:
https://gist.github.com/moscicki/5e4b4c8bbfd66fd92958#file-oc-remote-folder-rename-sync-mess

Here is the log of all the actions in case of initially empty folder E on the server (renamed to E2):

https://gist.github.com/moscicki/fc148a5dc6fa3cef9fab#file-oc-remote-folder-rename-mess-empty-remote-folder

NOTES:

I consider this scenario very important if more then one user syncs the same directory at the same time (sharing). However, note that all actions are performed by one only sync client, on a normal oc server folder (not-shared). I have a second client of the same user connected to this repo and passively "watching". The second sync client seems to be completely irrelevant here.

@moscicki
Copy link
Contributor Author

Please also note that we have recently had a similar case from a user (albeit with older server/client versions so I won't bother you with that). But the point is that this is a real problem.

@ogoffart
Copy link
Contributor

I was not able to reproduce the problem here with the last git from 1.7 on linux.

Regarding the step "3.": do you rename on the web interface or on the client's file system?

Maybe you can provide the full log?

@moscicki
Copy link
Contributor Author

Step3: rename remote folder via the web page.

I was able to reproduce it every time.

ogoffart added a commit that referenced this issue Oct 11, 2014
It is possible that we have should_update_etag set to true for files
that we also need to propagate.  In which case we must not write to the DB
too early as this could cause data loss.  (cf: issue #2296)
@ogoffart
Copy link
Contributor

I tried and sometimes it works as expected and sometimes it does not.
One the reason why it might not work is if there was no sync between adding new files and the sync that tries to upate them. This is because we were writing too early to the database to update the permissions. This is now fixed.

But there is still something fishy hapenning sometimes on the server, in which the server reports a 500 Internal Server Error and do not keep properly the file id, which lead that we duplicate some of the files (but we should not loose data)

@dragotin
Copy link
Contributor

This is log output in the server log that happens if there is a 500 server error:

{"app":"PHP","message":"Undefined index: extension at \/home\/kf\/oC\/core\/apps\/files_versions\/lib\/versions.php#286","level":3,"time":"2014-10-11T12:32:59+00:00"}
{"app":"core","message":"unable to rename, file is not writable : files\/plop\/V0\/d-4-4.txt.ocTransferId704783747.part","level":3,"time":"2014-10-11T12:32:59+00:00"}
{"app":"webdav","message":"\\OC\\Files\\Filesystem::rename() failed","level":3,"time":"2014-10-11T12:32:59+00:00"}
{"app":"PHP","message":"unlink(\/home\/kf\/oC\/core\/data\/ogo\/files\/plop\/V0\/d-4-4.txt.ocTransferId704783747.part): No such file or directory at \/home\/kf\/oC\/core\/lib\/private\/files\/storage\/local.php#167","level":3,"time":"2014-10-11T12:32:59+00:00"}
{"app":"webdav","message":"Sabre\\DAV\\Exception: Could not rename part file to final file","level":4,"time":"2014-10-11T12:32:59+00:00"}
{"app":"PHP","message":"filemtime(): stat failed for \/home\/kf\/oC\/core\/data\/ogo\/files\/plop\/V0 at \/home\/kf\/oC\/core\/lib\/private\/files\/storage\/local.php#133","level":3,"time":"2014-10-11T12:32:59+00:00"}
{"app":"remote","message":"An exception occurred while executing 'UPDATE `oc_filecache` SET `mtime` = ?, `storage_mtime`=? WHERE `fileid` = ?':\n\nSQLSTATE[22007]: Invalid datetime format: 1366 Incorrect integer value: '' for column 'mtime' at row 1","level":4,"time":"2014-10-11T12:32:59+00:00"}
{"app":"PHP","message":"Session has been closed - no further changes to the session as allowed at \/home\/kf\/oC\/core\/lib\/private\/session\/memory.php#80","level":3,"time":"2014-10-11T12:32:59+00:00"}

@DeepDiver1975, @PVince81 can you guys pls. help?

@moscicki
Copy link
Contributor Author

Hello,

I don't think that occasional 500 from the server for few files is the main problem for this issue (it is confusing for the user but should not critically affect the client logic).

The upload starts and initially succeeds. Then remote folder is moved so the subsequent PUT requests from the client fail with 404. That's OK because the target path does not exist anymore. [[ At this point I would propose that after few 404s for the PUT you check for existence of the remote parent folder. Like this you can avoid many useless PUT requests in to the remote tree that does not exist anymore. ]]

However, on the next sync run, for some reason you do not detect that the folder was moved. If you correctly detected that you'd then resume PUT requests to the new remote path and everything would be fine even with for those files for which you got 500 before.

BTW. The log entries are confusing: they say "Error downloading...." for the "Up" direction. Could you please fix that? Or does that indicate another, deeper problem?

@PVince81
Copy link
Contributor

I agree with @moscicki.

But also some of the warnings should be looked into like the invalid date format and make such case fail more gracefully without triggering such warnings.

CC @schiesbn for the files_versions warning.

@ogoffart
Copy link
Contributor

The client had a bug (fixed with commit 9b178c5) That led it believe that some file would have been uploaded sucessfully while there was an error in some rare cases. This bug could have triggered the dataloss that moscicki saw. But hopefully it is now fixed and there should no longer be data loss.

However, we cannot detect a move if the fileid are not kept on the server.

@ogoffart
Copy link
Contributor

(Regarding the "Error downloading" message, this string comes from Qt. We are trying to fix Qt to change the wording of the message: https://codereview.qt-project.org/83041/ )

@moscicki
Copy link
Contributor Author

Is this moving issue fixed in the latest nightly? Should I try my test again?

@moscicki
Copy link
Contributor Author

I tried against owncloud7.0.1 and client nightly build 1.7.0-20141020 on MacOSX.

It does not work as expected. Very similar behaviour as originally described (including file loss).
I got 77 files under the folder with original name, and 0 files under the folder with the new name. Same on the server.

I also tried the same test against 1.6.3 -- the exact trace of operations is different but the result equally bad - lost files and full inconsistency.

I guess there is something fundamentally broken here. Could you please check this thoroughly?

@dragotin dragotin added the p2-high Escalation, on top of current planning, release blocker label Oct 20, 2014
ogoffart added a commit that referenced this issue Oct 21, 2014
…ved on the server

Found while investigating #2296

The problem is that we should not remove a directory locally if it contains
modified files.
But the modification time of the directory is not necessarily chaning (so
the instruction of the directory may still be NONE)

We have to move the child_modified test a bit down to be recursive
@ogoffart
Copy link
Contributor

One of the problem is that the server somehow seem to only keep only some of the file id, but many file ids are new

@moscicki
Copy link
Contributor Author

I think in this model you should not use file id for anything but tracking remote moves. Otherwise you will be in trouble...

@ogoffart
Copy link
Contributor

That's exactly what we do.
But because some files are moved and some are not, it leads to some inconsistency.

@moscicki
Copy link
Contributor Author

Same problem (file loss) on 1.7.0-beta4

@ogoffart
Copy link
Contributor

I tought I fixed all the loss file problem.
What are the file that are lost now?

@dragotin
Copy link
Contributor

@moscicki can you again elaborate on this please? 1.7.0 release is coming nearer...

@ogoffart
Copy link
Contributor

Also please provide the client log.

@dragotin
Copy link
Contributor

I was able to reproduce the file loss. I put files f_a.log to f_z.log into a directory and let them sync. After that, I touched all files in the dir source and renamed the directory on server side. The files a_a.log to a_e.log got lost, all others ended up in the old directory called source.

In the server log file I find this entries:

{"app":"PHP","message":"Undefined index: extension at \/home\/kf\/oC\/stable7\/core\/apps\/files_versions\/lib\/versions.php#286","level":3,"time":"2014-10-23T14:42:08+00:00"}
{"app":"PHP","message":"filesize(): stat failed for \/home\/kf\/oC\/stable7\/core\/data\/kf\/files\/source\/f_g.log.ocTransferId146831252.part at \/home\/kf\/oC\/stable7\/core\/lib\/private\/files\/storage\/local.php#117","level":3,"time":"2014-10-23T14:42:09+00:00"}
{"app":"webdav","message":"Sabre\\DAV\\Exception\\BadRequest: expected filesize 2285355 got ","level":4,"time":"2014-10-23T14:42:09+00:00"}

More client log files can be found on s3.

@ogoffart
Copy link
Contributor

The client log show that the files are gone on the server. So it's the server who lost the files.

Note that i was not able to reproduce this problem on my server. (7.0.2 sqlite)

@moscicki
Copy link
Contributor Author

Exactly same thing. I think you are also likely to experience it if you upload the files for the first time (as far as I remember).

kuba

On Oct 23, 2014, at 5:27 PM, Klaas Freitag [email protected] wrote:

I was able to reproduce the file loss. I put files f_a.log to f_z.log into a directory and let them sync. After that, I touched all files in the dir source and renamed the directory on server side. The files a_a.log to a_e.log got lost, all others ended up in the old directory called source.

In the server log file I find this entries:

{"app":"PHP","message":"Undefined index: extension at /home/kf/oC/stable7/core/apps/files_versions/lib/versions.php#286","level":3,"time":"2014-10-23T14:42:08+00:00"}
{"app":"PHP","message":"filesize(): stat failed for /home/kf/oC/stable7/core/data/kf/files/source/f_g.log.ocTransferId146831252.part at /home/kf/oC/stable7/core/lib/private/files/storage/local.php#117","level":3,"time":"2014-10-23T14:42:09+00:00"}
{"app":"webdav","message":"Sabre\DAV\Exception\BadRequest: expected filesize 2285355 got ","level":4,"time":"2014-10-23T14:42:09+00:00"}
More client log files can be found on s3.


Reply to this email directly or view it on GitHub.

@dragotin
Copy link
Contributor

@craigpg who can help here? @DeepDiver1975 or @PVince81 ? Thanks!

@craigpg
Copy link

craigpg commented Oct 23, 2014

Yes, please work with @DeepDiver1975 and @PVince81.

@PVince81
Copy link
Contributor

@moscicki did you rename files in all your test cases while uploading ? Or are you saying that the simplest initial upload is causing the issues ?

@PVince81
Copy link
Contributor

The linked ticket was mostly about downloading and renaming during download, so probably not directly related.

@PVince81
Copy link
Contributor

Trying to reproduce, here are my steps:

  1. Login as user "root"
  2. Create a folder "V"
  3. Setup the sync client, let it download
  4. Close the sync client (to avoid it uploading during the copy)
  5. Copy a bunch of pictures (70) into the local directory "V"
  6. Start the sync client
  7. While it is uploading, rename "V" to "V2" in the web UI
  8. Client shows errors "Error download" (should say "Error uploading", possibly text issue for the remaining files)
  9. Wait for sync client to continue uploading
  10. Check md5sum between local "V" folder with original files

Result: there are now two folders on the server, "V" and "V2"
All files are in "V" but "V2" is empty.
md5sum shows that 5 files are missing in V ...

@ogoffart
Copy link
Contributor

Regarding the wrong text in the error message ("downloading"), this is a bug in Qt: https://codereview.qt-project.org/83041/

@PVince81
Copy link
Contributor

Let's have a look at the access log:

127.0.0.1 - "" [23/Oct/2014:18:12:51 +0200] "GET /owncloud/status.php HTTP/1.1" 200 81 "-" "Mozilla/5.0 (Linux) mirall/1.7.0"
127.0.0.1 - root [23/Oct/2014:18:12:52 +0200] "PROPFIND /owncloud/remote.php/webdav/ HTTP/1.1" 207 443 "-" "Mozilla/5.0 (Linux) mirall/1.7.0"
127.0.0.1 - root [23/Oct/2014:18:12:52 +0200] "PROPFIND /owncloud/remote.php/webdav/ HTTP/1.1" 207 481 "-" "Mozilla/5.0 (Linux) mirall/1.7.0"
127.0.0.1 - - [23/Oct/2014:18:12:52 +0200] "PROPFIND /owncloud/remote.php/webdav HTTP/1.1" 207 1657 "-" "Mozilla/5.0 (Linux) csyncoC/0.91.5 neon/0.30.0"
127.0.0.1 - root [23/Oct/2014:18:12:52 +0200] "PUT /owncloud/remote.php/webdav/V/1368555495004.jpg HTTP/1.1" 201 - "-" "Mozilla/5.0 (Linux) mirall/1.7.0"
127.0.0.1 - root [23/Oct/2014:18:12:52 +0200] "PUT /owncloud/remote.php/webdav/V/1391249528060.jpg HTTP/1.1" 201 - "-" "Mozilla/5.0 (Linux) mirall/1.7.0"
127.0.0.1 - root [23/Oct/2014:18:12:52 +0200] "PUT /owncloud/remote.php/webdav/V/20110610_005.jpg HTTP/1.1" 201 - "-" "Mozilla/5.0 (Linux) mirall/1.7.0"
127.0.0.1 - root [23/Oct/2014:18:12:53 +0200] "PUT /owncloud/remote.php/webdav/V/20130317_001.jpg HTTP/1.1" 201 - "-" "Mozilla/5.0 (Linux) mirall/1.7.0"
127.0.0.1 - root [23/Oct/2014:18:12:54 +0200] "PUT /owncloud/remote.php/webdav/V/DSC05207.JPG HTTP/1.1" 201 - "-" "Mozilla/5.0 (Linux) mirall/1.7.0"
127.0.0.1 - root [23/Oct/2014:18:12:55 +0200] "PUT /owncloud/remote.php/webdav/V/IMAG0100.jpg HTTP/1.1" 500 - "-" "Mozilla/5.0 (Linux) mirall/1.7.0"
127.0.0.1 - root [23/Oct/2014:18:12:54 +0200] "PUT /owncloud/remote.php/webdav/V/DSC_0158.JPG HTTP/1.1" 500 - "-" "Mozilla/5.0 (Linux) mirall/1.7.0"
127.0.0.1 - root [23/Oct/2014:18:12:56 +0200] "PUT /owncloud/remote.php/webdav/V/IMAG0156.jpg HTTP/1.1" 404 279 "-" "Mozilla/5.0 (Linux) mirall/1.7.0"
127.0.0.1 - root [23/Oct/2014:18:12:57 +0200] "PUT /owncloud/remote.php/webdav/V/IMAG0332.jpg HTTP/1.1" 404 279 "-" "Mozilla/5.0 (Linux) mirall/1.7.0"
127.0.0.1 - root [23/Oct/2014:18:12:55 +0200] "PUT /owncloud/remote.php/webdav/V/Huoguo.jpg HTTP/1.1" 500 - "-" "Mozilla/5.0 (Linux) mirall/1.7.0"
127.0.0.1 - root [23/Oct/2014:18:12:57 +0200] "PUT /owncloud/remote.php/webdav/V/IMAG0392.jpg HTTP/1.1" 404 279 "-" "Mozilla/5.0 (Linux) mirall/1.7.0"
127.0.0.1 - root [23/Oct/2014:18:12:56 +0200] "PUT /owncloud/remote.php/webdav/V/IMAG0279.jpg HTTP/1.1" 404 279 "-" "Mozilla/5.0 (Linux) mirall/1.7.0"
127.0.0.1 - root [23/Oct/2014:18:12:57 +0200] "PUT /owncloud/remote.php/webdav/V/IMAG0467.jpg HTTP/1.1" 404 279 "-" "Mozilla/5.0 (Linux) mirall/1.7.0"
127.0.0.1 - - [23/Oct/2014:18:12:56 +0200] "GET /owncloud/index.php/apps/files/ajax/rename.php?dir=%2F&newname=V2&file=V HTTP/1.1" 200 272 "http://localhost/owncloud/index.php/apps/files/" "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537
.36 (KHTML, like Gecko) Chrome/37.0.2062.94 Safari/537.36"

After the rename all subsequent accesses get 404:

127.0.0.1 - root [23/Oct/2014:18:12:57 +0200] "PUT /owncloud/remote.php/webdav/V/IMG_20130910_071929.jpg HTTP/1.1" 404 279 "-" "Mozilla/5.0 (Linux) mirall/1.7.0"
127.0.0.1 - root [23/Oct/2014:18:12:57 +0200] "PUT /owncloud/remote.php/webdav/V/IMG_20131019_144636.jpg HTTP/1.1" 404 279 "-" "Mozilla/5.0 (Linux) mirall/1.7.0"
127.0.0.1 - root [23/Oct/2014:18:12:57 +0200] "PUT /owncloud/remote.php/webdav/V/IMG_20131022_202138.jpg HTTP/1.1" 404 279 "-" "Mozilla/5.0 (Linux) mirall/1.7.0"
127.0.0.1 - root [23/Oct/2014:18:12:57 +0200] "PUT /owncloud/remote.php/webdav/V/IMG_20131019_144640.jpg HTTP/1.1" 404 279 "-" "Mozilla/5.0 (Linux) mirall/1.7.0"
127.0.0.1 - root [23/Oct/2014:18:12:57 +0200] "PUT /owncloud/remote.php/webdav/V/IMG_20131101_135241.jpg HTTP/1.1" 404 279 "-" "Mozilla/5.0 (Linux) mirall/1.7.0"
127.0.0.1 - root [23/Oct/2014:18:12:57 +0200] "PUT /owncloud/remote.php/webdav/V/IMG_20131208_142211.jpg HTTP/1.1" 404 279 "-" "Mozilla/5.0 (Linux) mirall/1.7.0"
127.0.0.1 - root [23/Oct/2014:18:12:57 +0200] "PUT /owncloud/remote.php/webdav/V/IMG_20131210_204142.jpg HTTP/1.1" 404 279 "-" "Mozilla/5.0 (Linux) mirall/1.7.0"
127.0.0.1 - root [23/Oct/2014:18:12:57 +0200] "PUT /owncloud/remote.php/webdav/V/P1040012.JPG HTTP/1.1" 404 279 "-" "Mozilla/5.0 (Linux) mirall/1.7.0"
127.0.0.1 - root [23/Oct/2014:18:12:57 +0200] "PUT /owncloud/remote.php/webdav/V/P1040013.JPG HTTP/1.1" 404 279 "-" "Mozilla/5.0 (Linux) mirall/1.7.0"
...
127.0.0.1 - root [23/Oct/2014:18:13:00 +0200] "PUT /owncloud/remote.php/webdav/V/P1070103.JPG HTTP/1.1" 404 279 "-" "Mozilla/5.0 (Linux) mirall/1.7.0"
127.0.0.1 - root [23/Oct/2014:18:13:00 +0200] "PUT /owncloud/remote.php/webdav/V/dsc05535.jpg HTTP/1.1" 404 279 "-" "Mozilla/5.0 (Linux) mirall/1.7.0"
127.0.0.1 - root [23/Oct/2014:18:13:22 +0200] "PROPFIND /owncloud/remote.php/webdav/ HTTP/1.1" 207 481 "-" "Mozilla/5.0 (Linux) mirall/1.7.0"

And here I see that the sync client is doing a MKCOL to recreate the "V" folder and then upload the files there:

127.0.0.1 - - [23/Oct/2014:18:13:32 +0200] "PROPFIND /owncloud/remote.php/webdav HTTP/1.1" 207 1658 "-" "Mozilla/5.0 (Linux) csyncoC/0.91.5 neon/0.30.0"
127.0.0.1 - - [23/Oct/2014:18:13:32 +0200] "PROPFIND /owncloud/remote.php/webdav/V2 HTTP/1.1" 207 656 "-" "Mozilla/5.0 (Linux) csyncoC/0.91.5 neon/0.30.0"
127.0.0.1 - - [23/Oct/2014:18:13:32 +0200] "MKCOL /owncloud/remote.php/webdav/V/ HTTP/1.1" 201 - "-" "Mozilla/5.0 (Linux) csyncoC/0.91.5 neon/0.30.0"
127.0.0.1 - root [23/Oct/2014:18:13:33 +0200] "PUT /owncloud/remote.php/webdav/V/Huoguo.jpg HTTP/1.1" 201 - "-" "Mozilla/5.0 (Linux) mirall/1.7.0"
127.0.0.1 - root [23/Oct/2014:18:13:33 +0200] "PUT /owncloud/remote.php/webdav/V/IMAG0100.jpg HTTP/1.1" 201 - "-" "Mozilla/5.0 (Linux) mirall/1.7.0"
127.0.0.1 - root [23/Oct/2014:18:13:33 +0200] "PUT /owncloud/remote.php/webdav/V/DSC_0158.JPG HTTP/1.1" 201 - "-" "Mozilla/5.0 (Linux) mirall/1.7.0"
127.0.0.1 - root [23/Oct/2014:18:13:34 +0200] "PUT /owncloud/remote.php/webdav/V/IMAG0156.jpg HTTP/1.1" 201 - "-" "Mozilla/5.0 (Linux) mirall/1.7.0"
127.0.0.1 - root [23/Oct/2014:18:13:34 +0200] "PUT /owncloud/remote.php/webdav/V/IMAG0279.jpg HTTP/1.1" 201 - "-" "Mozilla/5.0 (Linux) mirall/1.7.0"

Not sure, but @dragotin told me on IRC that second MKCOL shouldn't happen. The sync client should find out that "V" was renamed to "V2" and use that instead. Maybe there's a bug in the order in which the discovery happens ?

@PVince81
Copy link
Contributor

I'll do another test and check whether the etag of "V" and "V2" differ before and after the rename.

@PVince81
Copy link
Contributor

Ok, after renaming and before the second sync cycle, the etag has changed.
Now to find out why...

@PVince81
Copy link
Contributor

Third test run with Y->Y2:

I put a breakpoint in put() and in rename() to make it possible to control the flow better (it blocks all connections).
After letting 5 puts go, I renamed the folder in the UI.
Then I let the next puts go (which failed).

In that test run the etag of the folder did NOT change.

This time the sync client could detect the rename and correctly uploaded all files into Y2.

So it looks like if the flow is "perfect" in a way that the rename doesn't disturb any of the uploads, then the etag will stay the same.

I'll do another run to confirm.
The etag change observed previously is very likely to be due to a race condition.

@PVince81
Copy link
Contributor

Ok, confirmed: when the flow is controlled and put() does not run in parallel to rename(), the etag stays the same and everything works as expected.

Next step: find the part of put() and rename() that are likely to clash.

@ogoffart
Copy link
Contributor

that second MKCOL shouldn't happen. The sync client should find out that "V" was renamed to "V2" and use that instead.

The problem is that the fileid of V2 is no longer the fileid that V used to have (at least in the log i've seen)

@PVince81
Copy link
Contributor

@ogoffart never mind, we're past that already 😉
The etag changes due to a race condition.

I need to dig into put() and rename() and find what could cause an etag change after the rename when they both run simultaneously.

@PVince81
Copy link
Contributor

I'm observing a behavior I cannot explain. I added debug logging in many places and what I see is that:
Req1 for uploading file1.jpg
Req2 for uploading file2.jpg
Req1: puts file into C/file1.jpg
Req2: puts file into C/file2.jpg
Req1: updates cache for C/file1.jpg
Req2: updates cache for C/file2.jpg
Req3 for renaming C to C2
Req3: updates cache for C to be called C2
Req3: also updates the cache for C2/file1.jpg <= this I can't explain
Req3: also updates the cache for C2/file2.jpg <= this I can't explain

For some unknown reason, the rename operation not only updates the cache entry of the folder (for the renaming + etag change) but it ALSO updates the cache entry for every file that is already correctly uploaded into that directory. It somehow seems that the "parent" property has changed.
But normally renaming should not change the file id.

This strange update only happens if the upload process is running (race condition). If I rename the folder without uploading anything, only a single cache update is done.

Needs further research.

@PVince81
Copy link
Contributor

I think I'm starting to understand what is happening.

I did another test run and I observed that the rename operation that renames E to E2 will ALSO run scanFile on E2, and that will trigger the etag update.

In a normal rename operation, the scanner is NEVER triggered.

So the question is, why would the scanner be triggered in this case ? One possible explanation is simply that the mtime of the folder has changed. If a separate process puts a file into that folder it will automatically change the mtime of the folder on the FS-level.

A) Normally, if everything runs sequentially instead of in parallel, this is the upload process:

  1. File "test.jpg" is uploaded, code reaches put() in the sabre connector
  2. File data is stored into a part file "E/test.jpg.part"
  3. Part file size is compared against "content-length" header
  4. ...
  5. Part file "E/test.jpg.part" is renamed to final file name "E/test.jpg" (this changes the mtime of "E" on FS level)
  6. Cache entry for "E/test.jpg" is updated, new etag/mtime, etc
  7. Cache entry for "E" is updated, new etag/mtime from 5)

B) And when a rename is performed separately:

  1. Some checks are done to see whether "E" exists
  2. "E" is renamed to "E2", cache metadata is NOT changed, only the file name (etag/mtime stay the same)

If we mix both, the initial checks (file_exists) will trigger code that notices that the FS mtime has changed, and this will trigger a file scan. If at this point the FS mtime of the folder "E" was not updated yet by step 7) from the upload, then the mtime is different and the code considers this as a remote modification: trigger the file scanner.

So if the check B) 1) happens between A) 5) and A) 7) it will wrongly trigger the file scanner...

I'm still speculating and will dive even deeper to confirm this.

@PVince81
Copy link
Contributor

@icewind1991 would it make sense to prevent folder scanning during a folder rename operation ?

@PVince81
Copy link
Contributor

So, I debugged the sabre connector's put() alone and observed the parent folder's mtime + cache.
This is what I observed:

  1. put() is called for the file "G/test.jpg"
  2. file_put_contents() is called to store the content into a part file "G/test.jpg.part": this changes the mtime of "G" for BOTH the file cache and FS. (I haven't debugged inside this)
  3. rename("G/test.jpg.part", "G/test.jpg"): this changes the mtime of "G" for BOTH the file cache and FS
  4. touch("G/test.jpg", client_mtime) which sets the mtime of the file to the one sent through headers. At this point, "G" is left unchanged.

So from what I see the mtime of "G" is changed twice, once when storing the part file and once when renaming it. The FS mtime values always matched the ones in the cache.

"file_put_contents()" seems to trigger the file scanner (seen in my logs) to update the cache's mtime/etag based on the FS values. Not sure yet how it updates the parent's cache value.

But I suspect that if there's a slow operation there, there might be a small time window where the FS mtime and cache mtime are not the same.

Next step: debug into "file_put_contents()"

@PVince81
Copy link
Contributor

I debugged inside "file_put_contents()". There is a very short time window where FS mtime and cache mtime differ. But that window is so short that I doubt it is causing the issue I was able to reproduce easily several times.

I'll add more debug statements and hope to find more clues...

@PVince81
Copy link
Contributor

How is that even possible ? The file id of the folder changes after paralell upload+rename:
Before:

+--------+---------+---------+----------------------------------+--------+------+----------+----------+------+------------+---------------+-----------+------------------+---------------+-------------+
| fileid | storage | path    | path_hash                        | parent | name | mimetype | mimepart | size | mtime      | storage_mtime | encrypted | unencrypted_size | etag          | permissions |
+--------+---------+---------+----------------------------------+--------+------+----------+----------+------+------------+---------------+-----------+------------------+---------------+-------------+
|   1983 |       4 | files/H | dc2a3ba5ad4e1f95b3c42206b4441b7b |    672 | H    |        2 |        1 |    0 | 1414094714 |    1414094714 |         0 |                0 | 54495f7a21602 |          31 |
+--------+---------+---------+----------------------------------+--------+------+----------+----------+------+------------+---------------+-----------+------------------+---------------+-------------+

After:

+--------+---------+----------+----------------------------------+--------+------+----------+----------+------+------------+---------------+-----------+------------------+---------------+-------------+
| fileid | storage | path     | path_hash                        | parent | name | mimetype | mimepart | size | mtime      | storage_mtime | encrypted | unencrypted_size | etag          | permissions |
+--------+---------+----------+----------------------------------+--------+------+----------+----------+------+------------+---------------+-----------+------------------+---------------+-------------+
|   1994 |       4 | files/H2 | 80a540e5f8fcbf9c41bfba1909fdf6ed |    672 | H2   |        2 |        1 |    0 | 1414094991 |    1414094991 |         0 |                0 | 5449609077378 |          31 |
+--------+---------+----------+----------------------------------+--------+------+----------+----------+------+------------+---------------+-----------+------------------+---------------+-------------+

I understand that the mtime or etag changes, but why the fileid ?!

@moscicki
Copy link
Contributor Author

@PVince81 : that's pretty hardcore debugging -- sorry I cannot help out much in php.
However, I can suggest that in case of PUT into inexistent collection you should probably return 409 (Conflict) -- according to the webdav spec.

@PVince81
Copy link
Contributor

Here's a piece of the log, starting with the rename operation (request 544960902fa20):

// This is the rename request:
{"reqId":"544960902fa20","app":"core","message":"rename begin: \"\/H\" => \"\/H2","level":0,"time":"2014-10-23T20:09:52+00:00","method":"GET","url":"\/owncloud\/index.php\/apps\/files\/ajax\/rename.php?dir=%2F&newname=H2&file=H"}

// This is another request from an upload which updates the mtime of folder "H", probably from file_put_contents()
{"reqId":"5449608f9da1b","app":"cache","message":"Update cache entry for : \"1983\": {\"mtime\":1414094992,\"etag\":\"5449609031847\"}","level":0,"time":"2014-10-23T20:09:52+00:00","method":"PUT","url":"\/owncloud\/remote.php\/webdav\/H\/IMAG0279.jpg"}

{"reqId":"5449608f9da1b","app":"core","message":"Scanning file: \"files\/H\/IMAG0279.jpg\"","level":0,"time":"2014-10-23T20:09:52+00:00","method":"PUT","url":"\/owncloud\/remote.php\/webdav\/H\/IMAG0279.jpg"}
{"reqId":"5449608f9da1b","app":"OC\\Files\\Cache\\Scanner","message":"!!! Path 'files\/H\/IMAG0279.jpg' is not readable !!!","level":0,"time":"2014-10-23T20:09:52+00:00","method":"PUT","url":"\/owncloud\/remote.php\/webdav\/H\/IMAG0279.jpg"}
{"reqId":"5449608f9da1b","app":"core","message":"Scanning file: \"files\/H\/IMAG0279.jpg\" end with data null","level":0,"time":"2014-10-23T20:09:52+00:00","method":"PUT","url":"\/owncloud\/remote.php\/webdav\/H\/IMAG0279.jpg"}
{"reqId":"5449608f9da1b","app":"cache","message":"Update cache entry for : \"1983\": {\"size\":14577457}","level":0,"time":"2014-10-23T20:09:52+00:00","method":"PUT","url":"\/owncloud\/remote.php\/webdav\/H\/IMAG0279.jpg"}

// note 672 is the /files entry
{"reqId":"5449608f9da1b","app":"cache","message":"Update cache entry for : \"672\": {\"size\":14577457}","level":0,"time":"2014-10-23T20:09:52+00:00","method":"PUT","url":"\/owncloud\/remote.php\/webdav\/H\/IMAG0279.jpg"}

// yet another upload request doing its scanning
{"reqId":"54496090729e8","app":"core","message":"Scanning file: \"files\"","level":0,"time":"2014-10-23T20:09:52+00:00","method":"PUT","url":"\/owncloud\/remote.php\/webdav\/H\/IMAG0392.jpg"}
{"reqId":"54496090729e8","app":"OC\\Files\\Cache\\Scanner","message":"!!! No reuse of etag for 'files' !!! \ncache: Array\n(\n    [fileid] => 672\n    [storage] => home::root\n    [path] => files\n    [parent] => 670\n    [name] => files\n    [mimetype] => httpd\/unix-directory\n    [mimepart] => httpd\n    [size] => 14577457\n    [mtime] => 1414094992\n    [storage_mtime] => 1414094714\n    [encrypted] => \n    [unencrypted_size] => 0\n    [etag] => 5449609028cba\n    [permissions] => 31\n)\n \ndata: Array\n(\n    [mimetype] => httpd\/unix-directory\n    [mtime] => 1414094992\n    [size] => -1\n    [etag] => 5449609072c7b\n    [storage_mtime] => 1414094992\n    [permissions] => 31\n)\n","level":0,"time":"2014-10-23T20:09:52+00:00","method":"PUT","url":"\/owncloud\/remote.php\/webdav\/H\/IMAG0392.jpg"}
{"reqId":"54496090729e8","app":"cache","message":"Update cache entry for : \"672\": {\"size\":-1,\"etag\":\"5449609072c7b\",\"storage_mtime\":1414094992}","level":0,"time":"2014-10-23T20:09:52+00:00","method":"PUT","url":"\/owncloud\/remote.php\/webdav\/H\/IMAG0392.jpg"}
{"reqId":"5449608f9da1b","app":"sharing","message":"No fileinfo available for: \/H\/IMAG0279.jpg","level":2,"time":"2014-10-23T20:09:52+00:00","method":"PUT","url":"\/owncloud\/remote.php\/webdav\/H\/IMAG0279.jpg"}
{"reqId":"5449608f9da1b","app":"core","message":"Scanning file: \"files\/H\"","level":0,"time":"2014-10-23T20:09:52+00:00","method":"PUT","url":"\/owncloud\/remote.php\/webdav\/H\/IMAG0279.jpg"}
{"reqId":"5449608f9da1b","app":"OC\\Files\\Cache\\Scanner","message":"!!! Path 'files\/H' is not readable !!!","level":0,"time":"2014-10-23T20:09:52+00:00","method":"PUT","url":"\/owncloud\/remote.php\/webdav\/H\/IMAG0279.jpg"}
{"reqId":"54496090729e8","app":"core","message":"Scanning file: \"files\" end with data {\"mimetype\":\"httpd\\\/unix-directory\",\"mtime\":1414094992,\"size\":-1,\"etag\":\"5449609072c7b\",\"storage_mtime\":1414094992,\"permissions\":31,\"fileid\":\"672\"}","level":0,"time":"2014-10-23T20:09:52+00:00","method":"PUT","url":"\/owncloud\/remote.php\/webdav\/H\/IMAG0392.jpg"}
{"reqId":"54496090729e8","app":"core","message":"Scanning children: \"files\"","level":0,"time":"2014-10-23T20:09:52+00:00","method":"PUT","url":"\/owncloud\/remote.php\/webdav\/H\/IMAG0392.jpg"}
{"reqId":"54496090729e8","app":"core","message":"Scanning file: \"files\/H2\"","level":0,"time":"2014-10-23T20:09:52+00:00","method":"PUT","url":"\/owncloud\/remote.php\/webdav\/H\/IMAG0392.jpg"}

// and here, very weird: a new entry for "H2" is inserted (which explains the new file id):
{"reqId":"54496090729e8","app":"cache","message":"Put cache entry for : \"files\/H2\": {\"mimetype\":\"httpd\\\/unix-directory\",\"mtime\":1414094991,\"size\":-1,\"etag\":\"5449609077378\",\"storage_mtime\":1414094991,\"permissions\":31,\"parent\":\"672\",\"path\":\"files\\\/H2\",\"name\":\"H2\"}","level":0,"time":"2014-10-23T20:09:52+00:00","method":"PUT","url":"\/owncloud\/remote.php\/webdav\/H\/IMAG0392.jpg"}
{"reqId":"54496090729e8","app":"core","message":"Scanning file: \"files\/H2\" end with data {\"mimetype\":\"httpd\\\/unix-directory\",\"mtime\":1414094991,\"size\":-1,\"etag\":\"5449609077378\",\"storage_mtime\":1414094991,\"permissions\":31,\"parent\":\"672\",\"fileid\":1994}","level":0,"time":"2014-10-23T20:09:52+00:00","method":"PUT","url":"\/owncloud\/remote.php\/webdav\/H\/IMAG0392.jpg"}
{"reqId":"54496090729e8","app":"core","message":"Scanning file: \"files\/x\"","level":0,"time":"2014-10-23T20:09:52+00:00","method":"PUT","url":"\/owncloud\/remote.php\/webdav\/H\/IMAG0392.jpg"}
{"reqId":"54496090729e8","app":"core","message":"Scanning file: \"files\/x\" end with data {\"mimetype\":\"application\\\/octet-stream\",\"mtime\":1414089876,\"size\":0,\"etag\":\"54494ca1a65ea\",\"storage_mtime\":1414089876,\"permissions\":27,\"parent\":\"672\"}","level":0,"time":"2014-10-23T20:09:52+00:00","method":"PUT","url":"\/owncloud\/remote.php\/webdav\/H\/IMAG0392.jpg"}
{"reqId":"5449608f9da1b","app":"core","message":"Scanning file: \"files\/H\" end with data null","level":0,"time":"2014-10-23T20:09:52+00:00","method":"PUT","url":"\/owncloud\/remote.php\/webdav\/H\/IMAG0279.jpg"}
{"reqId":"5449608f9da1b","app":"core","message":"Scanning children: \"files\/H\"","level":0,"time":"2014-10-23T20:09:52+00:00","method":"PUT","url":"\/owncloud\/remote.php\/webdav\/H\/IMAG0279.jpg"}
{"reqId":"5449608f9da1b","app":"PHP","message":"opendir(\/srv\/www\/htdocs\/owncloud\/data\/root\/files\/H): failed to open dir: No such file or directory at \/srv\/www\/htdocs\/owncloud\/lib\/private\/files\/storage\/local.php#75","level":0,"time":"2014-10-23T20:09:52+00:00","method":"PUT","url":"\/owncloud\/remote.php\/webdav\/H\/IMAG0279.jpg"}
{"reqId":"5449608f9da1b","app":"cache","message":"Update cache entry for : \"672\": {\"size\":0}","level":0,"time":"2014-10-23T20:09:52+00:00","method":"PUT","url":"\/owncloud\/remote.php\/webdav\/H\/IMAG0279.jpg"}
{"reqId":"54496090729e8","app":"cache","message":"Update cache entry for : \"672\": {\"size\":-1}","level":0,"time":"2014-10-23T20:09:52+00:00","method":"PUT","url":"\/owncloud\/remote.php\/webdav\/H\/IMAG0392.jpg"}
{"reqId":"544960902fa20","app":"cache","message":"Update cache entry for : \"672\": {\"size\":0}","level":0,"time":"2014-10-23T20:09:52+00:00","method":"GET","url":"\/owncloud\/index.php\/apps\/files\/ajax\/rename.php?dir=%2F&newname=H2&file=H"}

But what happened with 1983 / the original "H" entry ? Maybe one of the concurrent scan requests found it missing and deleted it ? And another one found the new one and created one ?

It almost looks like we'd need some kind of scanner locking... avoid rescanning the same folder if another scanner is running... or something in the likes.

@PVince81
Copy link
Contributor

@moscicki I don't think that would help. Maybe it would just make the error messages from the client look less scary.

The main problem is that the folder's etag changes on rename, which makes the client believe that the folder is NOT the same. Goal is to keep the etag after rename (at least the latest version, the one that the client knows of) so that the client knows that it's the same folder but with a different name.

I don't see any other way to debug this except using hard-core debugging. Race conditions are hard and only logging makes it possible for me to see what happens... and every time it's not exactly the same that happens (just got a run where the fileid was kept but the etag changed).

@PVince81
Copy link
Contributor

Ok, I can now confirm the reason for the fileid change (test run K->K2)

Subdir entry before:
+--------+---------+---------+----------------------------------+--------+------+----------+----------+------+------------+---------------+-----------+------------------+---------------+-------------+
| fileid | storage | path | path_hash | parent | name | mimetype | mimepart | size | mtime | storage_mtime | encrypted | unencrypted_size | etag | permissions |
+--------+---------+---------+----------------------------------+--------+------+----------+----------+------+------------+---------------+-----------+------------------+---------------+-------------+
| 2150 | 4 | files/K | 847786f8331086ffa1f8d5425162e042 | 672 | K | 2 | 1 | 0 | 1414096733 | 1414096733 | 0 | 0 | 5449675da2b9d | 31 |
+--------+---------+---------+----------------------------------+--------+------+----------+----------+------+------------+---------------+-----------+------------------+---------------+-------------+


Subdir entry after:

+--------+---------+----------+----------------------------------+--------+------+----------+----------+------+------------+---------------+-----------+------------------+---------------+-------------+
| fileid | storage | path | path_hash | parent | name | mimetype | mimepart | size | mtime | storage_mtime | encrypted | unencrypted_size | etag | permissions |
+--------+---------+----------+----------------------------------+--------+------+----------+----------+------+------------+---------------+-----------+------------------+---------------+-------------+
| 2161 | 4 | files/K2 | 4793549d7aeb00320d1999e03c1dc8ca | 672 | K2 | 2 | 1 | 0 | 1414096883 | 1414096883 | 0 | 0 | 544967f39c4d3 | 31 |
+--------+---------+----------+----------------------------------+--------+------+----------+----------+------+------------+---------------+-----------+------------------+---------------+-------------+

Here is the log, followed by my interpretation:

// this here is the rename operation
{"reqId":"544967f36d9b2","app":"cache","message":"Move \"files\/K\" to \"files\/K2\": null","level":0,"time":"2014-10-23T20:41:23+00:00","method":"GET","url":"\/owncloud\/index.php\/apps\/files\/ajax\/rename.php?dir=%2F&newname=K2&file=K"}

// 672 is the "/files" entry
{"reqId":"544967f2b6f50","app":"cache","message":"Update cache entry for : \"672\": {\"mtime\":\"1414096862\",\"etag\":\"544967f3770c8\"}","level":0,"time":"2014-10-23T20:41:23+00:00","method":"PUT","url":"\/owncloud\/remote.php\/webdav\/K\/IMAG0279.jpg"}
{"reqId":"544967f31f3e4","app":"cache","message":"Update cache entry for : \"672\": {\"mtime\":1414096883,\"etag\":\"544967f37e8d3\"}","level":0,"time":"2014-10-23T20:41:23+00:00","method":"PUT","url":"\/owncloud\/remote.php\/webdav\/K\/IMAG0332.jpg"}
{"reqId":"544967f2b6f50","app":"cache","message":"Update cache entry for : \"2150\": {\"mtime\":\"1414096862\",\"etag\":\"544967f380f5a\"}","level":0,"time":"2014-10-23T20:41:23+00:00","method":"PUT","url":"\/owncloud\/remote.php\/webdav\/K\/IMAG0279.jpg"}
{"reqId":"544967f31f3e4","app":"cache","message":"Update cache entry for : \"2150\": {\"mtime\":1414096883,\"etag\":\"544967f383a64\"}","level":0,"time":"2014-10-23T20:41:23+00:00","method":"PUT","url":"\/owncloud\/remote.php\/webdav\/K\/IMAG0332.jpg"}
{"reqId":"544967f2b6f50","app":"connector","message":"Return etag for \"\/K\/IMAG0279.jpg\"","level":0,"time":"2014-10-23T20:41:23+00:00","method":"PUT","url":"\/owncloud\/remote.php\/webdav\/K\/IMAG0279.jpg"}

// here one of the upload requests scans the parent dir (/files)
{"reqId":"544967f388347","app":"core","message":"Scanning file: \"files\"","level":0,"time":"2014-10-23T20:41:23+00:00","method":"PUT","url":"\/owncloud\/remote.php\/webdav\/K\/IMAG0392.jpg"}
{"reqId":"544967f388347","app":"OC\\Files\\Cache\\Scanner","message":"!!! No reuse of etag for 'files' !!! \ncache: Array\n(\n    [fileid] => 672\n    [storage] => home::root\n    [path] => files\n    [parent] => 670\n    [name] => files\n    [mimetype] => httpd\/unix-directory\n    [mimepart] => httpd\n    [size] => 15453438\n    [mtime] => 1414096883\n    [storage_mtime] => 1414096733\n    [encrypted] => \n    [unencrypted_size] => 0\n    [etag] => 544967f37e8d3\n    [permissions] => 31\n)\n \ndata: Array\n(\n    [mimetype] => httpd\/unix-directory\n    [mtime] => 1414096883\n    [size] => -1\n    [etag] => 544967f3885bb\n    [storage_mtime] => 1414096883\n    [permissions] => 31\n)\n","level":0,"time":"2014-10-23T20:41:23+00:00","method":"PUT","url":"\/owncloud\/remote.php\/webdav\/K\/IMAG0392.jpg"}
{"reqId":"544967f388347","app":"cache","message":"Update cache entry for : \"672\": {\"size\":-1,\"etag\":\"544967f3885bb\",\"storage_mtime\":1414096883}","level":0,"time":"2014-10-23T20:41:23+00:00","method":"PUT","url":"\/owncloud\/remote.php\/webdav\/K\/IMAG0392.jpg"}
{"reqId":"544967f31f3e4","app":"connector","message":"put okay for \"\/K\/IMAG0332.jpg.ocTransferId346050858.part\"","level":0,"time":"2014-10-23T20:41:23+00:00","method":"PUT","url":"\/owncloud\/remote.php\/webdav\/K\/IMAG0332.jpg"}
{"reqId":"544967f31f3e4","app":"connector","message":"Getting file size for: \"\/K\/IMAG0332.jpg\"","level":0,"time":"2014-10-23T20:41:23+00:00","method":"PUT","url":"\/owncloud\/remote.php\/webdav\/K\/IMAG0332.jpg"}
{"reqId":"544967f31f3e4","app":"PHP","message":"filesize(): stat failed for \/srv\/www\/htdocs\/owncloud\/data\/root\/files\/K\/IMAG0332.jpg.ocTransferId346050858.part at \/srv\/www\/htdocs\/owncloud\/lib\/private\/files\/storage\/local.php#117","level":0,"time":"2014-10-23T20:41:23+00:00","method":"PUT","url":"\/owncloud\/remote.php\/webdav\/K\/IMAG0332.jpg"}
{"reqId":"544967f31f3e4","app":"connector","message":"Got file size for: \"\/K\/IMAG0332.jpg\" :","level":0,"time":"2014-10-23T20:41:23+00:00","method":"PUT","url":"\/owncloud\/remote.php\/webdav\/K\/IMAG0332.jpg"}
{"reqId":"544967f31f3e4","app":"cache","message":"Delete \"files\/K\/IMAG0332.jpg.ocTransferId346050858.part\"","level":0,"time":"2014-10-23T20:41:23+00:00","method":"PUT","url":"\/owncloud\/remote.php\/webdav\/K\/IMAG0332.jpg"}
{"reqId":"544967f31f3e4","app":"cache","message":"Update cache entry for : \"672\": {\"size\":15453438}","level":0,"time":"2014-10-23T20:41:23+00:00","method":"PUT","url":"\/owncloud\/remote.php\/webdav\/K\/IMAG0332.jpg"}
{"reqId":"544967f388347","app":"core","message":"Scanning file: \"files\" end with data {\"mimetype\":\"httpd\\\/unix-directory\",\"mtime\":1414096883,\"size\":-1,\"etag\":\"544967f3885bb\",\"storage_mtime\":1414096883,\"permissions\":31,\"fileid\":\"672\"}","level":0,"time":"2014-10-23T20:41:23+00:00","method":"PUT","url":"\/owncloud\/remote.php\/webdav\/K\/IMAG0392.jpg"}

// another one request that scans "/files"
{"reqId":"544967f388347","app":"core","message":"Scanning children: \"files\"","level":0,"time":"2014-10-23T20:41:23+00:00","method":"PUT","url":"\/owncloud\/remote.php\/webdav\/K\/IMAG0392.jpg"}
{"reqId":"544967f388347","app":"core","message":"Scanning file: \"files\/K2\"","level":0,"time":"2014-10-23T20:41:23+00:00","method":"PUT","url":"\/owncloud\/remote.php\/webdav\/K\/IMAG0392.jpg"}

// here the scanner process found that there's a "new" folder called "K2", so it inserts a new cache entry for it
{"reqId":"544967f388347","app":"cache","message":"Put cache entry for : \"files\/K2\": {\"mimetype\":\"httpd\\\/unix-directory\",\"mtime\":1414096883,\"size\":-1,\"etag\":\"544967f39c4d3\",\"storage_mtime\":1414096883,\"permissions\":31,\"parent\":\"672\",\"path\":\"files\\\/K2\",\"name\":\"K2\"}","level":0,"time":"2014-10-23T20:41:23+00:00","method":"PUT","url":"\/owncloud\/remote.php\/webdav\/K\/IMAG0392.jpg"}
{"reqId":"544967f388347","app":"core","message":"Scanning file: \"files\/K2\" end with data {\"mimetype\":\"httpd\\\/unix-directory\",\"mtime\":1414096883,\"size\":-1,\"etag\":\"544967f39c4d3\",\"storage_mtime\":1414096883,\"permissions\":31,\"parent\":\"672\",\"fileid\":2161}","level":0,"time":"2014-10-23T20:41:23+00:00","method":"PUT","url":"\/owncloud\/remote.php\/webdav\/K\/IMAG0392.jpg"}
{"reqId":"544967f388347","app":"core","message":"Scanning file: \"files\/x\"","level":0,"time":"2014-10-23T20:41:23+00:00","method":"PUT","url":"\/owncloud\/remote.php\/webdav\/K\/IMAG0392.jpg"}
{"reqId":"544967f388347","app":"core","message":"Scanning file: \"files\/x\" end with data {\"mimetype\":\"application\\\/octet-stream\",\"mtime\":1414095864,\"size\":0,\"etag\":\"544963fb473f3\",\"storage_mtime\":1414095864,\"permissions\":27,\"parent\":\"672\"}","level":0,"time":"2014-10-23T20:41:23+00:00","method":"PUT","url":"\/owncloud\/remote.php\/webdav\/K\/IMAG0392.jpg"}

// and here since K was already renamed to K2, the scanner finds that "K" doesn't exist any more
// so it deletes the cache entry (the scanner doesn't know the relationship between K2 and K)
{"reqId":"544967f388347","app":"cache","message":"Delete \"files\/K\"","level":0,"time":"2014-10-23T20:41:23+00:00","method":"PUT","url":"\/owncloud\/remote.php\/webdav\/K\/IMAG0392.jpg"}
{"reqId":"544967f388347","app":"cache","message":"Delete \"files\/K\/1368555495004.jpg\"","level":0,"time":"2014-10-23T20:41:23+00:00","method":"PUT","url":"\/owncloud\/remote.php\/webdav\/K\/IMAG0392.jpg"}
{"reqId":"544967f388347","app":"cache","message":"Delete \"files\/K2\/1391249528060.jpg\"","level":0,"time":"2014-10-23T20:41:23+00:00","method":"PUT","url":"\/owncloud\/remote.php\/webdav\/K\/IMAG0392.jpg"}
{"reqId":"544967f388347","app":"cache","message":"Delete \"files\/K2\/20110610_005.jpg\"","level":0,"time":"2014-10-23T20:41:23+00:00","method":"PUT","url":"\/owncloud\/remote.php\/webdav\/K\/IMAG0392.jpg"}
{"reqId":"544967f388347","app":"cache","message":"Delete \"files\/K\/20130317_001.jpg\"","level":0,"time":"2014-10-23T20:41:23+00:00","method":"PUT","url":"\/owncloud\/remote.php\/webdav\/K\/IMAG0392.jpg"}
{"reqId":"544967f388347","app":"cache","message":"Delete \"files\/K\/DSC05207.JPG\"","level":0,"time":"2014-10-23T20:41:23+00:00","method":"PUT","url":"\/owncloud\/remote.php\/webdav\/K\/IMAG0392.jpg"}
{"reqId":"544967f388347","app":"cache","message":"Delete \"files\/K2\/DSC_0158.JPG\"","level":0,"time":"2014-10-23T20:41:23+00:00","method":"PUT","url":"\/owncloud\/remote.php\/webdav\/K\/IMAG0392.jpg"}
{"reqId":"544967f388347","app":"cache","message":"Delete \"files\/K\/Huoguo.jpg\"","level":0,"time":"2014-10-23T20:41:23+00:00","method":"PUT","url":"\/owncloud\/remote.php\/webdav\/K\/IMAG0392.jpg"}
{"reqId":"544967f388347","app":"cache","message":"Delete \"files\/K2\/IMAG0100.jpg\"","level":0,"time":"2014-10-23T20:41:23+00:00","method":"PUT","url":"\/owncloud\/remote.php\/webdav\/K\/IMAG0392.jpg"}
{"reqId":"544967f388347","app":"cache","message":"Delete \"files\/K\/IMAG0156.jpg\"","level":0,"time":"2014-10-23T20:41:23+00:00","method":"PUT","url":"\/owncloud\/remote.php\/webdav\/K\/IMAG0392.jpg"}
{"reqId":"544967f388347","app":"cache","message":"Delete \"files\/K\/IMAG0279.jpg\"","level":0,"time":"2014-10-23T20:41:23+00:00","method":"PUT","url":"\/owncloud\/remote.php\/webdav\/K\/IMAG0392.jpg"}
{"reqId":"544967f31f3e4","app":"cache","message":"Update cache entry for : \"672\": {\"mtime\":1414096883,\"etag\":\"544967f3a9f43\"}","level":0,"time":"2014-10-23T20:41:23+00:00","method":"PUT","url":"\/owncloud\/remote.php\/webdav\/K\/IMAG0332.jpg"}
{"reqId":"544967f36d9b2","app":"cache","message":"Update cache entry for : \"672\": {\"size\":0}","level":0,"time":"2014-10-23T20:41:23+00:00","method":"GET","url":"\/owncloud\/index.php\/apps\/files\/ajax\/rename.php?dir=%2F&newname=K2&file=K"}
{"reqId":"544967f388347","app":"cache","message":"Update cache entry for : \"672\": {\"size\":-1}","level":0,"time":"2014-10-23T20:41:23+00:00","method":"PUT","url":"\/owncloud\/remote.php\/webdav\/K\/IMAG0392.jpg"}
{"reqId":"544967f31f3e4","app":"cache","message":"Update cache entry for : \"-1\": {\"mtime\":1414096883,\"etag\":\"544967f3c9e39\"}","level":0,"time":"2014-10-23T20:41:23+00:00","method":"PUT","url":"\/owncloud\/remote.php\/webdav\/K\/IMAG0332.jpg"}
{"reqId":"544967f36d9b2","app":"cache","message":"Update cache entry for : \"2161\": {\"size\":0}","level":0,"time":"2014-10-23T20:41:23+00:00","method":"GET","url":"\/owncloud\/index.php\/apps\/files\/ajax\/rename.php?dir=%2F&newname=K2&file=K"}
...
// much later comes the end of the subdir rename operation
{"reqId":"544967f36d9b2","app":"core","message":"rename end: \"\/K\" => \"\/K2","level":0,"time":"2014-10-23T20:41:24+00:00","method":"GET","url":"\/owncloud\/index.php\/apps\/files\/ajax\/rename.php?dir=%2F&newname=K2&file=K"}

So basically this proves my theory about the changing fileid.

It seems that having concurrently running scanners is a bad idea and likely to cause issues. I have the feeling that there are other possible concurrency cases that could go wrong.

So we probably need to either lock at scanner level or at high-level file operation level (high-level as in close to the entry points)

@icewind1991 I'd like to hear your input about how to improve this.

@PVince81
Copy link
Contributor

Note: I had a case where the fileid stayed but the etag changed, so it's likely that there are other issues... it all depend on the order of all these operations...

@moscicki
Copy link
Contributor Author

@PVince81: I agree 409 would not help immediately but it would be according to the webdav specs and at some point maybe client guys could profit from a difference between 404 and 409. For now, agreed, it's a rather cosmetic change.

@ogoffart: I tried against our webdav backend on EOS (which apparently does not have this concurrency issue and keeps the fileid stable).

So the first time I tried I thought something was messed up (no file loss, though).

Then I retried 4 times (both with new and previously known files, see #2296 (comment)) and it seems to work!

Unless I fell into lucky race-condition during the 4 retires I would think that indeed the issue has been fixed in the client.

If you want for reference, here is the trace of operations with 1.7.0-beta4 on MacOSX:

https://gist.github.com/moscicki/25bdd9adbef509de6d95#file-rename-while-upload-sync-activity-1-7-0-beta4

@PVince81
Copy link
Contributor

I think this bug should not be a release blocker / gold ticket aka showstopper.
I'm pretty sure this already existed in past versions and was just discovered through unluck.

Note that in my case I can reproduce the issue 90% of the time.

So I suggest to remove the gold ticket / showstopper label but keep the ticket to continue investigating. Locking the file scanner and/or file operations is a general thing that needs to be looked into.

@moscicki
Copy link
Contributor Author

I think it is indeed a server issue now. I can confirm that the client 1.7.0-beta4 works fine (as far as I can test it).

Thanks to all for help. I hope this is also useful to keep on improving both client and server.

@moscicki
Copy link
Contributor Author

@ogoffart: FYI, I also checked the same scenario in the opposite direction i.e.

while files are downloaded from remote folder A to local folder A: rename a local folder A to B (on MacOSX)

works very well! everything ends up in B and also the remote folder gets renamed to B

well done, many thanks! 👍

@ogoffart
Copy link
Contributor

(Note that mirall 1.4 and before relied on the 404 to create the directories)

@moscicki
Copy link
Contributor Author

I hope you do not anymore since 1.5 - I am giving you back 409 as in the specs...

@PVince81
Copy link
Contributor

I raised owncloud/core#11795 to follow up on the server.
Closing this hence.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
p2-high Escalation, on top of current planning, release blocker Server Involved type:bug
Projects
None yet
Development

No branches or pull requests

5 participants