-
Notifications
You must be signed in to change notification settings - Fork 2.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add wrapper for NFD encoding workaround #24349
Conversation
b28fb77
to
db5e131
Compare
More progress. More file operations work now and the option is opt-in in the UI now. |
|
Arghh, another folder listing diff that needs NFD support: https://github.com/owncloud/core/blob/v9.0.1/lib/private/files/cache/scanner.php#L397 |
Will not work with current logic that transforms the whole path to NFC/NFD). Need an algo that goes through all path sections and calls |
Reason is the same as above. Need to change |
db5e131
to
7a99ad9
Compare
Added some unit tests, might not be enough to cover all. |
* @return string original or converted path | ||
*/ | ||
private function findPathToUse($path) { | ||
if ($path !== '' && !$this->isAscii($path)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nitpick: the whole function would save indentation if you said
if (path == '' || $this->isAscii($path) ) {
return $path;
}
Wouldn't that be nicer?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And: The isAscii funciton initializes a regexp engine which is always not cheap.
Without having investigated details but don't you think a simple for loop over the string checking for non ascii chars is runtime-cheaper?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We got plenty of indentation in stock so no need to save them 😉
Yeah, good point. I'll change it next time I touch this good, which is likely to be soon. The algo isn't adequate because every path sections needs its own NFC/NFD trial...
|
On a more general note, and sorry if my comments miss the facts, I haven't investigated too deeply: The
is done - that would results in calling the conversion function twice maybe. That can eat performance. The big gun would of course be a class Path that has the knowledge about if the encoding was checked already. |
@dragotin actually the expensive lookup for each path is only done once, the result is saved in the |
7a99ad9
to
822f591
Compare
@dragotin I fixed isAscii to be simpler and saved some indenting. Also: changed the path finding algo by converting each path section one by one to find either the NFD or NFC form. This is required in case of people having crazy paths like "nfd/nfc/nfd".
Also need to redo a bit of manual testing as per the steps in the OP. |
* | ||
* @param string $fullPath path to check | ||
* | ||
* @return string original or converted path, or null if none of the forms was found |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
missed fixing this comment
822f591
to
e6f31f2
Compare
Rebased. Next up: redo the manual testing to tick all the TEST checkboxes above. |
Please review @owncloud/filesystem @nickvergessen @schiesbn |
@@ -33,8 +33,12 @@ var MOUNT_OPTIONS_DROPDOWN_TEMPLATE = | |||
' <option value="1" selected="selected">{{t "files_external" "Once every direct access"}}</option>' + | |||
' </select>' + | |||
' </div>' + | |||
' <div class="optionRow">' + | |||
' <input id="mountOptionsEncoding" name="encoding_compatibility" type="checkbox" value="true"/>' + | |||
' <label for="mountOptionsEncoding">{{t "files_external" "Enable encoding compatibility (decreases performance)"}}</label>' + |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"Enable NFD encoding compatibility" or "Enable Mac unicore encoding compatibility" ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I hesitated about that but you're right. In the end it's only about NFD/Mac.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
btw what is this {{t
magic, does our language extraction tool detect that?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it's a template function. Unfortunately our language extraction doesn't detect it, so there's an ugly workaround here: https://github.com/owncloud/core/pull/24349/files/e6f31f2107ed86e88527bb4ac2f60936c0e57e3a#diff-55300aafc99213431ff789e8b0fb70a7R13
I hope one day we can make it work... #15106
return $this->copy($this->findPathToUse($sourceInternalPath), $targetInternalPath); | ||
} | ||
|
||
$result = $this->storage->copyFromStorage($sourceStorage, $this->findPathToUse($sourceInternalPath), $targetInternalPath); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
$targetInternalPath
should be de-normalized, not $sourceInternalPath
since the target path is the one on this storage
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok seems I need to add moar unit tests
I fixed the code based on comments, please have another look @icewind1991 @nickvergessen |
$result = $this->storage->rename($this->findPathToUse($path1), $this->findPathToUse($path2)); | ||
if ($result) { | ||
unset($this->namesCache[$path1]); | ||
unset($this->namesCache[$path2]); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
more unneeded unsets (and the 2 below)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
Went through the test list and all went fine with SMB. |
Did a quick test with SFTP and it seems to work fine, too 😄 |
The encoding wrapper is now only applied when the mount option is set, disabled by default.
Since new children from the storage might contain NFD entries, these must be normalized to NFC to be properly diff'ed with the cache contents which is always NFC. This fixes an issue where NFD entries would disappear from the cache after rescannng for children.
Improved label Fixed rename/copy/moveFromStorage/copyFromStorage and added tests Improved findPathToUse algo
68dfc5a
to
bac8e13
Compare
Rebased again. I hope this PR is acceptable now 😄 |
Test files are here: #21365 (comment) |
👍 |
1 similar comment
👍 |
This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs. |
Fixes #21365
Adds storage wrapper "EncodingWrapper" that first tries to find which one of NFC or NFD file exists, and then uses that path to perform operations.
TODO
Bugs
Currently works ok-ish with SFTP, need to be tested with all file operations
Tests
External storages
@icewind1991 @DeepDiver1975 @nickvergessen FYI