Special characters in filenames #7762
Comments
@icewind1991 fs fun :) |
Are you perhaps using a non-standard filesystem such as FAT or NTFS? Can you try creating a PHP file
And run it using |
The filesystem on that HDD is ext4. The result of your PHP script looks fine: |
I'm having a similar issue on an ext4 filesystem. All affected files produce this error: "OC\Files\Cache\Scanner","method":"--","url":"--","message": !!! Path 'ROOT/K\u00c4SKI/DIR/T\u00f6\u00f6teeb.pdf' is not accessible or present !!!","userAgent":"--","version":"13.0.2.1"} It seems that the scanner expects all filenames to be in ASCII or UTF-8. If I take one of the non-working files from the filesystem and upload it through the web UI, it is accessible (it seems something converts the filename encoding in that case). |
If someone hits this problem and needs a solution faster than the code gets fixed, one workaround is to use rclone / rsync to convert the filename charset. |
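To check the encoding theory above before converting anything, one can list the raw bytes of each name and see which ones fail to decode as UTF-8. This is a minimal sketch in Python (not Nextcloud's PHP code), assuming a Linux host; `non_utf8_names` is a hypothetical helper name, not part of any tool mentioned in this thread.

```python
import os

def non_utf8_names(directory: str) -> list:
    """Return the raw (bytes) names in `directory` that are not valid UTF-8."""
    bad = []
    # Passing a bytes path makes os.listdir return undecoded bytes names.
    for raw in os.listdir(os.fsencode(directory)):
        try:
            raw.decode("utf-8")
        except UnicodeDecodeError:
            bad.append(raw)
    return bad

if __name__ == "__main__":
    import sys
    for raw in non_utf8_names(sys.argv[1] if len(sys.argv) > 1 else "."):
        print(raw)
```

Names reported here are stored in some legacy charset (Latin-1, for instance) and would need a charset conversion. Names that decode fine but still do not show up are more likely a normalization problem, which is a separate case.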
Facing exactly the same problem. Any updates on this? OS: Ubuntu Server 18.04 |
Just stumbled across a very similar issue: filenames containing a plus sign (+) cannot be uploaded, neither via the web frontend nor via the (Windows) client application. |
Still present in v15.0.2 |
Is it possible that the problem depends on the underlying OS? I had the problem with the plus sign when uploading a file from a Windows 10 client to a Nextcloud server hosted on Linux Mint. |
For me it has something to do with filename encodings, I guess. I have a separate hard drive installed in the server that Nextcloud runs on. This drive is mounted as external storage of type local (ext4). Some people have access to this drive via SSH/SFTP. Folders copied onto this drive over SFTP whose names contain characters like ä, ö, ü are not shown in the Nextcloud web client. Renaming these folders manually in an SSH terminal makes them visible, though. |
cc @herrwiese |
I faced this again and again. |
I put a cronjob in place to rename files containing umlauts: |
Solution (I take no responsibility, create a database backup first!): open phpMyAdmin, set the charset to ASCII, and convert all tables. |
I am experiencing a similar issue where some file paths containing special characters (specifically German umlauts) are not showing up. The folders in question are mounted as external storage via SFTP. I am running Nextcloud 16.0.3 as a Docker container on Ubuntu Server 18.04. What confused me was that some file paths containing umlauts were showing up while others were not. After poking around a bit I discovered that the paths that were not showing up contained "A", "O", or "U" followed by the Unicode character "COMBINING DIAERESIS" (U+0308), whereas file paths that showed up normally seemed to contain "Ä", "Ö", or "Ü" directly. After replacing the letter plus combining diaeresis with the respective precomposed umlaut, the file path shows up as expected. |
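The two spellings described above look identical on screen but are different code point sequences. A small Python sketch (Python only for illustration, not Nextcloud code) makes the difference visible with the standard `unicodedata` module; `describe` is a hypothetical helper name.

```python
import unicodedata

def describe(name: str) -> list:
    """Return 'U+XXXX NAME' for every code point in `name`."""
    return [f"U+{ord(ch):04X} {unicodedata.name(ch, '<unnamed>')}" for ch in name]

precomposed = "\u00c4"   # "Ä" as one code point (NFC form)
decomposed = "A\u0308"   # "A" followed by COMBINING DIAERESIS (NFD form)

print(describe(precomposed))  # ['U+00C4 LATIN CAPITAL LETTER A WITH DIAERESIS']
print(describe(decomposed))   # ['U+0041 LATIN CAPITAL LETTER A', 'U+0308 COMBINING DIAERESIS']
print(precomposed == decomposed)                                # False
print(unicodedata.normalize("NFC", decomposed) == precomposed)  # True
```

A byte-exact filesystem such as ext4 treats these as two different names, which would explain why only one of the two spellings is found.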
@schwma (and potentially others): I had the same issue (files with "COMBINING DIAERESIS" not showing up) and could resolve it by enabling the "NFD compatibility" option on the share. The problem is that Nextcloud normalizes unicode by default (see server/lib/private/Files/Filesystem.php Lines 821 to 823 in 2111963
|
I have this problem and arrived at the conclusion that the issue involved Unicode normalization too; however, I'm running on ZFS and none of the Unicode normalization options on my filesystem seemed to resolve the issue, so I've resorted to...not storing files with non-ASCII filenames in Nextcloud :( |
All of my Mac OS X users, from different unrelated organizations, fail to see files and folders containing "combining tilde" symbols. It looks like PHP has been able to handle this since PHP 7: https://wiki.php.net/rfc/unicode_escape According to this page, https://www.php.net/normalizer, normalizing to NFC should fix this (Mac OS X file and directory names are NFD-normalized). What worked for us was running cron tasks frequently with the following commands:
The star here is the convmv command, and the following SO question gave us the final touch: https://stackoverflow.com/questions/26516700/file-name-look-the-same-but-is-different-after-copying We are now looking into something like triggers to do the conversion, but we think this issue should be addressed by Nextcloud. |
We are now testing the Nextcloud Workflow module, sending all Created and Copied files whose MIME type is not application/fuu (so that all files and folders pass through) to this script: Here we are using Spanish characters from Mac OS X keyboards. |
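For readers without convmv, the same idea (rename every entry whose name is not NFC) can be sketched in Python. This is an illustration under assumptions, not the commands used above: it assumes a byte-exact filesystem such as ext4, and `plan_nfc_renames` and `apply_renames` are hypothetical names. It defaults to a dry run.

```python
import os
import unicodedata

def plan_nfc_renames(root: str) -> list:
    """Collect (old_path, new_path) pairs for entries whose names are not NFC."""
    renames = []
    # Walk bottom-up so that children are renamed before their parent directory,
    # which keeps every collected path valid at the moment it is renamed.
    for dirpath, dirnames, filenames in os.walk(root, topdown=False):
        for name in dirnames + filenames:
            nfc = unicodedata.normalize("NFC", name)
            if nfc != name:
                renames.append((os.path.join(dirpath, name),
                                os.path.join(dirpath, nfc)))
    return renames

def apply_renames(renames: list, dry_run: bool = True) -> None:
    """Print or perform the planned renames, never overwriting an existing target."""
    for old, new in renames:
        if os.path.lexists(new):
            print(f"skip (target exists): {old!r}")
        elif dry_run:
            print(f"would rename {old!r} -> {new!r}")
        else:
            os.rename(old, new)

if __name__ == "__main__":
    import sys
    apply_renames(plan_nfc_renames(sys.argv[1] if len(sys.argv) > 1 else "."))
```

After a real run, the storage would presumably need to be rescanned (the thread mentions `occ files:scan`) before the renamed entries appear.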
@szaimen https://help.nextcloud.com/t/invalid-encoding-on-file-names-in-nc19/83835 He posted a solution on 1 Nov 2020... but somehow it was not accepted... read from my comment of 26 Apr 2022 until here. Having to add this small line on every Nextcloud release for more than 2 years is really annoying :( |
I guess we have not seen this forum post. Can you try to create the PR? I'll then help you move this forward :) |
See also troubleshooting NFD encoding issues with external storage: https://docs.nextcloud.com/server/latest/admin_manual/issues/general_troubleshooting.html#troubleshooting-file-encoding-on-external-storages I'm not sure the proposed patch will make everything work correctly. Maybe the scanner will find the file, but when you try to overwrite it through the web UI or WebDAV, it will create another instance of the file with the NFC-normalized name. You would then see two files on disk with seemingly the same name, one NFC-normalized and one NFD-normalized (the original one). For external storages, a special compatibility mode has been developed (see link above) which always tries both encodings to avoid such issues. However, this approach makes everything slower, as more filesystem accesses are required. |
For those who are already using compatibility mode and can confirm that they have NFD-encoded file names and it still doesn't work, it can be handled as a bug. Back then this mode was mostly tested with SMB storages, and other storages like S3 may need further workarounds to work correctly. |
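The "try both encodings" idea can be illustrated with a short Python sketch. This is not the Nextcloud implementation (which is PHP and not quoted in this thread), and `find_existing_name` is a hypothetical helper; it only shows why such a mode costs extra filesystem accesses.

```python
import os
import unicodedata

def find_existing_name(directory: str, name: str):
    """Return the spelling of `name` that exists in `directory`.

    Tries the NFC form first, then the NFD form; returns None if neither exists.
    Each attempt is one extra filesystem access.
    """
    for form in ("NFC", "NFD"):
        candidate = unicodedata.normalize(form, name)
        if os.path.lexists(os.path.join(directory, candidate)):
            return candidate
    return None
```

A write path built on such a lookup could reuse the spelling already on disk instead of creating a second, NFC-named copy next to an NFD original.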
@PVince81 I have an account on our Nextcloud 24.0.8 where I use a Samba4 (2:4.9.5+dfsg-5+deb10u3, Debian Buster) DFS-enabled share. Next, I created a new Excel sheet with MS Office 2013 on my Windows system with a German umlaut and a space in its name and protected it with a password. So my guess that this is a problem with password-protected files that have an umlaut in their names IS WRONG, sorry for the inconvenience. Summary... fact is...
And before you ask, this excel sheet has "very" sensitive data in it, so I cannot share. When I have more time... I will try to remove the password from that excel sheet and test again. If not possible maybe changing the password from my Windows or Linux system helps. |
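In case it's useful, you can copy-paste a file name and pass it to this script; it will tell you which normalization it uses and also show you both conversions:
<?php
// Requires the intl extension, which provides the Normalizer class.
if ($argc < 2) {
    fwrite(STDERR, "Usage: php {$argv[0]} <file name>\n");
    exit(1);
}
$s = $argv[1];
$isNfc = \Normalizer::isNormalized($s, \Normalizer::FORM_C);
$isNfd = \Normalizer::isNormalized($s, \Normalizer::FORM_D);
if ($isNfc && $isNfd) {
    // E.g. plain ASCII names: both forms are identical, so nothing to convert.
    print("Original string is identical in NFC and NFD\n");
} elseif ($isNfd) {
    print("Original string is using NFD normalization\n");
    $nfc = \Normalizer::normalize($s, \Normalizer::FORM_C);
    print("NFC: $nfc\n");
    print("NFD: $s\n");
} elseif ($isNfc) {
    print("Original string is using NFC normalization\n");
    $nfd = \Normalizer::normalize($s, \Normalizer::FORM_D);
    print("NFC: $s\n");
    print("NFD: $nfd\n");
} else {
    print("Unknown normalization\n");
}
|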
@PVince81 First I made a copy of the special Excel sheet (Zugänge ITS.xlsx) with MS Explorer in the same folder, opened it with MS Excel 2013, removed the password, and saved it. Opened it with MS Excel 2013 again > worked without password. 1. Test - patch disabled, NFD disabled. 2. Test - patch disabled, NFD enabled. 3. Test - patch enabled, NFD disabled. Enabling NFD on the share seems to help "occ files:scan" but does not help with Collabora Online Office, and it takes the most time, as expected. FYI... I have checked the files "Zugänge ITS.xlsx", "Zugänge ITS - Kopie.xlsx" and "Einführung_Plunet_Mitarbeiterinfo.docx" with "file" on our Samba 4 server. |
Hello guys, I tried also many things before finding this solution that works for all. I think it is the only one which works for all scenarios! |
I am also affected: local external storage; files with German umlauts (Ä, Ü, Ö) and some other special characters cannot be scanned and do not appear in the UI. But when I manually upload the files through the web UI, they are shown. |
@benjelloun69 do you mind opening a pull request with your patch? Thank you! ❤️ |
Just debugged a major issue with files on external storage on Nextcloud 27.0.1. The external folder was WebDAV from a Hetzner Storage Box. |
Also faced this bug, behavior is similar: file with special characters on external storage (local). |
Steps to reproduce
Expected behaviour
Every file in this folder should be scanned and shown in the Files app.
Actual behaviour
These files were downloaded onto the hard disk of my home server. The folder containing the downloaded files is configured as “local” external storage in my Nextcloud.
Files and folders with German umlauts that were created by Nextcloud in the Files app appear in the file listings. Other files and folders (from the download) are ignored by the occ file scan.
During a file scan in debug mode, the following messages appear in nextcloud.log.
The names should read Lügen instead of L\u00fcgen and Hölle instead of H\u00f6lle, for example.
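A note on reading these log lines: nextcloud.log entries are JSON, and JSON encoders commonly write non-ASCII characters as `\uXXXX` escapes. The escapes on their own therefore do not show that a name is corrupted. The Python sketch below decodes a hypothetical log fragment (only the two escaped words are taken from this report) to show what the escapes stand for.

```python
import json

# Hypothetical fragment; only the two escaped words come from the report.
raw = r'{"message": "L\u00fcgen H\u00f6lle"}'

entry = json.loads(raw)
print(entry["message"])  # Lügen Hölle
```

Here `\u00fc` and `\u00f6` are the precomposed (NFC) forms of ü and ö. A name stored in NFD would instead appear in the log as `u\u0308` and `o\u0308`.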
Server configuration
Operating system: Ubuntu Server 17.10
Web server: Apache 2.4.27
Database: MySQL
PHP version: PHP 7.1.11-0ubuntu0.17.10.1
Nextcloud version: 12.0.4
Updated from an older Nextcloud/ownCloud or fresh install: fresh install
Where did you install Nextcloud from: nextcloud.com
List of activated apps:
App list
Nextcloud configuration:
Config report
Are you using external storage, if yes which one: local
Are you using encryption: no
Are you using an external user-backend, if yes which one: no
Client configuration
Browser: Opera, Chrome, Firefox
Operating system: Windows 10