Error opening netcdf in path with "special" characters #941

Open
itati01 opened this issue Jun 24, 2019 · 11 comments

itati01 commented Jun 24, 2019

Hi,

I could not open a netcdf file whose path contained a German umlaut in Python 3 (FileNotFoundError: [Errno 2] No such file or directory). A workaround was to change the folder and use only the filename:

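import os
import netCDF4 as nc

s = "C:/path_with_ä/test.nc"
ncf = nc.Dataset(s)  # FileNotFoundError: [Errno 2] No such file or directory

# Workaround: change into the folder and open the file by its name only
path, fn = os.path.split(s)
os.chdir(path)
ncf = nc.Dataset(fn)  # worked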

I tested v1.5.1.2 under Python 3.7.2 (64bit, Win10) as installed via conda-forge. Interestingly, using a unicode string in Python 2 worked well with netcdf4 v1.4.1.
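
The chdir workaround above leaves the process in a different working directory afterwards. A minimal sketch of the same idea that restores the previous directory follows; the helper names working_directory and open_by_basename are hypothetical and not part of netCDF4, and the opener argument stands in for nc.Dataset.

```python
import os
from contextlib import contextmanager


@contextmanager
def working_directory(path):
    """Temporarily switch the current working directory, restoring it on exit."""
    previous = os.getcwd()
    os.chdir(path)
    try:
        yield
    finally:
        os.chdir(previous)


def open_by_basename(full_path, opener):
    """Call opener(filename) from inside the directory that holds the file.

    `opener` is any callable that takes a file name, e.g. netCDF4.Dataset.
    """
    directory, filename = os.path.split(full_path)
    with working_directory(directory or "."):
        return opener(filename)


# Intended use (assumes netCDF4 is installed and the file exists):
#   import netCDF4 as nc
#   ncf = open_by_basename("C:/path_with_ä/test.nc", nc.Dataset)
```

This only hides the non-ASCII part of the path from the library; it does not help when the file name itself contains such characters.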

Collaborator

jswhit commented Jun 26, 2019

This works for me in Python 3.6

from netCDF4 import Dataset
filename = '\xc3\xbc.nc'
nc = Dataset(filename, 'w')
nc.close()

Author

itati01 commented Jun 26, 2019

Works for me as well. However, the file shows up as "ü.nc" in Windows Explorer but "ü.nc" under Linux (and in Win Explorer, I used Ubuntu 18.04 via WSL).

Now, filename = "ää.nc" results in "ää.nc" under Windows but "ää.nc" under Linux. Simulating the behaviour above, I used filename = "äää/aaa.nc" which results in an error under Windows (Errno 13: Permission denied) while everything is fine under Linux (the path shows up correctly in Win Explorer as well). os.makedirs("äää") works as expected.

Collaborator

jswhit commented Jun 26, 2019

Not a Unicode expert, but netcdf4-python uses utf-8 encoding by default (can be changed with the encoding Dataset kwarg). Maybe Windows uses a different encoding?
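
To illustrate the "different encoding" hypothesis: if a name is written as UTF-8 bytes and those bytes are then interpreted in a single-byte Windows code page, each non-ASCII character turns into two wrong but valid characters. The sketch below assumes cp1252 as the Windows code page, which is only a guess for a German Windows setup; the function name misread is made up for this example.

```python
def misread(name, written_as="utf-8", read_as="cp1252"):
    """Return how `name` looks when its bytes are decoded with the wrong codec."""
    return name.encode(written_as).decode(read_as)


garbled_u = misread("ü.nc")    # "Ã¼.nc"
garbled_a = misread("ää.nc")   # "Ã¤Ã¤.nc"

# With matching codecs the name survives unchanged.
intact = misread("ü.nc", written_as="utf-8", read_as="utf-8")  # "ü.nc"
```

Such names are still legal file names, which would be consistent with files being created under an unexpected name rather than failing outright.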

Collaborator

jswhit commented Jun 26, 2019

This is related to #686.

There is actually a test for this for windows (tst_filepath.py). I suggest you try using encoding=sys.getfilesystemencoding().

Author

itati01 commented Jun 26, 2019

This does indeed look like a Unicode issue, although I am far from being an expert. sys.getfilesystemencoding() returns "utf-8" in Python 3, under both Linux and Windows.

In Python 2, filename = u"ää.nc" works under Windows, but filename = u"ää/ää.nc" works only if the folder "ää" already exists, e.g.

import os
filename = u'ää/ää.nc'
path, fn = os.path.split(filename)
if not os.path.exists(path):
    os.makedirs(path)    # os handles non-ASCII characters correctly
from netCDF4 import Dataset
nc = Dataset(filename, 'w') # fn is also fine
nc.close()

The path names also appear correctly in Win Explorer. sys.getfilesystemencoding() returns "mbcs" here.
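
The two values reported above ("utf-8" in Python 3, "mbcs" in Python 2) can be put next to the locale's preferred encoding. Since Python 3.6, sys.getfilesystemencoding() returns "utf-8" on Windows because Python itself converts names for the wide-character Windows API; it does not say which encoding a C library that calls the byte-oriented API will assume. A small diagnostic sketch (the function name encoding_report is hypothetical):

```python
import locale
import sys


def encoding_report():
    """Collect the encodings Python reports for file names and for the locale."""
    return {
        # Encoding Python uses to convert between str and OS file names.
        "filesystem": sys.getfilesystemencoding(),
        # Encoding of the current locale; on Windows this is typically the
        # ANSI code page (e.g. "cp1252") unless UTF-8 mode is enabled.
        "locale": locale.getpreferredencoding(False),
    }


report = encoding_report()
```

If the two entries differ on Windows, passing encoding=sys.getfilesystemencoding() would presumably still hand UTF-8 bytes to the underlying library.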

Collaborator

jswhit commented Jun 27, 2019

Is the problem resolved for python 3 on windows? I'm not clear on what works and what doesn't work.

Author

itati01 commented Jun 27, 2019

Sorry for the confusion. No, the problem is not solved for Python 3 on Windows. Python 2 on Windows works partly, and Python 3 on Linux works fully. On Windows, the Unicode issues seem to produce wrong but still valid file names in Python 3 (your example), and invalid folder names in both Python 2 and 3 (my examples).

Collaborator

jswhit commented Jun 27, 2019

OK, thanks for the clarification. Not having access to Windows I'm not sure where to go from here. One question that comes to mind is whether the same issue arises if you try to open a text file in Windows (independent of netcdf4-python)?

Collaborator

jswhit commented Jun 28, 2019

For reference

h5py/h5py#839

Not sure if this is related or not, but it's a nice discussion of the general problem.

Collaborator

jswhit commented Jun 29, 2019

Also

https://forum.hdfgroup.org/t/non-english-characters-in-hdf5-file-name/4627/3

Seems clear that unicode filenames are not fully supported in HDF5 on windows as of yet.

Author

itati01 commented Jul 1, 2019

Thanks for the link to the interesting discussion. So, let's hope that they fix this issue at some point. By the way, creating a folder with os.makedirs() and writing to a new text file with write() works as expected.
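
For anyone who wants to repeat that control experiment without netcdf4-python, a minimal sketch using only the standard library could look like this. The function name text_roundtrip and the file name aaa.txt are made up for the example; the folder name mirrors the "äää" case from earlier in the thread.

```python
import os
import tempfile


def text_roundtrip(base_dir, folder="äää", name="aaa.txt", text="hello"):
    """Create a non-ASCII folder, write a text file into it and read it back."""
    target_dir = os.path.join(base_dir, folder)
    os.makedirs(target_dir, exist_ok=True)
    target = os.path.join(target_dir, name)
    with open(target, "w", encoding="utf-8") as f:
        f.write(text)
    with open(target, encoding="utf-8") as f:
        return os.path.isdir(target_dir) and f.read() == text


# Run the check inside a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    roundtrip_ok = text_roundtrip(tmp)
```

If this succeeds while Dataset() fails on the same path, the problem lies below Python's own file handling, which matches the HDF5 discussion linked above.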
