pathlib.glob() / rglob() / match() shall behave case-sensitive under MacOs #167

Closed
Quibi opened this issue Mar 12, 2017 · 11 comments

@Quibi

Quibi commented Mar 12, 2017

I really don't know if this is a bug or an intended behaviour.

I was testing the glob() method of the fake pathlib with the pattern "*.JPg" on my Mac. These tests failed because glob() also returns files such as *.jpg or *.JPG (and it shouldn't).

In my opinion, the ultimate cause of these failures is that pyfakefs assumes MacOs only uses a case-insensitive file system.

pyfakefs performs a simple check on fake_filesystem.py:533

self.is_case_sensitive = sys.platform not in ['win32', 'cygwin', 'darwin']

which evaluates to False because MacOs is, of course, a darwin system.
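To make the effect of that check concrete, here is a minimal sketch that mirrors the quoted expression. The helper name `default_is_case_sensitive` and the constant name are hypothetical and not part of pyfakefs; only the platform list comes from the line quoted above.

```python
import sys

# Platforms that the quoted check treats as case-insensitive.
CASE_INSENSITIVE_PLATFORMS = ['win32', 'cygwin', 'darwin']


def default_is_case_sensitive(platform=None):
    """Hypothetical helper mirroring the quoted check; not a pyfakefs API."""
    if platform is None:
        platform = sys.platform
    return platform not in CASE_INSENSITIVE_PLATFORMS


if __name__ == '__main__':
    for name in ('linux', 'darwin', 'win32'):
        # The decision depends only on the platform name,
        # not on the file system that is actually mounted.
        print(name, default_is_case_sensitive(name))
```

Note that a check like this cannot see how a particular volume is formatted, which is the concern raised here.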

However, as far as I know, a MacOs file system can be case sensitive or case insensitive depending on how the disks are formatted. Since Mac OS X 10.3 (late 2003) you can use HFS+ with case-sensitive file names.

I tried this snippet on my MacOs Sierra 10.12 to test case sensitivity (taken from :

os.path.normcase('A') == os.path.normcase('a')

which evaluates to False, meaning that the system is case sensitive (as long as the file names 'A' and 'a' are not the same). I can't test this on a different OS or on different MacOs file systems.

After changing

filesystem.is_case_sensitive = True

in my tests, they all passed, and the results of fake_pathlib.Path.glob() and the real pathlib.glob() were equal on my MacOs.

If this is not the intended behaviour, I think it should be changed in pyfakefs.

Thanks a lot.

@Quibi Quibi changed the title MacOs can be case sensitive but pyfakefs thinks it can be only insensitive MacOs can also be case sensitive but pyfakefs thinks it can be only insensitive Mar 12, 2017
@mrbean-bremen
Member

mrbean-bremen commented Mar 12, 2017

@Quibi : Well, you are right, of course: MacOS supports both case-insensitive and case-sensitive file systems (it supports the case-sensitive UFS, for example). However, the default is HFS+ in its case-insensitive but case-preserving form, and as far as I know this is used in the vast majority of cases. That is why MacOS defaults to case-insensitive.
That being said, there may be cases where Python under MacOS behaves differently than under Windows regarding case, as it is generally handled as a Posix system, as opposed to Windows (maybe glob is such a case). I don't have a MacOS system to test this.
@jmcgeheeiv, could you please test the os.path.normcase behavior under your MacOS installation? Maybe we shall handle the case differently for MacOS.
And @Quibi : are you sure you are using a case-sensitive file system? E.g. are you able to create two files whose names differ only by case (e.g. Makefile and makefile)?
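The Makefile/makefile check suggested above can also be scripted. This is only a sketch: the function name `directory_is_case_sensitive` is made up for illustration, and it probes the file system that holds the temporary directory, which may differ from other mounted volumes.

```python
import os
import tempfile


def directory_is_case_sensitive(directory=None):
    """Return True if 'Makefile' and 'makefile' are distinct names.

    Hypothetical helper: creates one file in a scratch directory and
    checks whether the same name in different case also resolves.
    """
    with tempfile.TemporaryDirectory(dir=directory) as tmp:
        upper = os.path.join(tmp, 'Makefile')
        lower = os.path.join(tmp, 'makefile')
        with open(upper, 'w'):
            pass  # an empty file is enough for the probe
        # On a case-insensitive file system the lower-case name
        # resolves to the file that was just created.
        return not os.path.exists(lower)


if __name__ == '__main__':
    print(directory_is_case_sensitive())
```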

@Quibi
Author

Quibi commented Mar 12, 2017

@mrbean-bremen: Ooouh. That's completely surprising to me. My MacOS is case-insensitive, as you stated. The OS doesn't pass your Makefile test. Moreover, I can see in Disk Utility that the installed file system is case insensitive.

So, what's happening? Why do the tests fail, and why do the results of pathlib.Path.glob() differ between the fake pyfakefs file system and the real one? The only explanation I had found was that my file system was case sensitive, but it isn't.

It's also surprising that os.path.normcase('A') == os.path.normcase('a') returns False if my file system is case-insensitive.

Probably the underlying code is not MacOS aware (in both the real os.path and pathlib packages), and checking normcase doesn't work as I expected.
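For reference, a short sketch of what normcase does in the two standard-library path modules. On MacOS, os.path is the POSIX implementation, whose normcase returns its argument unchanged, so the comparison says nothing about the mounted file system. The variable names are just for illustration.

```python
import ntpath
import posixpath

# posixpath.normcase returns the path unchanged, so 'A' and 'a' stay distinct.
posix_equal = posixpath.normcase('A') == posixpath.normcase('a')

# ntpath.normcase lower-cases the path, so 'A' and 'a' compare equal.
windows_equal = ntpath.normcase('A') == ntpath.normcase('a')

print(posix_equal)    # False
print(windows_equal)  # True
```

Both modules can be imported on any platform, so the comparison can be reproduced without a Mac.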

Whatever the reason, I think that Python is case sensitive in this regard. That is why my tests pass when filesystem.is_case_sensitive = True despite having a case-insensitive fs. More importantly, glob.glob() and pathlib.Path.glob() work in similar ways (but not in exactly the same manner) when case insensitivity is used.

@mrbean-bremen
Member

Ok, thanks! I suspected something like this. Can you provide some of the tests that are failing for you under MacOS? That would help us to fix the problem. Just setting is_case_sensitive to True would fail other tests, as the file system is not really case-sensitive.

@Quibi
Author

Quibi commented Mar 12, 2017

I have been thinking a little more about what case-insensitive and case-preserving could mean for glob searches on Macs. I think it is closely related to the difference between actual paths and possible (or pure) paths.

Since MacOS is case insensitive, two items (files or folders) can't have the same case insensitive actual path:

/a and /A can't exist at the same time in the actual file system.

But since the system is case preserving, both paths are not the same. Potentially both can exist (not actually, but potentially) and both are possible and different.

For a case-preserving file system, having a file named a in its database is not the same as having a file named A (while for a purely case-insensitive file system a and A are exactly the same).

Hence os.path.normcase('A') == os.path.normcase('a') is False because it compares two different pure paths. They don't both exist in the file system (one can, but not both), but if they could, they would be different paths (because the file system distinguishes between upper and lowercase).

That implies that patterns on a MacOs file system must be case sensitive, since *.a and *.A don't match the same file names.

In a folder containing two files called x.A and y.a, a glob search for *.a returns y.a. That's what MacOS does because x.a, which is a different file from x.A, doesn't exist in this folder.
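The x.A / y.a example can be reproduced on a plain list of names with case-sensitive pattern matching. This sketch uses fnmatch.fnmatchcase from the standard library purely as an illustration; it is not a claim about how pathlib implements globbing internally.

```python
import fnmatch

# Directory listing from the example above, as stored (case preserved).
names = ['x.A', 'y.a']

# fnmatchcase compares case-sensitively on every platform.
matches = [name for name in names if fnmatch.fnmatchcase(name, '*.a')]

print(matches)  # ['y.a']
```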

From a different point of view, we can think about how the shell deals with this issue. When we retrieve a list of files from the command line (for instance, using ls), that list is case-sensitive on MacOS. However, when we pass a file name to a command, the file name is case-insensitive (for instance, ls /library works the same as ls /LibraRy). glob in Python behaves the same way.

Is this right or wrong? Could this be a reasonable explanation?

@mrbean-bremen I will provide a simple failing test tomorrow since it is late at night here. Sorry.
is_case_sensitive is only changed in the glob tests and reverted to its previous value after every test in order to avoid side effects.
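One way to express that change-and-revert pattern is a small context manager. This is a sketch under stated assumptions: `temporary_attribute` is a hypothetical helper, and the `SimpleNamespace` object merely stands in for the fake filesystem so that the example runs without pyfakefs installed.

```python
from contextlib import contextmanager
from types import SimpleNamespace


@contextmanager
def temporary_attribute(obj, name, value):
    """Set obj.name to value, then restore the previous value on exit."""
    previous = getattr(obj, name)
    setattr(obj, name, value)
    try:
        yield obj
    finally:
        # Restore even if the test body raises, to avoid side effects.
        setattr(obj, name, previous)


# Stand-in for the fake filesystem object (assumption, not pyfakefs itself).
filesystem = SimpleNamespace(is_case_sensitive=False)

with temporary_attribute(filesystem, 'is_case_sensitive', True):
    inside = filesystem.is_case_sensitive

after = filesystem.is_case_sensitive
print(inside, after)  # True False
```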

@Quibi
Author

Quibi commented Mar 13, 2017

I have created a (wordy) test at https://gist.github.com/Quibi/4b5fefe2c816fda96660a2750a117ab1

It raises an exception on my MacOS Sierra:

glob('*.jpg') returns: [PosixPath('/a.jpg'), PosixPath('/b.JPG')]
but should return: [PosixPath('/a.jpg')]

Traceback (most recent call last):
  File "/Users/jv/CAIXON/PROGRAMANDO/python_project_tools/proba.py", line 103, in <module>
    assert list(result) == [path_a]
AssertionError

@mrbean-bremen
Member

Ok, thanks - I will look at this in the evening. If I understand correctly, the problem appears only with pathlib.glob(), right?

@Quibi
Author

Quibi commented Mar 13, 2017

All methods that use glob patterns: pathlib.glob(), rglob(), and match().

I have included match() in the test and rerun it.

Now the output is:

glob('*.jpg') returns: [PosixPath('/a.jpg'), PosixPath('/b.JPG')]
but should return: [PosixPath('/a.jpg')]


glob('*.JPG') returns: [PosixPath('/a.jpg'), PosixPath('/b.JPG')]
but should return: [PosixPath('/b.JPG')]


glob('*.jPg') returns: [PosixPath('/a.jpg'), PosixPath('/b.JPG')]
but should return: []


'a.jpg'.match('*.jPg') returns: True
but should return: False


'b.JPG'.match('*.jPg') returns: True
but should return: False

Basically, the test:

  1. Checks that the file system is case-sensitive and case-preserving MacOS.
  2. Tests the results returned by (real) pathlib.glob and pathlib.match using a temporary folder.
  3. Tests the results returned by fake_pathlib.glob and pathlib.match and prints information about failing assertions.
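The real-pathlib half of that comparison (step 2) can be sketched as follows. This is not the code from the gist; `glob_names` is a hypothetical helper, and the comments describe the behaviour on POSIX systems as reported in this thread (on Windows the real pathlib matches case-insensitively).

```python
import pathlib
import tempfile


def glob_names(pattern):
    """Create a.jpg and b.JPG in a scratch folder and glob it with the real pathlib."""
    with tempfile.TemporaryDirectory() as tmp:
        root = pathlib.Path(tmp)
        for name in ('a.jpg', 'b.JPG'):
            (root / name).touch()
        # Collect the names before the folder is removed.
        return sorted(path.name for path in root.glob(pattern))


if __name__ == '__main__':
    for pattern in ('*.jpg', '*.JPG', '*.jPg'):
        # On POSIX systems each pattern only matches names with the same case.
        print(pattern, glob_names(pattern))
```

Running the same patterns against the fake file system and comparing the two lists is what exposes the mismatch shown in the output above.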

Thanks a lot.

mrbean-bremen added a commit to mrbean-bremen/pyfakefs that referenced this issue Mar 13, 2017
- make it depend on posic vs. windows instead of case sensitivity of
file system
- see pytest-dev#167
@mrbean-bremen mrbean-bremen changed the title MacOs can also be case sensitive but pyfakefs thinks it can be only insensitive pathlib.glob() / rglob() / match() shall behave case-sensitive under MacOs Mar 13, 2017
@mrbean-bremen
Member

I think I understood the problem and put a simple fix into a PR - @Quibi , @jmcgeheeiv : can you please check if this resolves the problem under MacOS?

@mrbean-bremen
Member

mrbean-bremen commented Mar 13, 2017

@Quibi : About the reason for this behavior - your explanation is as good as any. I think this is simply a matter of MacOS being a Posix system (inherited from BSD), so it behaves more like BSD in that respect, even if the default filesystem behavior with regard to case (case-insensitive and case-preserving) is the same as under Windows. Pathlib only differentiates between PosixPath and WindowsPath, so this must have been a design decision for pathlib.
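The Posix/Windows split can be seen directly with the pure path classes, which can be instantiated on any platform. A minimal sketch using the pattern from the failing test:

```python
from pathlib import PurePosixPath, PureWindowsPath

# Posix flavour: pattern matching is case-sensitive.
posix_result = PurePosixPath('a.jpg').match('*.jPg')

# Windows flavour: pattern matching ignores case.
windows_result = PureWindowsPath('a.jpg').match('*.jPg')

print(posix_result)    # False
print(windows_result)  # True
```

Since MacOS uses the Posix flavour, match() there follows the first result regardless of how the volume is formatted.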

@Quibi
Author

Quibi commented Mar 14, 2017

Nice solution. It works smoothly. Congratulations and thanks for your help.

@mrbean-bremen
Member

Thanks for the report!
