Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove tests.lib.path #6050

Closed
pradyunsg opened this issue Nov 29, 2018 · 37 comments
Closed

Remove tests.lib.path #6050

pradyunsg opened this issue Nov 29, 2018 · 37 comments
Labels
C: tests Testing and related things type: refactor Refactoring code

Comments

@pradyunsg
Copy link
Member

File: https://github.com/pypa/pip/blob/master/tests/lib/path.py

There exist better alternatives to rolling out our own "Path" class for file handling -- the stdlib pathlib.Path and pathlib2.Path (backport to Py2) should be more than sufficient to satisfy our needs now.

if sys.python_version_info[0] >= 3:
    from pathlib import Path
else:
    from pathlib2 import Path

To be done:

  • Confirm that the pathlib equivalent is indeed enough for our needs.
    • Figure out the relevant workarounds if needed.
  • Drop this file and just directly perform the imports.
@pradyunsg pradyunsg changed the title Remove Remove tests.lib.path Nov 29, 2018
@pradyunsg pradyunsg added C: tests Testing and related things type: refactor Refactoring code labels Nov 29, 2018
@pradyunsg pradyunsg added this to the Internal Cleansing milestone Nov 29, 2018
@chrahunt
Copy link
Member

A few issues:

  1. pathlib2 appears to have an issue on Windows with unicode filenames (Unusable on Windows/Python2.7 with unicode paths jazzband/pathlib2#56) which may need to be worked around.
  2. tests.lib.path.Path inheriting from str/unicode will require the most effort to overcome. We will also end up needing something like to_fs_path_type(path) in a lot of tests for Python 2 compatibility.
  3. PipTestEnvironment sets its *_path members to Path objects, but the base class expects objects that it can pass to os.path.* APIs
  4. For a result = script.run('example', 'command') (PipTestResult), we do a lot of path in result.files_created and path not in result.files_created (in
    assert Path('scratch') / 'pip_downloads' / 'INITools-0.1.tar.gz' \
    for example). The not in case is very prone to error since it would pass unexpectedly if we forget to convert the Path object to a string. We could have a dedicated method, PipTestResult.created_file(path), for checking, but need to remember to always use it.

Number 4 and similar issues with comparisons seem like they would be the biggest route for regressions to go unnoticed or invalid tests to be introduced in the future.

@chrahunt
Copy link
Member

chrahunt commented Jul 1, 2019

Here's a mapping from the current class to equivalent functionality in pathlib.Path or elsewhere:

Done Current New Note PR
+ deferred
✔️ abspath resolve() abspath #7156
✔️ - os.path.relpath() #7151
✔️ / / no change
✔️ // unused, removed #6665
✔️ normpath os.path.normpath() normpath #6747
✔️ rmtree() shutil.rmtree() #6742
✔️ copy() shutil.copy() #6746
✔️ copytree() shutil.copytree() #6743
✔️ move() shutil.move() #6745
✔️ glob() glob() no change
✔️ name name no change
✔️ read_text() read_text() no change
✔️ rename() rename() no change
✔️ rmdir() rmdir() no change
✔️ makedirs() mkdir(parents=True) #6701
✔️ touch() touch() touch #6700
✔️ exists exists() #6670
✔️ ext suffix #6670
✔️ folder parent #6670
✔️ join() joinpath() #6670
✔️ namebase stem #6670
✔️ realpath resolve() #6670
✔️ remove() unlink() #6670
✔️ rm() unlink() #6670
✔️ write() write_text() #6670
✔️ atime unused, removed #6665
✔️ ctime unused, removed #6665
✔️ mtime unused, removed #6665
✔️ noext unused, removed #6665
✔️ normcase unused, removed #6665
✔️ renames() unused, removed #6665
✔️ supports_unicode() unused, removed #6665
✔️ walk() unused, removed #6665
✔️ mkdir() mkdir() mkdir #6888
  • abspath: absolute() is undocumented (see bpo-29688) - resolve() is not equivalent since it resolves symlinks
  • normpath: the path itself removes spurious double-slashes and . but not .., only resolve() does it but it also resolves symlinks
  • mkdir: needs exist_ok parameter instead of assuming it is OK, and should not return self
  • touch: the current function takes a times parameter which is not in pathlib.Path but it is not used anywhere

@pradyunsg
Copy link
Member Author

@chrahunt I took the liberty of updating your comment above with #6670 items being marked as merged and moving things around to be:

  • all operators at top
  • done items at end of "section" (operators + methods/attributes)
  • each PR's items sorted by name

@pradyunsg
Copy link
Member Author

the current function takes a times parameter which is not in pathlib.Path but it is not used anywhere

Filed #6700 for removing this.

@pradyunsg
Copy link
Member Author

needs exist_ok parameter instead of assuming it is OK

This only needs to be added to tests that would fail on directly switching this behavior.

And, in those cases, I feel we should blanket-replace path.mkdir with assert path.exists() anyway.

@pradyunsg
Copy link
Member Author

#6701 gets rid of makedirs. :)

@chrahunt
Copy link
Member

This only needs to be added to tests that would fail on directly switching this behavior.

The note was more regarding the functionality that should be added to the existing function to make it compatible, so we don't need to think about it when making the final switch. I assume we make the change, then add the assertion as you suggest for the tests that fail.

@pradyunsg
Copy link
Member Author

pradyunsg commented Oct 7, 2019

abspath

Looking at our uses of abspath (searching tests/ for (?<!os.path)\.abspath\b), ISTM that we can move over resolve() for our uses. We're not calling things in any way that we'd get a different result from path.absolute vs path.resolve AFAICT.

To make sure my hypothesis was correct, I ran the tests with the following patch and everything passed. :)

diff --git a/tests/lib/path.py b/tests/lib/path.py
index b2676a2e..8c2384a1 100644
--- a/tests/lib/path.py
+++ b/tests/lib/path.py
@@ -129,6 +129,9 @@ class Path(_base):
         """
         './a/bc.d' -> '/home/a/bc.d'
         """
+        import pathlib
+        assert pathlib.Path(self).resolve() == pathlib.Path(self).absolute()
+
         return Path(os.path.abspath(self))
 
     def resolve(self)

@pradyunsg
Copy link
Member Author

Working on a PR to remove the - operator.

@chrahunt
Copy link
Member

chrahunt commented Oct 7, 2019

The benefit of abspath would be fewer filesystem calls, but for our use it's probably not significant.

@pradyunsg
Copy link
Member Author

@chrahunt Yep. I found "13 results in 6 files", which isn't much overhead, given that we're spawning subprocesses and doing network requests. :)

@deveshks
Copy link
Contributor

deveshks commented May 19, 2020

Hi @pradyunsg

Now that @gutsytechster in working on #7869 to divide it into smaller PRs (1 PR just for helper functions, and maybe 2-3 PRs to replace the asserts with the helper functions). How would I go about then to work on this issue from another angle, perhaps breaking down #7860 in smaller PRs too.

I am guessing once the helper function gets merged in, it's a good idea to take a few test files at a time per PR, replace the usage of tests.lib.path.Path to pathlib(2).Path, and incrementally fix the tests, until all the tests stop using tests.lib.path.Path ?

Any other suggestions on the approach to proceed with this issue is also welcome.

@pradyunsg
Copy link
Member Author

@deveshks take a look here, where @chrahunt talks about what-to-do-next here (if you haven't seen it already) made a list of next-steps for this and talks about the general approach to take here.

While the aforementioned PR (#7689) is aimed at solving 4, we'd still want to work on 2 and 3 prior to going in and replacing the test suite's usage to pathlib.

@deveshks
Copy link
Contributor

Thanks for this, perhaps I can start by tackling 3, but I am not very clear on what changes need to be done there?

What I got from the description was that since the *_path members of PipTestEnvironment, are instantiated using Path class constructors, which in turn uses os.path.* methods, we can replace the Path object entirely, and directly work on these members using those os.path.* methods.

I tried something like this with base_path, and all the tests passed. Is this what we want to do with other variables in there?

diff --git a/tests/lib/__init__.py b/tests/lib/__init__.py
index f51cce1e..b3a6ec66 100644
--- a/tests/lib/__init__.py
+++ b/tests/lib/__init__.py
@@ -433,8 +433,6 @@ class PipTestEnvironment(TestFileEnvironment):
     verbose = False
 
     def __init__(self, base_path, *args, **kwargs):
-        # Make our base_path a test.lib.path.Path object
-        base_path = Path(base_path)
 
         # Store paths related to the virtual environment
         venv = kwargs.pop("virtualenv")
@@ -462,8 +460,8 @@ class PipTestEnvironment(TestFileEnvironment):
             )
 
         # Create a Directory to use as a scratch pad
-        self.scratch_path = base_path.joinpath("scratch")
-        self.scratch_path.mkdir()
+        self.scratch_path = os.path.join(base_path, "scratch")
+        os.mkdir(self.scratch_path)
 
         # Set our default working directory
         kwargs.setdefault("cwd", self.scratch_path)

@pradyunsg
Copy link
Member Author

Changes like that look good to me. Not everything is gonna be as simple as scratch_path though. :)

I'm pretty sure you'll need to test on all supported versions since path-related stuff has had some incremental changes in the past few Python releases IIRC.

@deveshks
Copy link
Contributor

Changes like that look good to me. Not everything is gonna be as simple as scratch_path though. :)

But this means that we got rid of a functionality supported by pathib.Path and replaced it with os.path.*. Is that the end goal of point 3, or is it deciding the right mix of where can we use os.path.*, and where can we leave it unchanged.

I'm pretty sure you'll need to test on all supported versions since path-related stuff has had some incremental changes in the past few Python releases IIRC.

Agreed, I guess the only way to do that is to let the tests run through our CI?

@pradyunsg
Copy link
Member Author

Agreed, I guess the only way to do that is to let the tests run through our CI?

Or... you could run tests on all platforms, across multiple Python versions locally. :P

@deveshks
Copy link
Contributor

Or... you could run tests on all platforms, across multiple Python versions locally. :P

Unfortunately I only have access to a Mac.

I was also interested in knowing if the end goal of 3 is to get rid of a functionality supported by pathib(2).Path and replaced it with os.path.*, or is it deciding the right mix of where can we use os.path.*, and where can we leave it unchanged?

@pradyunsg
Copy link
Member Author

is it deciding the right mix of where can we use os.path.*, and where can we leave it unchanged?

Honestly, I'm not a 100% sure myself what the answer is, so I'm picking the option of "we need to figure it out". :)

@deveshks
Copy link
Contributor

Honestly, I'm not a 100% sure myself what the answer is, so I'm picking the option of "we need to figure it out". :)

No worries, let me get the ball rolling on 3 via a PR, I will first aim to just contain the changes in tests/lib/* without changing the tests itself, and see how far we can go from there to "figuring it out" :)

@pradyunsg
Copy link
Member Author

@ssurbhi560 @gutsytechster Thank you for tackling this together, collaborating on who-does-what, and creating some really nicely scoped PRs that were easy to review! ^>^

I've gone ahead and reviewed all of them (I think?). Let me know if I've missed any. I'll wait a couple of days, in case someone else wants to review them / y'all want to address any of the non-blocker comments on the PRs eagerly. :)

@sbidoul
Copy link
Member

sbidoul commented Jun 9, 2020

Thank you @gutsytechster @ssurbhi560

@hexagonrecursion
Copy link
Contributor

https://github.com/mcmtroffaes/pathlib2/blob/develop/README.rst
As of January 1 2020, this repository will no longer receive any further updates, as Python 2 is no longer supported.

If this work continues someone will have to step in and maintain pathlib2 until pip too drops support for python2

@pfmoore
Copy link
Member

pfmoore commented Feb 9, 2021

@hexagonrecursion Pip has dropped support of Python 2, so I'm not sure what you're suggesting here...

@pradyunsg
Copy link
Member Author

Well, appreciate the reminder.

#9578 filed, and I'll wrap this up there. There's a couple of loose ends, but should be easy enough for a weekend morning. :)

@uranusjr
Copy link
Member

This is done

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Jul 13, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
C: tests Testing and related things type: refactor Refactoring code
Projects
None yet
Development

Successfully merging a pull request may close this issue.

9 participants