Pre-existing build directory fix #8283

uranusjr · 2020-05-21T09:19:44Z

Fix for #8282. I’m not yet sure we can safely append the hex. It does make the test pass though.

The code here depends on the branch for #8282; it needs to be merged first.

uranusjr · 2020-05-21T09:27:20Z

Uh there’s a test ensuring PreviousBuildDirError is thrown and I’m not sure if this is a good sign or not…

pfmoore · 2020-05-22T06:59:11Z

I'm really uncomfortable about "just" adding a disambiguating string here. I'd rather we properly understood why we have two build directories with the same name first. I think it's probably a legitimate fix (in the new resolver I guess it's possible that we're looking at two sdists for the same project at the same time, just with different versions) but I think we should be clearer about what's going on before applying this.

pfmoore · 2020-05-22T12:17:08Z

OK tl;dr; version:

We don't guarantee the name used for the build directory, so I'd be fine with adding a UUID. Maybe if the directory with just the project name doesn't exist, use that so that in the normal case there's no change in behaviour. We can fix the tests that check for the error to instead confirm that the existing directory doesn't get overwritten (i.e., instead of asserting that pip fails, assert that pip succeeds and that the pre-created directory remains present and unchanged).
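The suggestion above can be sketched roughly as follows. This is a minimal illustration, not pip's actual code: the function name pick_build_dir and its arguments are hypothetical. It uses the bare project name when that directory is free and otherwise appends a random hex string, so a pre-existing directory is never reused or overwritten.

```python
import os
import tempfile
import uuid


def pick_build_dir(build_root, project_name):
    """Return a fresh build directory for project_name under build_root.

    Hypothetical helper: prefers the plain project name and falls back to
    a UUID-suffixed name when that directory already exists.
    """
    candidate = os.path.join(build_root, project_name)
    if os.path.exists(candidate):
        # Leave the existing directory alone and pick a unique sibling.
        candidate = os.path.join(
            build_root, "{}_{}".format(project_name, uuid.uuid4().hex)
        )
    os.makedirs(candidate)
    return candidate


# Usage: a pre-created directory stays present and unchanged.
root = tempfile.mkdtemp()
existing = os.path.join(root, "simple")
os.makedirs(existing)
with open(os.path.join(existing, "setup.py"), "w") as f:
    f.write("# left over from a previous build\n")

new_dir = pick_build_dir(root, "simple")
```

A test in the style described would then assert that new_dir differs from the pre-created directory and that the contents of the latter are untouched.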

Some random unrelated thoughts on the failing tests:

test_no_reuse_existing_build_dir: This explicitly tests the old resolver, so that's a thing. We probably need to hunt out cases like this and decide what to do about them 🙁
test_cleanup_prevented_upon_build_dir_exception: Why is this marked as a network test???

The remainder of this is rambling notes I made while investigating. They took some time to discover, so I haven't deleted them. But they are probably of marginal value if we all agree with the suggestion above.

I did some investigation. If --build is not set, we're building in a temporary location and build directories are cleared on failures. In those cases, I see no need for the build directory to be predictable. When --build is set, I think the expectation is that if there's a failure, the user can look at the code and try to debug. That's why we don't delete on failure, and it's where having a predictable name is important. The problem, clearly, is that the new resolver may potentially prepare multiple versions of the same project, and we're only using the project name as the directory.

The code in this area has changed. For unnamed requirements we use _temp_build_dir - and we used to do some games to later rename that directory. The code used to be in move_to_correct_build_directory, but that function got removed over a series of commits ending with #7405. So we no longer respect --build in this case. The fact that no-one complained about this change suggests to me that no-one actually cares that much about --build...

I suspect the whole of the mechanism that lets the user specify a build directory and try to diagnose build issues is only of use to a tiny proportion of users, and anyone with the expertise to do such a diagnosis is probably equally capable of manually building a wheel using the backend directly, so I think this is a prime candidate for deprecation on the basis of YAGNI.

pradyunsg · 2020-05-22T12:38:02Z

Why is this marked as a network test???

It's an OLD change, that had lots of tests marked as network. #2354

We can fix the tests that check for the error to instead confirm that the existing directory doesn't get overwritten (i.e., instead of asserting that pip fails, assert that pip succeeds and that the pre-created directory remains present and unchanged).

Sounds good to me! I thought I'd written a comment pointing toward this as well. I think I either dreamt that up or (more likely) forgot to post it because I was annoyed while trying to type up an explanation of why network requests during dependency resolution are a bad idea. :)

pradyunsg · 2020-05-22T12:42:39Z

test_no_reuse_existing_build_dir: This explicitly tests the old resolver, so that's a thing. We probably need to hunt out cases like this and decide what to do about them 🙁

This only happens in test_req.py. It's the only test module that imports from pip._internal.resolution.legacy (other than tests/unit/test_resolution_legacy_resolver.py).

Namely, we "only" need to update _basic_resolver in that module, to use the new resolver.

uranusjr · 2020-05-27T09:08:41Z

I am tempted to always use a UUID, at least in the new resolver, since I kind of feel it may improve pip’s concurrency situation (the “is there an existing directory” check can be a potential race condition). I think we can add a flag to prepare_linked_requirement() to switch between the two behaviours (allow multiple build directories, or raise PreviousBuildDirError).
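A minimal sketch of the flag idea, assuming a hypothetical helper ensure_build_dir rather than the real signature of prepare_linked_requirement(). With parallel_builds enabled, every call gets a unique directory name, so no separate existence check is needed; with it disabled, an existing directory raises an error. The PreviousBuildDirError class below is a local stand-in defined only for this example.

```python
import os
import tempfile
import uuid


class PreviousBuildDirError(Exception):
    """Local stand-in for pip's exception of the same name."""


def ensure_build_dir(build_root, name, parallel_builds=False):
    """Create and return a build directory for name (hypothetical helper)."""
    dir_name = name
    if parallel_builds:
        # A random suffix lets several builds of one project coexist.
        dir_name = "{}_{}".format(name, uuid.uuid4().hex)
    path = os.path.join(build_root, dir_name)
    try:
        # os.mkdir fails if the path already exists, so there is no gap
        # between checking for the directory and creating it.
        os.mkdir(path)
    except FileExistsError:
        raise PreviousBuildDirError(
            "pre-existing build directory ({})".format(path)
        )
    return path


root = tempfile.mkdtemp()
first = ensure_build_dir(root, "simple", parallel_builds=True)
second = ensure_build_dir(root, "simple", parallel_builds=True)
```

Here first and second are distinct directories, whereas two calls with parallel_builds=False would raise on the second call.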

One additional observation: Is this call path also used for PEP 517 packages? Because the logic to check whether there’s a previous build directory is setuptools-dependent:

pip/src/pip/_internal/operations/prepare.py

Lines 415 to 432 in 9024011

    
    with indent_log():
        # Since source_dir is only set for editable requirements.
        assert req.source_dir is None
        req.ensure_has_source_dir(self.build_dir, autodelete_unpacked)
        # If a checkout exists, it's unwise to keep going.  version
        # inconsistencies are logged later, but do not fail the
        # installation.
        # FIXME: this won't upgrade when there's an existing
        # package unpacked in `req.source_dir`
        if os.path.exists(os.path.join(req.source_dir, 'setup.py')):
            raise PreviousBuildDirError(
                "pip can't proceed with requirements '{}' due to a"
                " pre-existing build directory ({}). This is "
                "likely due to a previous installation that failed"
                ". pip is being responsible and not assuming it "
                "can delete this. Please delete it and try again."
                .format(req, req.source_dir)
            )
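To illustrate why this check is setuptools-specific: a leftover project that ships only a pyproject.toml would not be detected by a test for setup.py. The helper names below are hypothetical and exist only for this sketch; has_setup_py mirrors the condition in the quoted snippet.

```python
import os
import tempfile


def has_setup_py(source_dir):
    # Mirrors the condition used in the quoted snippet.
    return os.path.exists(os.path.join(source_dir, "setup.py"))


def make_project(marker_file):
    """Create a throwaway directory containing only marker_file."""
    path = tempfile.mkdtemp()
    with open(os.path.join(path, marker_file), "w") as f:
        f.write("")
    return path


setuptools_project = make_project("setup.py")
pep517_only_project = make_project("pyproject.toml")

print(has_setup_py(setuptools_project))   # True
print(has_setup_py(pep517_only_project))  # False: the leftover goes unnoticed
```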

pfmoore · 2020-05-27T09:18:30Z

I kind of feel it may improve pip’s concurrency situation

That's a fair point. On that basis I'm fine with using a UUID always.

One additional observation: Is this call path also used for PEP 517 packages?

Um, quite probably...? I'll take a look at this after lunch (I'm busy this morning on other stuff). This might need a wider review.

pfmoore · 2020-05-27T14:57:39Z

@pradyunsg see #8333

pfmoore

I'm guessing you chose the argument name parallel_builds for this because you expect it to also be useful for the work to add multithreading to pip? I don't have a problem with the name, but if it's not linked to the multithreading stuff, we're going to end up with another case of the same name being used for two different ideas. So I'm not asking for a change (I don't know of a better name) just flagging the point up.

Some of the tests have started failing (the Travis job has expected passes that now fail). These need addressing before we merge. (I know Travis is "not required", but my view is that's only if it's green but not reporting its status correctly.)

uranusjr · 2020-06-04T03:50:17Z

I'm guessing you chose the argument name parallel_builds for this because you expect it to also be useful for the work to add multithreading to pip? I don't have a problem with the name, but if it's not linked to the multithreading stuff, we're going to end up with another case of the same name being used for two different ideas. So I'm not asking for a change (I don't know of a better name) just flagging the point up.

Correct. I am not very satisfied with the name either, but have nothing better to offer 🙁

Some of the tests have started failing (the Travis job has expected passes that now fail). These need addressing before we merge. (I know Travis is "not required", but my view is that's only if it's green but not reporting its status correctly.)

It turns out there are tests checking whether the build directory ends with the package name, so appending a UUID breaks them. Arrgh why are things so difficult.

uranusjr · 2020-06-04T15:25:00Z

I pushed the following changes:

The parallel_builds parameter is now only applied to LinkCandidate, not EditableCandidate. The latter cannot backtrack anyway, so there’s no need to change the behaviour. I’m wondering whether I should extend this to the candidate held by ExplicitRequirement as well, but the candidate class does not currently have this information.
Tests that check for PreviousBuildDirError are changed, because the error is no longer expected to be raised in the new resolver.

pradyunsg · 2020-06-04T18:44:53Z

This looks like us picking up a whole bunch of technical debt. I'm on board for us doing this, since we have already run out of contract hours in some senses, for the new resolver work, and cleaning up the mess that is pip's build directory handling is too big for us right now.

I did want to flag this, just in case someone from the future comes and wonders why the heck we did something like this.

pfmoore · 2020-06-05T08:44:56Z

This looks like us picking up a whole bunch of technical debt.

I'd characterise it as paying interest on a long-standing technical debt. The build directory stuff has been a mess that's been around for ages, and the new resolver work can't be expected to fix all the problems in pip's code base. If we have to add nasty workarounds because of that existing issue, that's a shame, but inevitable.

A genuinely useful additional funding opportunity would be precisely to reduce technical debt in pip's codebase, without the constraints of a specific project deliverable like the new resolver :-)

pfmoore · 2020-06-09T14:53:46Z

@pradyunsg Gentle ping on this, I think it's just waiting for your approval.

pradyunsg

LGTM!

pradyunsg · 2020-06-09T15:05:19Z

without the constraints of a specific project deliverable

This would be awesome, but does anything in the world work like this? :o

uranusjr · 2020-06-09T15:27:38Z

Well-bounded refactoring without deliverables is generally accepted as a legitimate project in commercially-run companies with reasonable project management. It is not easily funded for pip, but that’s a problem in the OSS structure in general, not the idea of refactoring as a fundable project.

pfmoore · 2020-06-09T15:54:53Z

Looks like this failed the new lint checks. @pradyunsg I seem to remember seeing you were involved in that PR - do you know what the agreed process is for this? Was the idea that PRs that failed would need a follow-up PR to fix the new warnings?

pradyunsg · 2020-06-09T16:01:45Z

Yea, a follow up pr fixing them would be best.

deveshks · 2020-06-09T16:16:41Z

Yea, a follow up pr fixing them would be best.

I see two approaches to fixing, in follow-up PRs, the current lint failures caused by the offending line at

pip/src/pip/_internal/cli/cmdoptions.py

Line 838 in 7bf78f0

    setattr(always_unzip, 'deprecated', True)

1. Just add a "# noqa: B010" to that line, which can also be done as part of Fix "src/pip" to respect flake8-bugbear #8405
2. Remove the entire functionality of --always-unzip, which is a no-op. Remove --always-unzip #8408
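For context, flake8-bugbear's B010 flags setattr calls whose attribute name is a constant string, since they behave the same as a plain attribute assignment. A small sketch of the two forms follows; the function always_unzip_stub is a placeholder for illustration, not pip's real option object.

```python
def always_unzip_stub():
    """Placeholder standing in for the option object in cmdoptions.py."""


# Flagged by B010 unless suppressed with a noqa comment.
setattr(always_unzip_stub, "deprecated", True)  # noqa: B010
via_setattr = always_unzip_stub.deprecated

# Reset so the second form can be shown independently.
del always_unzip_stub.deprecated

# Equivalent plain assignment, which B010 prefers.
always_unzip_stub.deprecated = True
via_assignment = always_unzip_stub.deprecated
```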

pradyunsg · 2020-06-09T18:55:32Z

Let's go with 1. ^.^

deveshks · 2020-06-09T19:11:17Z

Let's go with 1. ^.^

Cool, I have made the required changes in #8405; we can go ahead and merge it to fix the lint failures.

uranusjr added the "skip news" label (does not need a NEWS file entry, e.g. trivial changes) May 21, 2020

uranusjr force-pushed the pre-existing-build-directory-fix branch 2 times, most recently from 03d25f1 to 9024011 May 21, 2020 09:23

uranusjr mentioned this pull request May 21, 2020

Add reprod for pre-existing build dir failure #8282

Merged

pfmoore mentioned this pull request May 27, 2020

Check in prepare_linked_requirement for an existing directory used by --src is setuptools-specific #8333

Closed

pradyunsg mentioned this pull request May 28, 2020

Improving new resolver output when backtracking choices #8346

Closed

Fix for source directory reuse

4ca684f

uranusjr force-pushed the pre-existing-build-directory-fix branch from 9024011 to 4ca684f June 3, 2020 16:38

uranusjr marked this pull request as ready for review June 3, 2020 17:14

uranusjr requested review from pfmoore and pradyunsg June 3, 2020 17:15

pfmoore requested changes Jun 3, 2020

uranusjr force-pushed the pre-existing-build-directory-fix branch from 137468a to 7b2c64c June 4, 2020 15:21

uranusjr added 2 commits June 4, 2020 23:26

Only attach UUID to build dir for spec candidates

09a7f27

These are the only cases where backtracking can happen. This approach also accounts for VCS requirements relying on the same ensure function to do cloning :/

Mark build dir tests as passing for new resolver

daf454b

The new resolver uses UUID to allow parallel build directories, so this error will no longer occur.

uranusjr force-pushed the pre-existing-build-directory-fix branch from 7b2c64c to daf454b June 4, 2020 15:26

pfmoore approved these changes Jun 5, 2020

pradyunsg approved these changes Jun 9, 2020

pfmoore merged commit 7bf78f0 into pypa:master Jun 9, 2020

uranusjr deleted the pre-existing-build-directory-fix branch June 10, 2020 06:38

github-actions bot locked as resolved and limited conversation to collaborators Oct 13, 2021
