
If git fetch fails wipe the directory and try again from scratch #1580

Merged
merged 3 commits into commercialhaskell:master on Jan 3, 2016

Conversation

ndmitchell
Contributor

This helps with #1418, certainly recovering from an artificial instance of the problem I was able to reproduce (just deleting the .git directory).

@quyse
Contributor

quyse commented Dec 30, 2015

git fetch may also fail if the remote server is temporarily down, or because of wrong network settings.

Honestly, I would even argue against this fix at all. The limited cache at Appveyor is a fairly rare and specific problem. I would say stack should have reasonable expectations about the underlying filesystem/hardware/etc., and should not contain "repairing" functionality for every possible case of a misbehaving system, as it may make things worse. For example, recursive deletion of a directory should be very carefully tested, so that it doesn't by chance delete the user's home directory or something because of wrong environment variables or spaces in the path, etc.

@ndmitchell
Contributor Author

The current code is non-atomic: if you Ctrl-C at the wrong point, your stack install is irrecoverably trashed unless you find a directory and delete it yourself. I don't think stack should ever get wedged because on-disk structures aren't what it expects. The fact that AppVeyor causes the disk structures to be incorrect is just one way this can go wrong, but it's far from the only way.

I agree recursive delete should be used carefully and tested (and I confirm I have tested it), but in general these things are tricky.

@quyse
Contributor

quyse commented Dec 30, 2015

I mean it's better when the user deals with unexpected breakage, rather than the tool itself. Probably the primary motivation for writing my comment above was the feeling that stack is already way too "intelligent": when you call stack path just to get the path to ghc, it may start downloading a snapshot. I do understand why it does this, but it's certainly not the "Unix way", and it makes stack harder to use in scripts, since it's difficult to say what sort of operations stack is going to perform for a specific command and what kind of things it may break. This fix, with its removal and total re-download of the package index, is an example of such intelligence, IMHO. I would say it's better to print a useful error message and let the user fix things, or provide a better system to operate on.

Anyway, I apologize if my previous comment sounded blunt; I didn't mean it that way. Thanks for all your efforts with stack!

@ndmitchell
Contributor Author

@quyse - no offense taken - it's important to have people push back, otherwise stupid ideas make it in. And my previous contributions to stack have been minimal, so the credit goes entirely to other people :)

For things like the presence or absence of GHC, which a user should be aware of, I agree that information querying commands that suddenly go off and magic the world around them are a bit scary (it has arguments both ways). Here I think it's a little more reasonable - it's a stack internal directory that no user should ever be aware exists, operating on a git bundle of the repo, when users shouldn't be aware git is being used like that behind the scenes. The only thing you could tell a user is "remove this directory and try again", and all the arguments about lack of network access etc mean you can never know if that's the right thing to suggest anyway.

Alternatively, some rsync like protocol where it just mirrored a directory, with no preconceived notion of the start or end point, would totally solve the entire issue. But I don't think that's on the cards at the moment.

PS. If my first comment sounded terse, it's because I tapped it out on a mobile while waiting for my son to fall asleep :)

@mgsloan
Contributor

mgsloan commented Dec 30, 2015

This change seems reasonable to me. It'd also be good to show the user the exception that causes the retry to be necessary. Also, how about factoring this out like this:

let cloneAndFetch = do ...
cloneAndFetch `C.catch` \(ex :: ReadProcessException) -> do
    $logWarn (T.pack (show ex))
    $logStickyDone "Failed to fetch package index, retrying."
    removeTree acfDir
    cloneAndFetch

Can you please update the PR with these changes? Thanks!

While it is reasonable that stack shouldn't try to work around every single failure case, we should try to gracefully handle the cases that happen in practice. Having the requisite intelligence to make stuff "just work" is part of what makes it excellent! Since this folder is an implementation detail, deleting it and starting from scratch is perfectly reasonable.

@kadoban
Collaborator

kadoban commented Dec 30, 2015

I did some thinking about this before, coming from a different place, and my tentative conclusion was: for a perfect answer we need to use a git library that can more easily tell why the command failed. If the local repo is corrupt, yeah it should be wiped. If the network connection didn't succeed it might make sense just to retry instead, without nuking the local repo. If something else failed, there's probably no point in trying again automatically, etc.

Of course, "perfect" isn't always a good thing to wait for, and I'm unsure whether this is better than the current behavior or not. I suspect it is better in general?
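The decision tree described above could be sketched roughly like this (a hypothetical sketch, not stack's actual code; the stderr substrings matched are illustrative guesses, and real git output varies across versions):

```haskell
import Data.List (isInfixOf)

-- Possible interpretations of a failed git command.
data GitFailure = CorruptRepo | NetworkFailure | OtherFailure
  deriving (Eq, Show)

-- What to do for each kind of failure: wipe on corruption, retry on
-- a network error, give up otherwise.
data Recovery = WipeAndReclone | RetryFetch | GiveUp
  deriving (Eq, Show)

-- Classify a failure by inspecting git's stderr output. The matched
-- substrings are illustrative, not exhaustive.
classifyGitFailure :: String -> GitFailure
classifyGitFailure err
  | matches ["not a git repository", "index file corrupt", "bad object"] = CorruptRepo
  | matches ["Could not resolve host", "Connection timed out", "unable to access"] = NetworkFailure
  | otherwise = OtherFailure
  where matches = any (`isInfixOf` err)

recover :: GitFailure -> Recovery
recover CorruptRepo    = WipeAndReclone
recover NetworkFailure = RetryFetch
recover OtherFailure   = GiveUp
```

The key point is that only the corruption case warrants the destructive wipe; a transient network failure should leave the existing repo in place.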

@mgsloan
Contributor

mgsloan commented Dec 30, 2015

That's a good point, @kadoban! We could try to parse out different git error messages. It seems reasonable to me to detect the case where it's a network failure, and avoid deleting the repo. We'd also want to detect this case in order to support a general "--offline" mode for stack, as alluded to here, in the context of stack new.

We could also use gitlib's bindings to libgit2, but I'd rather avoid depending on an additional C library.

@ndmitchell
Contributor Author

@mgsloan - the issue with putting the retry there is that if there is no repo and we git clone, and that fails, we'll retry the git clone, which seems incorrect. Parsing error messages from other people's programs scares me, especially git, which scares me independently. Adding the logging seems like a good idea.
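The shape being discussed might look like this (a hypothetical sketch with made-up names, not the PR's actual code; gitClone and gitFetch stand in for the real process invocations). The first attempt fetches into an existing directory or clones a fresh one; on failure it wipes the directory and clones once from scratch, so a failing fresh clone propagates rather than looping:

```haskell
{-# LANGUAGE ScopedTypeVariables #-}
import Control.Exception (SomeException, catch)
import Control.Monad (when)
import System.Directory (doesDirectoryExist, removeDirectoryRecursive)

-- Hypothetical sketch: update a cached git repo, recovering from a
-- failed fetch by wiping the cache and cloning from scratch.
updateIndexRepo :: (FilePath -> IO ())  -- ^ stand-in for git clone
                -> (FilePath -> IO ())  -- ^ stand-in for git fetch
                -> FilePath -> IO ()
updateIndexRepo gitClone gitFetch dir = do
  exists <- doesDirectoryExist dir
  let firstTry = if exists then gitFetch dir else gitClone dir
  firstTry `catch` \(e :: SomeException) -> do
    -- Log the exception, wipe the possibly-corrupt cache, and clone
    -- from scratch. A failure of this clone propagates; the fresh
    -- clone itself is never retried.
    putStrLn ("git failed, retrying from scratch: " ++ show e)
    stillThere <- doesDirectoryExist dir
    when stillThere (removeDirectoryRecursive dir)
    gitClone dir
```

The structural point is that the retry handler always ends in a clone, never in another fetch, and the handler is not wrapped around itself.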

@ndmitchell
Contributor Author

I now log the exception. @mgsloan, I'm happy to shove in your version if you prefer, but it does have a few downsides. If the code duplication is excessive, I can shove the two commands into a let, but given that they call different command-line functions in the initial attempt and the retry, it's not very neat either way.

@ndmitchell
Contributor Author

(I note while going through the process functions that they are all a bit weirdly inconsistent - with the working directory either a Maybe, or required, and the arguments in different orders. I leave that to you though.)

@mgsloan
Contributor

mgsloan commented Jan 3, 2016

LGTM, thanks!

(I note while going through the process functions that they are all a bit weirdly inconsistent - with the working directory either a Maybe, or required, and the arguments in different orders. I leave that to you though.)

Agreed. The functions there have been created and modified as required by the uses in stack, which is one reason it's still in the stack source tree. It'd be cool to generalize + clean it up + release it as a new package.

mgsloan added a commit that referenced this pull request Jan 3, 2016
If git fetch fails wipe the directory and try again from scratch
@mgsloan mgsloan merged commit ecb056c into commercialhaskell:master Jan 3, 2016