Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

redownload pkgs when source hash verification fails #8500

Merged
merged 10 commits into from
Dec 27, 2022
38 changes: 35 additions & 3 deletions cabal-install/src/Distribution/Client/FetchUtils.hs
Original file line number Diff line number Diff line change
Expand Up @@ -11,7 +11,7 @@
--
-- Functions for fetching packages
-----------------------------------------------------------------------------
{-# LANGUAGE RecordWildCards #-}
{-# LANGUAGE RecordWildCards, ScopedTypeVariables #-}
module Distribution.Client.FetchUtils (

-- * fetching packages
Expand All @@ -22,6 +22,7 @@ module Distribution.Client.FetchUtils (
-- ** specifically for repo packages
checkRepoTarballFetched,
fetchRepoTarball,
verifyFetchedTarball,

-- ** fetching packages asynchronously
asyncFetchPackages,
Expand All @@ -43,7 +44,7 @@ import Distribution.Client.HttpUtils
import Distribution.Package
( PackageId, packageName, packageVersion )
import Distribution.Simple.Utils
( notice, info, debug, die' )
( notice, info, debug, warn, die' )
import Distribution.Verbosity
( verboseUnmarkOutput )
import Distribution.Client.GlobalFlags
Expand All @@ -56,7 +57,8 @@ import qualified Control.Exception.Safe as Safe
import Control.Concurrent.Async
import Control.Concurrent.MVar
import System.Directory
( doesFileExist, createDirectoryIfMissing, getTemporaryDirectory )
( doesFileExist, createDirectoryIfMissing, getTemporaryDirectory
, getFileSize )
import System.IO
( openTempFile, hClose )
import System.FilePath
Expand All @@ -67,6 +69,8 @@ import Network.URI
( URI(uriPath) )

import qualified Hackage.Security.Client as Sec
import qualified Hackage.Security.Util.Path as Sec
import qualified Hackage.Security.Util.Checked as Sec

-- ------------------------------------------------------------
-- * Actually fetch things
Expand Down Expand Up @@ -118,6 +122,34 @@ checkRepoTarballFetched repo pkgid = do
then return (Just file)
else return Nothing

verifyFetchedTarball :: Verbosity -> RepoContext -> Repo -> PackageId -> IO Bool
verifyFetchedTarball verbosity repoCtxt repo pkgid =
let file = packageFile repo pkgid
handleError :: IO Bool -> IO Bool
handleError act = do
res <- Safe.try act
case res of
Left e -> warn verbosity ("Error verifying fetched tarball " ++ file ++ ", will redownload: " ++ show (e :: SomeException)) >> pure False
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this error message triggered if file simply does not exist?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if it is, it's pretty urgent to fix, since it's causing a warning spam: #8571 (comment)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, this function verifyFetchedTarball is called only in one place in the code, for all elements of repoTarballPkgsWithMetadataUnvalidated, which seems to contain ids of any packages that come from a secure repo, regardless of whether they were ever downloaded:

https://github.com/haskell/cabal/pull/8500/files#diff-2adae381a2eb2d8c804d5396694a431e1e3babd816a5d2408a62213cf774a893R929-R935

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A solution would be to move the partitioning wrt if a package is already downloaded (the block containing checkRepoTarballFetched) earlier and apply it to allPkgLocations instead of repoTarballPkgsWithoutMetadata. Only packages that are already downloaded would then be checked for RepoSecure and then verified. Does that make sense?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or, better, avoiding some filesystem IO, partition repoTarballPkgsWithMetadataUnvalidated wrt if a package is already downloaded.

Right b -> pure b
in handleError $ case repo of
-- a secure repo has hashes we can compare against to confirm this is the correct file.
RepoSecure{} ->
repoContextWithSecureRepo repoCtxt repo $ \repoSecure ->
Sec.withIndex repoSecure $ \callbacks ->
let warnAndFail s = warn verbosity ("Fetched tarball " ++ file ++ " does not match server, will redownload: " ++ s) >> return False
-- the do block in parens is due to dealing with the checked exceptions mechanism.
in (do fileInfo <- Sec.indexLookupFileInfo callbacks pkgid
sz <- Sec.FileLength . fromInteger <$> getFileSize file
if sz /= Sec.fileInfoLength (Sec.trusted fileInfo)
then warnAndFail "file length mismatch"
else do
res <- Sec.compareTrustedFileInfo (Sec.trusted fileInfo) <$> Sec.computeFileInfo (Sec.Path file :: Sec.Path Sec.Absolute)
if res
then pure True
else warnAndFail "file hash mismatch")
`Sec.catchChecked` (\(e :: Sec.InvalidPackageException) -> warnAndFail (show e))
`Sec.catchChecked` (\(e :: Sec.VerificationError) -> warnAndFail (show e))
_ -> pure True

-- | Fetch a package if we don't have it already.
--
Expand Down
15 changes: 11 additions & 4 deletions cabal-install/src/Distribution/Client/ProjectPlanning.hs
Original file line number Diff line number Diff line change
Expand Up @@ -170,7 +170,7 @@ import Text.PrettyPrint (text, hang, quotes, colon, vcat, ($$), fsep,
import qualified Text.PrettyPrint as Disp
import qualified Data.Map as Map
import qualified Data.Set as Set
import Control.Monad (sequence)
import Control.Monad (sequence, forM)
import Control.Monad.IO.Class (liftIO)
import Control.Monad.State as State (State, execState, runState, state)
import Control.Exception (assert)
Expand Down Expand Up @@ -924,20 +924,26 @@ getPackageSourceHashes verbosity withRepoCtx solverPlan = do
-- Tarballs from repositories, either where the repository provides
-- hashes as part of the repo metadata, or where we will have to
-- download and hash the tarball.
repoTarballPkgsWithMetadata :: [(PackageId, Repo)]
repoTarballPkgsWithMetadataUnvalidated :: [(PackageId, Repo)]
repoTarballPkgsWithoutMetadata :: [(PackageId, Repo)]
(repoTarballPkgsWithMetadata,
(repoTarballPkgsWithMetadataUnvalidated,
repoTarballPkgsWithoutMetadata) =
partitionEithers
[ case repo of
RepoSecure{} -> Left (pkgid, repo)
_ -> Right (pkgid, repo)
| (pkgid, RepoTarballPackage repo _ _) <- allPkgLocations ]

(repoTarballPkgsWithMetadata, repoTarballPkgsToRedownload) <- fmap partitionEithers $
liftIO $ withRepoCtx $ \repoctx -> forM repoTarballPkgsWithMetadataUnvalidated $
\x@(pkg, repo) -> verifyFetchedTarball verbosity repoctx repo pkg >>= \b -> case b of
True -> return $ Left x
False -> return $ Right x

-- For tarballs from repos that do not have hashes available we now have
-- to check if the packages were downloaded already.
--
(repoTarballPkgsToDownload,
(repoTarballPkgsToDownload',
repoTarballPkgsDownloaded)
<- fmap partitionEithers $
liftIO $ sequence
Expand All @@ -947,6 +953,7 @@ getPackageSourceHashes verbosity withRepoCtx solverPlan = do
Just tarball -> return (Right (pkgid, tarball))
| (pkgid, repo) <- repoTarballPkgsWithoutMetadata ]

let repoTarballPkgsToDownload = repoTarballPkgsToRedownload ++ repoTarballPkgsToDownload'
(hashesFromRepoMetadata,
repoTarballPkgsNewlyDownloaded) <-
-- Avoid having to initialise the repository (ie 'withRepoCtx') if we
Expand Down
10 changes: 10 additions & 0 deletions changelog.d/pr-8500
Original file line number Diff line number Diff line change
@@ -0,0 +1,10 @@
synopsis: Redownload pkgs when source hash verification fails
packages: cabal-install
prs: #8500
issues: #7541

description: {

- Cabal-install will verify source hashes on cached downloads against the current index, and redownload on mismatch. (Which can occur with e.g. head.hackage)

}