Skip to content

Commit

Permalink
Merge #2696 #2698
Browse files Browse the repository at this point in the history
2696: Timeout earlier on SQLITE_BUSY when deleting wallets in tests r=rvl a=Anviking

# Issue Number

ADP-827

# Overview

- [x] Set `CARDANO_WALLET_INTEGRATION` env var in integration tests
- [x] Change timeout to close all connections from `60s` to `2s` if `CARDANO_WALLET_INTEGRATION` is set.


# Comments

- Workaround that should make CI faster (and perhaps a bit more stable), until we have time to investigate and address the underlying issue and its effects in production. The problem in production is likely much less serious, as users likely don't create and delete wallets as often as in our integration tests.
- Locally, running `stack test cardano-wallet:integration --ta '-j 6'` I get:
    - 2s -> 335s
    - 5s -> 380s
    - 60s (without the `setEnv`) -> 1190s

So that's why I went with 2s rather than 5s.

<!-- Additional comments or screenshots to attach if any -->

<!--
Don't forget to:

 ✓ Self-review your changes to make sure nothing unexpected slipped through
 ✓ Assign yourself to the PR
 ✓ Assign one or several reviewer(s)
 ✓ Jira will detect and link to this PR once created, but you can also link this PR in the description of the corresponding ticket
 ✓ Acknowledge any changes required to the Wiki
 ✓ Finally, in the PR description delete any empty sections and all text commented in <!--, so that this text does not appear in merge commit messages.
-->


2698: Make test cluster epoch length twice as long r=Anviking a=Anviking

# Issue Number

<!-- Put here a reference to the issue that this PR relates to and which requirements it tackles. Jira issues of the form ADP- will be auto-linked. -->

ADP-979


# Overview

<!-- Detail in a few bullet points the work accomplished in this PR -->

- [x] Make epoch length twice as long to help epoch-sensitive tests (see commit message)


# Comments

<!-- Additional comments or screenshots to attach if any -->

<!--
Don't forget to:

 ✓ Self-review your changes to make sure nothing unexpected slipped through
 ✓ Assign yourself to the PR
 ✓ Assign one or several reviewer(s)
 ✓ Jira will detect and link to this PR once created, but you can also link this PR in the description of the corresponding ticket
 ✓ Acknowledge any changes required to the Wiki
 ✓ Finally, in the PR description delete any empty sections and all text commented in <!--, so that this text does not appear in merge commit messages.
-->


Co-authored-by: Johannes Lund <[email protected]>
Co-authored-by: Rodney Lorrimar <[email protected]>
  • Loading branch information
3 people authored Jun 10, 2021
3 parents 27cc367 + 40428f4 + 9bf8eea commit a7f6afa
Show file tree
Hide file tree
Showing 4 changed files with 42 additions and 7 deletions.
2 changes: 1 addition & 1 deletion lib/core-integration/src/Test/Integration/Framework/DSL.hs
Original file line number Diff line number Diff line change
Expand Up @@ -659,7 +659,7 @@ securityParameterValue = 5

-- | Parameter in test cluster shelley genesis.
epochLengthValue :: Word32
epochLengthValue = 50
epochLengthValue = 100

-- | Wallet server's chosen transaction TTL value (in seconds) when none is
-- given.
Expand Down
40 changes: 35 additions & 5 deletions lib/core/src/Cardano/DB/Sqlite.hs
Original file line number Diff line number Diff line change
Expand Up @@ -7,6 +7,7 @@
{-# LANGUAGE GADTs #-}
{-# LANGUAGE LambdaCase #-}
{-# LANGUAGE NamedFieldPuns #-}
{-# LANGUAGE NumericUnderscores #-}
{-# LANGUAGE RankNTypes #-}
{-# LANGUAGE ScopedTypeVariables #-}
{-# LANGUAGE TupleSections #-}
Expand Down Expand Up @@ -78,6 +79,8 @@ import Data.Aeson
( ToJSON (..) )
import Data.Function
( (&) )
import Data.Functor
( (<&>) )
import Data.List
( isInfixOf )
import Data.List.Split
Expand Down Expand Up @@ -116,6 +119,8 @@ import Fmt
( fmt, ordinalF, (+|), (+||), (|+), (||+) )
import GHC.Generics
( Generic )
import System.Environment
( lookupEnv )
import System.Log.FastLogger
( fromLogStr )
import UnliftIO.Compat
Expand Down Expand Up @@ -218,7 +223,9 @@ newSqliteContext tr pool manualMigrations autoMigration = do
-- resource is NOT placed back in the pool.
runQuery :: SqlPersistT IO a -> IO a
runQuery cmd = withResource pool $
observe . retryOnBusy tr . runSqlConn cmd . fst
observe
. retryOnBusy tr retryOnBusyTimeout
. runSqlConn cmd . fst

in Right $ SqliteContext { runQuery }

Expand All @@ -236,7 +243,21 @@ destroySqliteBackend
-> IO ()
destroySqliteBackend tr sqlBackend dbFile = do
traceWith tr (MsgCloseSingleConnection dbFile)
retryOnBusy tr (close' sqlBackend)

-- Hack for ADP-827: timeout earlier in integration tests.
--
-- There seem to be some concurrency problem causing persistent-sqlite to
-- leak unfinalized statements, causing SQLITE_BUSY when we try to close the
-- connection. In this case, retrying 2 or 60 seconds would have no
-- difference.
--
-- But in production, the longer timeout isn't as much of a problem, and
-- might be needed for windows.
timeoutSec <- lookupEnv "CARDANO_WALLET_TEST_INTEGRATION" <&> \case
Just _ -> 2
Nothing -> retryOnBusyTimeout

retryOnBusy tr timeoutSec (close' sqlBackend)
& handleIf isAlreadyClosed
(traceWith tr . MsgIsAlreadyClosed . showT)
& handleIf statementAlreadyFinalized
Expand All @@ -256,6 +277,10 @@ destroySqliteBackend tr sqlBackend dbFile = do
showT :: Show a => a -> Text
showT = T.pack . show

-- | Default timeout for `retryOnBusy`
retryOnBusyTimeout :: NominalDiffTime
retryOnBusyTimeout = 60

-- | Retry an action if the database yields an 'SQLITE_BUSY' error.
--
-- From <https://www.sqlite.org/rescode.html#busy>
Expand All @@ -273,12 +298,17 @@ destroySqliteBackend tr sqlBackend dbFile = do
-- and sqlite3_busy_handler() interfaces and the busy_timeout pragma are
-- available to process B to help it deal with SQLITE_BUSY errors.
--
retryOnBusy :: Tracer IO DBLog -> IO a -> IO a
retryOnBusy tr action = recovering policy
retryOnBusy
:: Tracer IO DBLog -- ^ Logging
-> NominalDiffTime -- ^ Timeout
-> IO a -- ^ Action to retry
-> IO a
retryOnBusy tr timeout action = recovering policy
[logRetries isBusy traceRetries]
(\st -> action <* trace MsgRetryDone st)
where
policy = limitRetriesByCumulativeDelay (60000*ms) $ constantDelay (25*ms)
policy = limitRetriesByCumulativeDelay usTimeout $ constantDelay (25*ms)
usTimeout = truncate (timeout * 1_000_000)
ms = 1000 -- microseconds in a millisecond

isBusy (SqliteException name _ _) = pure (name == Sqlite.ErrorBusy)
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -47,7 +47,7 @@ maxLovelaceSupply: 1000000000000000000
protocolMagicId: 764824073
networkMagic: 764824073
networkId: Mainnet
epochLength: 50
epochLength: 100
staking:
slotsPerKESPeriod: 86400
slotLength: 0.2
Expand Down
5 changes: 5 additions & 0 deletions lib/shelley/test/integration/Main.hs
Original file line number Diff line number Diff line change
Expand Up @@ -111,6 +111,8 @@ import Network.URI
( URI )
import System.Directory
( createDirectory )
import System.Environment
( setEnv )
import System.FilePath
( (</>) )
import System.IO
Expand Down Expand Up @@ -226,6 +228,9 @@ withTestsSetup action = do
hSetBuffering stderr LineBuffering
-- Stop cardano-cli complaining about file permissions
setDefaultFilePermissions
-- Enables small test-specific workarounds, like timing out faster if wallet
-- deletion fails.
setEnv "CARDANO_WALLET_TEST_INTEGRATION" "1"
-- Set UTF-8, regardless of user locale
withUtf8Encoding $
-- This temporary directory will contain logs, and all other data
Expand Down

0 comments on commit a7f6afa

Please sign in to comment.