Static analysis to find unnecessary locations #458

Merged

@MaximilianAlgehed
Contributor

MaximilianAlgehed commented May 12, 2022

I don't know why this didn't work last time. It appears to have been an issue with CI.

@sjoerdvisscher any clue?

Pre-submit checklist:

  • Branch
    • Tests are provided (if possible)
    • Commit sequence broadly makes sense
    • Key commits have useful messages
    • Relevant tickets are mentioned in commit messages
    • Formatting, materialized Nix files, PNG optimization, etc. are updated
  • PR
    • Self-reviewed the diff
    • Useful pull request description
    • Reviewer requested

@MaximilianAlgehed
Contributor Author

@koslambrou can you please keep restarting Hydra until it stops crashing from running out of memory on PRs, so we can see what's actually going on?

@koslambrou
Contributor

Everything works fine when running tests with cabal test plutus-use-cases.

However, when running the tests with Nix using nix-build -A plutus-apps.haskell.packages.plutus-use-cases.checks.plutus-use-cases-test, I get failures in the golden file tests:

multisig
    2 out of 5:                                                                    OK (0.06s)
    3 out of 5:                                                                    OK (0.07s)
    PIR:                                                                           FAIL
      Test output was different from 'test/Spec/multisig.pir'. It was:
      (program
        (let
          (nonrec)
          (datatypebind
            (datatype
              (tyvardecl Monoid (fun (type) (type)))
              (tyvardecl a (type))
              Monoid_match
              (vardecl
                CConsMonoid
                (fun [ (lam a (type) (fun a (fun a a))) a ] (fun a [ Monoid a ]))
              )
            )
          )
          (termbind
            (strict)
            (vardecl
              p1Monoid
              (all a (type) (fun [ Monoid a ] [ (lam a (type) (fun a (fun a a))) a ]))
            )
            (abs
              a
              (type)
              (lam
                v
                [ Monoid a ]
                [
                  { [ { Monoid_match a } v ] [ (lam a (type) (fun a (fun a a))) a ] }
                  (lam v [ (lam a (type) (fun a (fun a a))) a ] (lam v a v))
                ]
              )
            )
          )
          (termbind
            (strict)
            (vardecl mempty (all a (type) (fun [ Monoid a ] a)))
            (abs
              a
              (type)
              (lam
                v
                [ Monoid a ]
                [
                  { [ { Monoid_match a } v ] a }
                  (lam v [ (lam a (type) (fun a<truncated>
      Use --accept or increase --size-cutoff to see full output.
      Use -p '/multisig.PIR/' to rerun this test only.

@koslambrou
Contributor

koslambrou commented May 12, 2022

@michaelpj Seems like something on the Plutus side, no?

Probably not, since it only fails on Nix.

@MaximilianAlgehed
Contributor Author

@koslambrou I can't reproduce this. Instead, I get a different failure (in showBlockchain):

building
use cases
  crowdfunding
    Expose 'contribute' and 'scheduleCollection' endpoints:                        OK
    make contribution:                                                             OK (0.02s)
    make contributions and collect:                                                OK (0.14s)
    cannot collect money too late:                                                 OK (0.13s)
    cannot collect unless notified:                                                OK (0.13s)
    can claim a refund:                                                            OK (0.09s)
    PIR:                                                                           OK
    script size is reasonable:                                                     OK
      Script size: 3533
    renders the log of a single contract instance sensibly:                        OK (0.14s)
    renders the emulator log sensibly:                                             OK (0.11s)
    renders an error sensibly:                                                     OK
    QuickCheck ContractModel:                                                      OK (0.10s)
      +++ OK, passed 10 tests.
      
      Actions (19 in total):
      74% CContribute
      16% CStart
      11% WaitUntil
      
      Wait interval (2 in total):
      100% <10
      
      Wait until (2 in total):
      50% 10-19
      50% <10
  vesting
    secure some funds with the vesting script:                                     OK (0.02s)
    retrieve some funds:                                                           OK (0.03s)
    cannot retrieve more than allowed:                                             OK (0.01s)
    can retrieve everything at the end:                                            OK (0.05s)
    PIR:                                                                           OK (0.04s)
    script size is reasonable:                                                     OK
      Script size: 4661
    prop_Vesting:                                                                  OK (1.43s)
      +++ OK, passed 20 tests.
      
      Actions (87 in total):
      43% WaitUntil
      34% Vest
      23% Retrieve
      
      Actions rejected by precondition (221 in total):
      54.3% Retrieve
      45.7% Vest
      
      Wait interval (37 in total):
      62% <10
      27% 10-19
       8% 20-29
       3% 30-39
      
      Wait until (37 in total):
      35% <10
      14% 10-19
      14% 20-29
      11% 30-39
       8% 100-199
       5% 50-59
       3% 40-49
       3% 60-69
       3% 70-79
       3% 80-89
       3% 90-99
    prop_CheckNoLockedFundsProof:                                                  OK (7.71s)
      +++ OK, passed 20 tests.
      
      Actions (452 in total):
      49.6% WaitUntil
      32.7% Retrieve
      13.7% Vest
       4.0% Unilateral
      
      Wait interval (224 in total):
      48.2% <10
      34.8% 10-19
      15.2% 20-29
       1.8% 30-39
      
      Wait until (224 in total):
      21.0% 100-199
      17.4% 10-19
      15.6% 20-29
      10.3% <10
       8.9% 30-39
       6.2% 60-69
       4.9% 80-89
       4.5% 40-49
       4.0% 90-99
       3.1% 70-79
       2.2% 50-59
       1.8% 200-299
  error handling
    throw an error:                                                                OK
    catch an error:                                                                OK
  futures
    setup tokens:                                                                  OK (0.05s)
    can initialise and obtain tokens:                                              OK (0.21s)
    can increase margin:                                                           OK (0.30s)
    can settle early:                                                              OK (0.20s)
    can pay out:                                                                   OK (0.24s)
    PIR:                                                                           OK (0.02s)
    script size is reasonable:                                                     OK
      Script size: 9313
  multisig
    2 out of 5:                                                                    OK (0.03s)
    3 out of 5:                                                                    OK (0.04s)
    PIR:                                                                           OK
  multi sig state machine tests
    lock, propose, sign 3x, pay - SUCCESS:                                         OK (0.23s)
    lock, propose, sign 2x, pay - FAILURE:                                         OK (0.17s)
    lock, propose, sign 3x, pay x2 - SUCCESS:                                      OK (0.44s)
    lock, propose, sign 3x, pay x3 - FAILURE:                                      OK (0.48s)
    PIR:                                                                           OK (0.06s)
    script size is reasonable:                                                     OK
      Script size: 8364
  currency
    can create a new currency:                                                     OK (0.05s)
    script size is reasonable:                                                     OK (0.01s)
  pubkey
    works like a public key output:                                                OK (0.03s)
  escrow
    can pay:                                                                       OK (0.02s)
    can redeem:                                                                    OK (0.12s)
    can redeem even if more money than required has been paid in:                  OK (0.11s)
    can refund:                                                                    OK (0.08s)
    script size is reasonable:                                                     OK (0.04s)
      Script size: 5166 (0.04s)
    QuickCheck ContractModel:                                                      OK (0.77s)
      +++ OK, passed 10 tests:
      50% Redeemable
      30% Contains Redeem
      
      Actions (42 in total):
      50% Pay
      21% BadRefund
      19% WaitUntil
       10% Redeem
      
      Actions rejected by precondition (22 in total):
      77% Redeem
       9% BadRefund
       9% Refund
       5% Pay
      
      Bad refund attempts (9 in total):
      89% steal refund
      11% early refund
      
      Wait interval (8 in total):
      62% <10
      25% 10-19
      12% 20-29
      
      Wait until (8 in total):
      50% 10-19
      12% 20-29
      12% 30-39
      12% 50-59
      12% <10
    QuickCheck NoLockedFunds:                                                      OK (2.86s)
      +++ OK, passed 10 tests:
      80% Redeemable
      60% Contains Redeem
      
      Actions (162 in total):
      43.8% Pay
      20.4% WaitUntil
      13.6% Refund
       9.9% Redeem
       6.8% Unilateral
       5.6% BadRefund
      
      Bad refund attempts (9 in total):
      78% steal refund
      22% early refund
      
      Wait interval (33 in total):
      45% 30-39
      36% <10
      18% 20-29
      
      Wait until (33 in total):
      64% 40-49
      24% 10-19
      12% <10
  simple-escrow
    can lock some value in the contract:                                           OK
    can lock and redeem:                                                           OK (0.02s)
    can lock and refund:                                                           OK (0.06s)
    only locking wallet can request refund:                                        OK (0.05s)
    can't redeem if you can't pay:                                                 OK
  game with secret arguments tests
    run a successful game trace:                                                   OK (0.03s)
    run a failed trace:                                                            OK (0.03s)
    PIR:                                                                           OK
    script size is reasonable:                                                     OK
      Script size: 2301
  game state machine with secret arguments tests
    run a successful game trace:                                                   OK (0.21s)
    run a 2nd successful game trace:                                               OK (0.25s)
    run a successful game trace where we try to leave 1 Ada in the script address: OK (0.14s)
    run a failed trace:                                                            OK (0.09s)
    PIR:                                                                           OK (0.09s)
    script size is reasonable:                                                     OK
      Script size: 9750
    can always get the funds out:                                                  OK (1.45s)
      +++ OK, passed 10 tests (70% Unlocking funds).
      
      Actions (50 in total):
      52% Guess
      32% GiveToken
      14% Lock
       2% WaitUntil
      
      Wait interval (1 in total):
      100% <10
      
      Wait until (1 in total):
      100% <10
    sanity check the contract model:                                               OK
      +++ OK, failed as expected. Falsified (after 2 tests and 14 shrinks):
      Actions 
       [Lock (Wallet 2) "*******" 2000000]
    game state machine crash tolerance:                                            OK (3.31s)
      +++ OK, passed 20 tests.
      
      Actions (170 in total):
      41.8% GiveToken
      28.2% Guess
      11.2% WaitUntil
       8.8% Lock
       6.5% Crash
       3.5% Restart
      
      Actions rejected by precondition (28 in total):
      64% Restart
      36% Guess
      
      Wait interval (19 in total):
      84% <10
      16% 10-19
      
      Wait until (19 in total):
      37% 10-19
      37% <10
      21% 20-29
       5% 40-49
  showBlockchain
    renders a crowdfunding scenario sensibly:                                      OK (0.11s)
    renders a game guess scenario sensibly:                                        FAIL (0.23s)
      Test output was different from 'test/Spec/renderGuess.txt'. It was:
      ==== Slot #0, Tx #0 ====
      TxId:       ef0ca0fb043642529818003be5a6cac88aac499e4f8f1cbc3bdb35db2b7f6958
      Fee:        -
      Mint:       Ada:      Lovelace:  1000000000
      Signatures  -
      Inputs:
        
      
      
      Outputs:
        ---- Output 0 ----
        Destination:  PaymentPubKeyHash: 2e0ad60c3207248cecd47dbde3d752e0aad141d6... (Wallet c30efb78b4e272685c1f9f0c93787fd4b6743154)
        Value:
          Ada:      Lovelace:  10000000
      
        ---- Output 1 ----
        Destination:  PaymentPubKeyHash: 2e0ad60c3207248cecd47dbde3d752e0aad141d6... (Wallet c30efb78b4e272685c1f9f0c93787fd4b6743154)
        Value:
          Ada:      Lovelace:  10000000
      
        ---- Output 2 ----
        Destination:  PaymentPubKeyHash: 2e0ad60c3207248cecd47dbde3d752e0aad141d6... (Wallet c30efb78b4e272685c1f9f0c93787fd4b6743154)
        Value:
          Ada:      Lovelace:  10000000
      
        ---- Output 3 ----
        Destination:  PaymentPubKeyHash: 2e0ad60c3207248cecd47dbde3d752e0aad141d6... (Wallet c30efb78b4e272685c1f9f0c93787fd4b6743154)
        Value:
          Ada:      Lovelace:  10000000
      
        ---- Output 4 ----
        Desti<truncated>
      Use --accept or increase --size-cutoff to see full output.
      Use -p '/renders a game guess scenario sensibly/' to rerun this test only.
    renders a vesting scenario sensibly:                                           OK (0.06s)
  token account
    Create a token account:                                                        OK (0.10s)
    Pay into the account:                                                          OK (0.07s)
    Transfer & redeem all funds:                                                   OK (0.10s)
  pingpong
    activate endpoints:                                                            OK (0.12s)
    Stop the contract:                                                             OK (0.08s)
  PRISM
    withdraw:                                                                      OK (0.21s)
    QuickCheck property:                                                           OK (1.39s)
      +++ OK, passed 15 tests.
      
      Actions (45 in total):
      36% Call
      36% Revoke
      24% Issue
       4% WaitUntil
      
      Wait interval (2 in total):
      100% <10
      
      Wait until (2 in total):
      50% 10-19
      50% 30-39
  Stablecoin
    mint reservecoins:                                                             OK (0.13s)
    mint reservecoins and stablecoins:                                             OK (0.21s)
    mint reservecoins, stablecoins and redeem stablecoin at a different price:     OK (0.29s)
    Cannot exceed the maximum reserve ratio:                                       OK (0.20s)
  auction
    run an auction:                                                                OK (0.38s)
    run an auction with multiple bids:                                             OK (0.61s)
    QuickCheck property:                                                           OK (2.56s)
      +++ OK, passed 10 tests.
      
      Actions (31 in total):
      39% Bid
      39% WaitUntil
      23% Init
      
      Wait interval (12 in total):
      75% 90-99
       8% 10-19
       8% 70-79
       8% <10
      
      Wait until (12 in total):
      83% 100-199
       8% 20-29
       8% <10
    NLFP fails:                                                                    Expected funds of W[2] to change by
  Value (Map [(,Map [("",-87240078)]),(363d3944282b3d16b239235a112c0f6e2f1195de5067f61c0dfc0f5f,Map [("token",1)])])
but they changed by
  Value (Map [(,Map [("",-89240078)])])
a discrepancy of
  Value (Map [(,Map [("",-2000000)]),(363d3944282b3d16b239235a112c0f6e2f1195de5067f61c0dfc0f5f,Map [("token",-1)])])
Expected funds of W[1] to change by
  Value (Map [(,Map [("",87240078)]),(363d3944282b3d16b239235a112c0f6e2f1195de5067f61c0dfc0f5f,Map [("token",-1)])])
but they changed by
  Value (Map [(,Map [("",-2000000)]),(363d3944282b3d16b239235a112c0f6e2f1195de5067f61c0dfc0f5f,Map [("token",-1)])])
a discrepancy of
  Value (Map [(,Map [("",-89240078)])])
Test failed.
Emulator log:
OK (0.96s)
      +++ OK, failed as expected. Assertion failed (after 2 tests):
      DLScript
        [Do $ Init, 
         Do $ Bid (Wallet 2) 89240078]
      
      The ContractModel's Unilateral behaviour for Wallet 2 does not match the actual behaviour for actions:
      Actions 
       [Var 0 := Init,
        Var 1 := Bid (Wallet 2) 89240078,
        Var 2 := Unilateral (Wallet 2),
        Var 3 := WaitUntil (Slot {getSlot = 100})]
    prop_Reactive:                                                                 OK (0.11s)
      +++ OK, passed 1000 tests.
  sealed bid auction
    packInteger is injective:                                                      OK
      +++ OK, passed 100 tests; 12 discarded.
    prop_AuctionModelCorrect:                                                      OK (2.52s)
      +++ OK, passed 20 tests.
      
      Actions (94 in total):
      40% WaitUntil
      18% Init
      17% Bid
      15% Reveal
       10% Payout
      
      Actions rejected by precondition (183 in total):
      40.4% Bid
      39.3% Reveal
      20.2% Payout
      
      Reveals (14 in total):
      71% no bid
      14% dishonest
      14% honest
      
      Wait interval (38 in total):
      71% <10
       8% 30-39
       8% 50-59
       5% 20-29
       3% 10-19
       3% 40-49
       3% 60-69
      
      Wait until (38 in total):
      29% 70-79
      21% 60-69
      16% 40-49
      11% 80-89
       8% 30-39
       8% <10
       3% 20-29
       3% 50-59
       3% 90-99
  governance tests
    vote all in favor, 2 rounds - SUCCESS:                                         OK (1.51s)
    vote 60/40, accepted - SUCCESS:                                                OK (0.83s)
    vote 50/50, rejected - SUCCESS:                                                OK (0.81s)
    PIR:                                                                           OK (0.06s)
    script size is reasonable:                                                     Script size: 9246
OK
  uniswap
    can create a liquidity pool and add liquidity:                                 OK (0.52s)
    prop_UniswapAssertions:                                                        OK (0.56s)
      +++ OK, passed 1000 tests.
      
      Actions (25269 in total):
      19.724% Bad
      15.822% AddLiquidity
      15.810% PerformSwap
      14.718% RemoveLiquidity
      13.515% WaitUntil
       7.978% SetupTokens
       7.745% CreatePool
       3.376% Start
       1.314% ClosePool
      
      Actions rejected by precondition (8299 in total):
      53.26% ClosePool
      17.04% CreatePool
      15.57% Bad
      14.13% Start
      
      Bad actions (4984 in total):
      34.95% RemoveLiquidity
      28.71% AddLiquidity
      28.19% PerformSwap
       8.15% BadRemoveLiquidity
      
      Wait interval (3415 in total):
      31.07% <10
      29.19% 10-19
      25.27% 20-29
      11.95% 30-39
       2.43% 40-49
       0.09% 50-59
      
      Wait until (3415 in total):
      22.78% 100-199
      14.64% 200-299
       9.75% 300-399
       6.06% 400-499
       4.57% 60-69
       3.95% 500-599
       3.66% 50-59
       3.63% 30-39
       3.57% 40-49
       3.51% 90-99
       3.28% 70-79
       3.28% 80-89
       2.87% 1000-1999
       2.75% 600-699
       2.72% 10-19
       2.69% <10
       2.02% 700-799
       1.55% 20-29
       1.46% 800-899
       1.02% 900-999
       0.20% 2000-2999
    prop_NLFP:                                                                     OK (0.67s)
      +++ OK, passed 250 tests.

@michaelpj
Contributor

Argh. Maybe this is the same issue that we had in plutus: somehow what you did has introduced instability in the compiler output between the Nix and non-Nix builds. We let it slide there, but if it's popping up more here, that's a problem.
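
To narrow down where the Nix and non-Nix compiler output diverges, it may help to compare a golden file such as test/Spec/multisig.pir against the full output of the failing run (the log above suggests raising --size-cutoff to see it untruncated). The following is only a minimal sketch, not part of the test suite: the helper names first_divergence and unified are hypothetical, and it assumes both outputs have been saved as plain text.

```python
import difflib


def first_divergence(expected: str, actual: str):
    """Return (line_no, expected_line, actual_line) for the first differing
    line (1-based), or None when the two texts match line by line.
    A side that has run out of lines is reported as None."""
    exp_lines = expected.splitlines()
    act_lines = actual.splitlines()
    for line_no in range(1, max(len(exp_lines), len(act_lines)) + 1):
        exp = exp_lines[line_no - 1] if line_no <= len(exp_lines) else None
        act = act_lines[line_no - 1] if line_no <= len(act_lines) else None
        if exp != act:
            return line_no, exp, act
    return None


def unified(expected: str, actual: str, context: int = 3) -> str:
    """Render a unified diff between the golden text and the actual output."""
    return "\n".join(
        difflib.unified_diff(
            expected.splitlines(),
            actual.splitlines(),
            fromfile="golden",
            tofile="actual",
            n=context,
            lineterm="",
        )
    )
```

Reading the golden file and a saved copy of the Nix output with pathlib.Path(...).read_text() and passing both strings to first_divergence would show the first line where the two builds disagree, which is usually a quicker starting point than scanning the truncated test output.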

@MaximilianAlgehed
Contributor Author

We should probably bump the core dep even further here as we just merged a relevant bugfix there today. I'll get to that next week.

@MaximilianAlgehed
Contributor Author

This is now stuck on a cached OOM failure.

@koslambrou
Contributor

I'm still getting failing test cases :(

I don't understand why you're not getting them. I'm on commit 94b881ddd8ff0c4e8dd6e0cd970080083b3493bd of your branch.

use cases
  crowdfunding
    Expose 'contribute' and 'scheduleCollection' endpoints:                        OK (0.02s)
    make contribution:                                                             OK (0.04s)
    make contributions and collect:                                                OK (0.26s)
    cannot collect money too late:                                                 OK (0.30s)
    cannot collect unless notified:                                                OK (0.27s)
    can claim a refund:                                                            OK (0.20s)
    PIR:                                                                           OK (0.01s)
    script size is reasonable:                                                     OK
      Script size: 3533
    renders the log of a single contract instance sensibly:                        OK (0.26s)
    renders the emulator log sensibly:                                             OK (0.25s)
    renders an error sensibly:                                                     OK
    QuickCheck ContractModel:                                                      OK (0.36s)
      +++ OK, passed 10 tests.
      
      Actions (31 in total):
      68% CContribute
      16% CStart
      16% WaitUntil
      
      Wait interval (5 in total):
      80% <10
      20% 10-19
      
      Wait until (5 in total):
      60% <10
      40% 10-19
  vesting
    secure some funds with the vesting script:                                     OK (0.03s)
    retrieve some funds:                                                           OK (0.12s)
    cannot retrieve more than allowed:                                             OK (0.03s)
    can retrieve everything at the end:                                            OK (0.14s)
    PIR:                                                                           OK (0.03s)
    script size is reasonable:                                                     OK (0.01s)
      Script size: 4661 (0.01s)
    prop_Vesting:                                                                  OK (1.77s)
      +++ OK, passed 20 tests.
      
      Actions (51 in total):
      43% Vest
      31% WaitUntil
      25% Retrieve
      
      Actions rejected by precondition (128 in total):
      57.8% Retrieve
      42.2% Vest
      
      Wait interval (16 in total):
      69% <10
      31% 10-19
      
      Wait until (16 in total):
      38% <10
      31% 10-19
      12% 20-29
       6% 30-39
       6% 50-59
       6% 60-69
    prop_CheckNoLockedFundsProof:                                                  OK (15.37s)
      +++ OK, passed 20 tests.
      
      Actions (465 in total):
      54.2% WaitUntil
      28.4% Retrieve
      13.3% Vest
       4.1% Unilateral
      
      Wait interval (252 in total):
      42.5% <10
      40.1% 10-19
      16.7% 20-29
       0.8% 30-39
      
      Wait until (252 in total):
      18.3% 100-199
      15.5% 20-29
      15.1% 10-19
      10.3% <10
       9.1% 30-39
       8.3% 60-69
       5.6% 40-49
       5.2% 90-99
       4.0% 70-79
       3.6% 50-59
       2.8% 80-89
       2.0% 200-299
       0.4% 300-399
  error handling
    throw an error:                                                                OK
    catch an error:                                                                OK
  futures
    setup tokens:                                                                  OK (0.14s)
    can initialise and obtain tokens:                                              OK (0.49s)
    can increase margin:                                                           OK (0.71s)
    can settle early:                                                              OK (0.51s)
    can pay out:                                                                   OK (0.61s)
    PIR:                                                                           OK (0.04s)
    script size is reasonable:                                                     OK (0.02s)
      Script size: 9313 (0.02s)
  multisig
    2 out of 5:                                                                    OK (0.07s)
    3 out of 5:                                                                    OK (0.08s)
    PIR:                                                                           FAIL
      Test output was different from 'test/Spec/multisig.pir'. It was:
      (program
        (let
          (nonrec)
          (datatypebind
            (datatype
              (tyvardecl Monoid (fun (type) (type)))
              (tyvardecl a (type))
              Monoid_match
              (vardecl
                CConsMonoid
                (fun [ (lam a (type) (fun a (fun a a))) a ] (fun a [ Monoid a ]))
              )
            )
          )
          (termbind
            (strict)
            (vardecl
              p1Monoid
              (all a (type) (fun [ Monoid a ] [ (lam a (type) (fun a (fun a a))) a ]))
            )
            (abs
              a
              (type)
              (lam
                v
                [ Monoid a ]
                [
                  { [ { Monoid_match a } v ] [ (lam a (type) (fun a (fun a a))) a ] }
                  (lam v [ (lam a (type) (fun a (fun a a))) a ] (lam v a v))
                ]
              )
            )
          )
          (termbind
            (strict)
            (vardecl mempty (all a (type) (fun [ Monoid a ] a)))
            (abs
              a
              (type)
              (lam
                v
                [ Monoid a ]
                [
                  { [ { Monoid_match a } v ] a }
                  (lam v [ (lam a (type) (fun a<truncated>
      Use --accept or increase --size-cutoff to see full output.
      Use -p '/multisig.PIR/' to rerun this test only.
  multi sig state machine tests
    lock, propose, sign 3x, pay - SUCCESS:                                         OK (0.56s)
    lock, propose, sign 2x, pay - FAILURE:                                         OK (0.38s)
    lock, propose, sign 3x, pay x2 - SUCCESS:                                      OK (1.10s)
    lock, propose, sign 3x, pay x3 - FAILURE:                                      OK (1.19s)
    PIR:                                                                           FAIL (0.09s)
      Test output was different from 'test/Spec/multisigStateMachine.pir'. It was:
      (program
        (let
          (nonrec)
          (termbind (strict) (vardecl unitval (con unit)) (con unit ()))
          (let
            (rec)
            (datatypebind
              (datatype
                (tyvardecl List (fun (type) (type)))
                (tyvardecl a (type))
                Nil_match
                (vardecl Nil [ List a ])
                (vardecl Cons (fun a (fun [ List a ] [ List a ])))
              )
            )
            (let
              (rec)
              (termbind
                (strict)
                (vardecl go (fun [ List (con bytestring) ] [ (con list) (con data) ]))
                (lam
                  ds
                  [ List (con bytestring) ]
                  {
                    [
                      [
                        {
                          [ { Nil_match (con bytestring) } ds ]
                          (all dead (type) [ (con list) (con data) ])
                        }
                        (abs dead (type) [ (builtin mkNilData) unitval ])
                      ]
                      (lam
                        x
                        (con bytestring)
                        (lam
                          xs
          <truncated>
      Use --accept or increase --size-cutoff to see full output.
      Use -p '/multi sig state machine tests.PIR/' to rerun this test only.
    script size is reasonable:                                                     OK
      Script size: 8370
  currency
    can create a new currency:                                                     OK (0.13s)
    script size is reasonable:                                                     OK (0.03s)
  pubkey
    works like a public key output:                                                OK (0.08s)
  escrow
    can pay:                                                                       OK (0.04s)
    can redeem:                                                                    OK (0.25s)
    can redeem even if more money than required has been paid in:                  OK (0.28s)
    can refund:                                                                    OK (0.29s)
    script size is reasonable:                                                     OK
      Script size: 5166 (0.01s)
    QuickCheck ContractModel:                                                      OK (1.74s)
      +++ OK, passed 10 tests:
      50% Redeemable
      30% Contains Redeem
      
      Actions (39 in total):
      72% Pay
      15% Redeem
      10% WaitUntil
       3% BadRefund
      
      Actions rejected by precondition (17 in total):
      100% Redeem
      
      Bad refund attempts (1 in total):
      100% steal refund
      
      Wait interval (4 in total):
      100% <10
      
      Wait until (4 in total):
      75% 10-19
      25% <10
    QuickCheck NoLockedFunds:                                                      OK (8.51s)
      +++ OK, passed 10 tests:
      70% Redeemable
      60% Contains Redeem
      
      Actions (204 in total):
      39.7% Pay
      24.0% WaitUntil
      13.7% Refund
       8.3% Redeem
       7.4% BadRefund
       6.9% Unilateral
      
      Bad refund attempts (15 in total):
      100% steal refund
      
      Wait interval (49 in total):
      51% <10
      27% 30-39
      16% 20-29
       6% 10-19
      
      Wait until (49 in total):
      49% 40-49
      20% 10-19
      18% <10
       6% 20-29
       6% 30-39
  simple-escrow
    can lock some value in the contract:                                           OK (0.02s)
    can lock and redeem:                                                           OK (0.09s)
    can lock and refund:                                                           OK (0.20s)
    only locking wallet can request refund:                                        OK (0.18s)
    can't redeem if you can't pay:                                                 OK (0.02s)
  game with secret arguments tests
    run a successful game trace:                                                   OK (0.09s)
    run a failed trace:                                                            OK (0.07s)
    PIR:                                                                           OK
    script size is reasonable:                                                     OK
      Script size: 2301
  game state machine with secret arguments tests
    run a successful game trace:                                                   OK (0.44s)
    run a 2nd successful game trace:                                               OK (0.64s)
    run a successful game trace where we try to leave 1 Ada in the script address: OK (0.40s)
    run a failed trace:                                                            OK (0.26s)
    PIR:                                                                           FAIL (0.08s)
      Test output was different from 'test/Spec/gameStateMachine.pir'. It was:
      (program
        (let
          (nonrec)
          (termbind (strict) (vardecl w (con integer)) (con integer 0))
          (termbind (strict) (vardecl w (con integer)) (con integer 1))
          (datatypebind
            (datatype (tyvardecl Unit (type))  Unit_match (vardecl Unit Unit))
          )
          (termbind (strict) (vardecl unitval (con unit)) (con unit ()))
          (datatypebind
            (datatype
              (tyvardecl Tuple2 (fun (type) (fun (type) (type))))
              (tyvardecl a (type)) (tyvardecl b (type))
              Tuple2_match
              (vardecl Tuple2 (fun a (fun b [ [ Tuple2 a ] b ])))
            )
          )
          (termbind
            (strict)
            (vardecl
              fail (fun (con unit) [ [ Tuple2 (con bytestring) ] (con bytestring) ])
            )
            (lam
              ds
              (con unit)
              (let
                (nonrec)
                (termbind
                  (strict)
                  (vardecl thunk (con unit))
                  [
                    {
                      [
                        Unit_match
                        [ [ { (builtin trace) Unit } (con string "Lg") ] Unit ]
       <truncated>
      Use --accept or increase --size-cutoff to see full output.
      Use -p '/game state machine with secret arguments tests.PIR/' to rerun this test only.
    script size is reasonable:                                                     OK
      Script size: 9750
    can always get the funds out:                                                  OK (3.45s)
      +++ OK, passed 10 tests (70% Unlocking funds).
      
      Actions (40 in total):
      42% Guess
      32% GiveToken
      18% Lock
       8% WaitUntil
      
      Wait interval (3 in total):
      100% <10
      
      Wait until (3 in total):
      67% 10-19
      33% <10
    sanity check the contract model:                                               OK
      +++ OK, failed as expected. Falsified (after 1 test and 16 shrinks):
      Actions 
       [Lock (Wallet 2) "secret" 2000000]
    game state machine crash tolerance:                                            OK (6.61s)
      +++ OK, passed 20 tests.
      
      Actions (159 in total):
      45.3% GiveToken
      22.6% Guess
      17.0% WaitUntil
       9.4% Lock
       5.0% Crash
       0.6% Restart
      
      Actions rejected by precondition (20 in total):
      70% Restart
      30% Guess
      
      Wait interval (27 in total):
      63% <10
      22% 10-19
      15% 20-29
      
      Wait until (27 in total):
      37% 10-19
      19% 100-199
      19% <10
      11% 30-39
       4% 20-29
       4% 50-59
       4% 70-79
       4% 90-99
  showBlockchain
    renders a crowdfunding scenario sensibly:                                      OK (0.25s)
    renders a game guess scenario sensibly:                                        OK (0.41s)
    renders a vesting scenario sensibly:                                           OK (0.14s)
  token account
    Create a token account:                                                        OK (0.23s)
    Pay into the account:                                                          OK (0.14s)
    Transfer & redeem all funds:                                                   OK (0.21s)
  pingpong
    activate endpoints:                                                            OK (0.27s)
    Stop the contract:                                                             OK (0.14s)
  PRISM
    withdraw:                                                                      OK (0.39s)
    QuickCheck property:                                                           OK (5.99s)
      +++ OK, passed 15 tests.
      
      Actions (56 in total):
      43% Call
      27% Revoke
      23% Issue
       7% WaitUntil
      
      Actions rejected by precondition (2 in total):
      100% Issue
      
      Wait interval (4 in total):
      50% 10-19
      50% <10
      
      Wait until (4 in total):
      50% 20-29
      25% 40-49
      25% 50-59
  Stablecoin
    mint reservecoins:                                                             OK (0.28s)
    mint reservecoins and stablecoins:                                             OK (0.48s)
    mint reservecoins, stablecoins and redeem stablecoin at a different price:     OK (0.71s)
    Cannot exceed the maximum reserve ratio:                                       OK (0.46s)
  auction
    run an auction:                                                                OK (0.80s)
    run an auction with multiple bids:                                             OK (1.42s)
    QuickCheck property:                                                           OK (6.14s)
      +++ OK, passed 10 tests.
      
      Actions (31 in total):
      42% Bid
      35% WaitUntil
      23% Init
      
      Wait interval (11 in total):
      64% 90-99
      27% 80-89
       9% <10
      
      Wait until (11 in total):
      91% 100-199
       9% 10-19
    NLFP fails:                                                                    Expected funds of W[4] to change by
  Value (Map [(,Map [("",-44316956)]),(363d3944282b3d16b239235a112c0f6e2f1195de5067f61c0dfc0f5f,Map [("token",1)])])
but they changed by
  Value (Map [(,Map [("",-46316956)])])
a discrepancy of
  Value (Map [(,Map [("",-2000000)]),(363d3944282b3d16b239235a112c0f6e2f1195de5067f61c0dfc0f5f,Map [("token",-1)])])
Expected funds of W[1] to change by
  Value (Map [(,Map [("",44316956)]),(363d3944282b3d16b239235a112c0f6e2f1195de5067f61c0dfc0f5f,Map [("token",-1)])])
but they changed by
  Value (Map [(,Map [("",-2000000)]),(363d3944282b3d16b239235a112c0f6e2f1195de5067f61c0dfc0f5f,Map [("token",-1)])])
a discrepancy of
  Value (Map [(,Map [("",-46316956)])])
Test failed.
Emulator log:
OK (2.12s)
      +++ OK, failed as expected. Assertion failed (after 2 tests):
      DLScript
        [Do $ Init, 
         Do $ Bid (Wallet 4) 46316956]
      
      The ContractModel's Unilateral behaviour for Wallet 4 does not match the actual behaviour for actions:
      Actions 
       [Var 0 := Init,
        Var 1 := Bid (Wallet 4) 46316956,
        Var 2 := Unilateral (Wallet 4),
        Var 3 := WaitUntil (Slot {getSlot = 100})]
    prop_Reactive:                                                                 OK (0.24s)
      +++ OK, passed 1000 tests.
  sealed bid auction
    packInteger is injective:                                                      OK
      +++ OK, passed 100 tests; 15 discarded.
    prop_AuctionModelCorrect:                                                      OK (5.57s)
      +++ OK, passed 20 tests.
      
      Actions (75 in total):
      36% Bid
      28% WaitUntil
      24% Init
       7% Reveal
       5% Payout
      
      Actions rejected by precondition (108 in total):
      43.5% Reveal
      29.6% Bid
      26.9% Payout
      
      Reveals (5 in total):
      100% no bid
      
      Wait interval (21 in total):
      52% <10
      14% 30-39
       10% 20-29
       10% 40-49
       5% 10-19
       5% 50-59
       5% 60-69
      
      Wait until (21 in total):
      33% 60-69
      24% 50-59
      19% <10
      14% 40-49
       5% 20-29
       5% 30-39
  governance tests
    vote all in favor, 2 rounds - SUCCESS:                                         OK (3.47s)
    vote 60/40, accepted - SUCCESS:                                                OK (1.81s)
    vote 50/50, rejected - SUCCESS:                                                OK (1.83s)
    PIR:                                                                           Script size: 9246
FAIL (0.10s)
      Test output was different from 'test/Spec/governance.pir'. It was:
      (program
        (let
          (nonrec)
          (termbind (strict) (vardecl unitval (con unit)) (con unit ()))
          (datatypebind
            (datatype
              (tyvardecl Tuple2 (fun (type) (fun (type) (type))))
              (tyvardecl a (type)) (tyvardecl b (type))
              Tuple2_match
              (vardecl Tuple2 (fun a (fun b [ [ Tuple2 a ] b ])))
            )
          )
          (let
            (rec)
            (datatypebind
              (datatype
                (tyvardecl List (fun (type) (type)))
                (tyvardecl a (type))
                Nil_match
                (vardecl Nil [ List a ])
                (vardecl Cons (fun a (fun [ List a ] [ List a ])))
              )
            )
            (let
              (nonrec)
              (datatypebind
                (datatype
                  (tyvardecl Bool (type))
      
                  Bool_match
                  (vardecl True Bool) (vardecl False Bool)
                )
              )
              (let
                (rec)
                (termbind
                  (strict)
                  (vardecl
                    go
                    (fun
                      [ List [ [ Tuple2 (con bytestring) ] Bool ] ]
      <truncated>
      Use --accept or increase --size-cutoff to see full output.
      Use -p '/governance tests.PIR/' to rerun this test only.
    script size is reasonable:                                                     OK
  uniswap
    can create a liquidity pool and add liquidity:                                 OK (1.09s)
    prop_UniswapAssertions:                                                        OK (1.43s)
      +++ OK, passed 1000 tests.
      
      Actions (25982 in total):
      18.417% Bad
      16.715% AddLiquidity
      16.007% PerformSwap
      15.699% RemoveLiquidity
      13.225% WaitUntil
       7.805% SetupTokens
       7.705% CreatePool
       3.295% Start
       1.132% ClosePool
      
      Actions rejected by precondition (8255 in total):
      55.83% ClosePool
      16.35% CreatePool
      14.08% Start
      13.74% Bad
      
      Bad actions (4785 in total):
      32.94% RemoveLiquidity
      29.36% PerformSwap
      29.07% AddLiquidity
       8.63% BadRemoveLiquidity
      
      Wait interval (3436 in total):
      31.78% <10
      28.58% 10-19
      23.43% 20-29
      13.18% 30-39
       2.97% 40-49
       0.06% 50-59
      
      Wait until (3436 in total):
      22.82% 100-199
      13.80% 200-299
       9.84% 300-399
       7.33% 400-499
       4.31% 500-599
       3.78% 60-69
       3.55% 50-59
       3.26% 600-699
       3.08% 40-49
       3.06% 30-39
       3.03% 1000-1999
       3.00% 10-19
       2.97% 90-99
       2.88% 80-89
       2.79% <10
       2.74% 70-79
       2.39% 700-799
       2.07% 800-899
       1.57% 20-29
       1.34% 900-999
       0.41% 2000-2999
    prop_NLFP:                                                                     OK (1.60s)
      +++ OK, passed 250 tests.

4 out of 95 tests failed (87.30s)
error: builder for '/nix/store/2m411l786d09h3vnnhnmkm42kimf9g6j-plutus-use-cases-test-plutus-use-cases-test-0.1.0.0-check.drv' failed with exit code 1;
       last 10 log lines:
       >        2.74% 70-79
       >        2.39% 700-799
       >        2.07% 800-899
       >        1.57% 20-29
       >        1.34% 900-999
       >        0.41% 2000-2999
       >     prop_NLFP:                                                                     OK (1.60s)
       >       +++ OK, passed 250 tests.
       >
       > 4 out of 95 tests failed (87.30s)
       For full logs, run 'nix log /nix/store/2m411l786d09h3vnnhnmkm42kimf9g6j-plutus-use-cases-test-plutus-use-cases-test-0.1.0.0-check.drv'.
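Since test output is interleaved with other output in places in the log above (for example, the governance PIR status line is split by a "Script size" line), the most reliable markers of a failing golden test are the "Use -p '...' to rerun this test only." hints. The following is a minimal sketch, not part of the PR, that pulls those patterns out of a saved log. The function name `failing_patterns` and the idea of saving the log to a file first are assumptions made for illustration.

```shell
# Hypothetical helper: print the tasty rerun patterns found in a saved test log.
# It matches only the hint lines of the form
#   Use -p '<pattern>' to rerun this test only.
# and prints <pattern>, one per line. Lines prefixed with ">" (as in the
# "last 10 log lines" excerpt printed by Nix) are deliberately not matched.
failing_patterns() {
  sed -n "s/^ *Use -p '\(.*\)' to rerun this test only\.$/\1/p" "$1"
}
```

Each printed pattern could then be passed back to the test suite, for instance via `cabal test plutus-use-cases --test-options="-p '<pattern>'"`, to rerun a single golden test in whichever environment is being compared.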

@MaximilianAlgehed
Contributor Author

@koslambrou I can not reproduce these issues on my machine. Does this happen both when you run it through nix and when you just do cabal test plutus-use-cases?

@MaximilianAlgehed
Contributor Author

One concerning thing is that I just had a golden test fail on that branch - even though the last thing I did on that branch was run cabal test plutus-use-cases --test-options="--accept". Something is not right here.

@koslambrou
Contributor

koslambrou commented May 19, 2022

@koslambrou I can not reproduce these issues on my machine. Does this happen both when you run it through nix and when you just do cabal test plutus-use-cases?

These issues happen when I run the tests through Nix. Surprisingly, everything works fine when doing cabal test plutus-use-cases.

One concerning thing is that I just had a golden test fail on that branch - even though the last thing I did on that branch was run cabal test plutus-use-cases --test-options="--accept". Something is not right here.

:( I don't understand either..

@MaximilianAlgehed
Contributor Author

@michaelpj this seems like a nix issue to me. Any clue what's going on?

@MaximilianAlgehed MaximilianAlgehed force-pushed the PR-coverage-static-analysis-plutus-apps branch from c098ada to 7f545c5 Compare May 20, 2022 06:42
@MaximilianAlgehed
Contributor Author

Also, can @michaelpj maybe do something about all these cached failures?

@michaelpj
Contributor

@michaelpj this seems like a nix issue to me. Any clue what's going on?

No. I presume it's the same thing that caused this issue on your PR to plutus. I have asked @zliu41 to try and figure out what's going on there, so maybe we'll know something in a bit.

Also, can @michaelpj maybe do something about all these cached failures?

I restarted all the failing jobs, let's see what happens.

@MaximilianAlgehed
Contributor Author

Ok, no idea what's up with "wrong ELF type"...

@michaelpj
Contributor

Remaining failures appear to be actual test failures.

@MaximilianAlgehed
Contributor Author

@michaelpj see above about these failures. They don't reproduce with cabal test so we can't update the golden tests.

@MaximilianAlgehed
Contributor Author

I've bumped the core dependency here to:

  1. Include the latest bugfix
  2. Try to magically fix this weird non-reproducibility problem

@MaximilianAlgehed
Contributor Author

@michaelpj restart build plx?

@sjoerdvisscher
Contributor

Maybe pulling in the latest changes from main to get GHC 8.10.7 might help? My latest PR using that went smoothly, but that could have been a fluke.

@MaximilianAlgehed
Contributor Author

@sjoerdvisscher I rebased this morning - has it changed since?

Also, does that solve the OOM issues in hydra?

@sjoerdvisscher
Contributor

Also, does that solve the OOM issues in hydra?

Apparently not 😞

@MaximilianAlgehed
Contributor Author

@michaelpj could you please restart the failing CI jobs here?

@MaximilianAlgehed
Contributor Author

I bumped the dep on core to include some bugfixes, maybe that could magically solve the flaky-inconsistent-PIR-golden-tests issue?

@koslambrou
Contributor

@MaximilianAlgehed I'm getting the same PIR errors as before when running the tests in Nix. Latest commit is d144191

@MaximilianAlgehed
Contributor Author

@koslambrou Ok. I don't know what to make of this issue. Probably it's a good idea to file this as an issue in some issue tracker somewhere and try to get some eyes on it because as far as I can tell this is an issue with the compiler / nix / reproducibility. I've spoken a bit to @michaelpj about it but I don't know what the right bug report to file is, or even where to file it.

@michaelpj
Contributor

We have a ticket for it already and someone is going to look at it.

@MaximilianAlgehed
Contributor Author

@michaelpj any progress on this reproducibility issue?

@michaelpj
Contributor

Some progress. Try updating to a later version of the plutus release branch, see what happens.

@MaximilianAlgehed MaximilianAlgehed force-pushed the PR-coverage-static-analysis-plutus-apps branch from 01c8d09 to 0804cfd Compare May 31, 2022 09:18
@MaximilianAlgehed MaximilianAlgehed force-pushed the PR-coverage-static-analysis-plutus-apps branch from 0804cfd to aff9083 Compare May 31, 2022 09:44
@MaximilianAlgehed
Contributor Author

@michaelpj restart hydra plz...

@michaelpj
Contributor

done

@michaelpj
Contributor

Weird, it is continuing to very reliably die in plutus-contract...

@MaximilianAlgehed
Contributor Author

@michaelpj PIR reproducibility problem unfortunately wasn't fixed :(

@michaelpj
Contributor

cc @zliu41, maybe we can repeat the investigation effort. Might be something similar...

@MaximilianAlgehed
Contributor Author

Though this one fails when I run CI locally too! Improvement! But cabal test doesn't give us the failure 🤕

@MaximilianAlgehed
Contributor Author

@michaelpj and @zliu41 any progress on this issue?

@MaximilianAlgehed
Contributor Author

@koslambrou because progress on this reproducibility issue is a bit slow I think the right thing to do is to turn off the tests here, get this merged, and fix things in a separate push later. If this turns green, could you merge this?

@koslambrou
Contributor

@MaximilianAlgehed Yes I agree.

@koslambrou koslambrou merged commit 6e560c8 into IntersectMBO:main Jun 8, 2022
@zliu41
Member

zliu41 commented Jun 8, 2022

Sorry I just saw this. It's probably a similar issue. I'll open a ticket which hopefully I (or someone else) can work on in the next few days.

@UlfNorell UlfNorell deleted the PR-coverage-static-analysis-plutus-apps branch June 8, 2022 18:17
@MaximilianAlgehed MaximilianAlgehed restored the PR-coverage-static-analysis-plutus-apps branch June 9, 2022 13:15
koslambrou pushed a commit that referenced this pull request Jun 22, 2022
* fix golden test

* update one more golden test

* Static analysis to find unnecessary locations

* turn off tests that don't reproduce

* hscleanup
@MaximilianAlgehed MaximilianAlgehed deleted the PR-coverage-static-analysis-plutus-apps branch January 4, 2023 09:30
5 participants