
Testnet Integration test Plutus APP does not start #189

Closed
volodyad opened this issue Dec 12, 2021 · 25 comments
Assignees
Labels
bug Something isn't working

Comments

@volodyad

volodyad commented Dec 12, 2021

Summary

On running integration test PAB start takes forever,
used the latest main 46e831e

Steps to reproduce the behavior

Run the testnet example

Actual Result

I have been waiting around 3-4 hours and got only the following output. Tried several times.

Current block: 100000. Current slot: 27243687
Current block: 200000. Current slot: 33116605

CPU usage is excessive
[Screenshot 2021-12-12 at 17 56 47: CPU usage]

Expected Result

PAB starts successfully.

Describe the approach you would take to fix this

No response

System info

macOS Big Sur 11.5.2
16 GB RAM
2.6 GHz 6-Core Intel Core i7

46e831e

@volodyad volodyad added the bug Something isn't working label Dec 12, 2021
@volodyad volodyad changed the title Plutus APP does not start Testnet Integration test Plutus APP does not start Dec 12, 2021
@raduom
Contributor

raduom commented Dec 13, 2021

I am working on offering the possibility to synchronise the PAB starting with a given blockid. Unless your contract needs access to historical data, that should help a lot with the time required for synchronisation.

@raduom
Contributor

raduom commented Dec 13, 2021

There will also be a fix that will allow you to specify the maximum rollback that you want to do which will help with the memory consumption and CPU usage (which is due to some problem with the parallel GC in Haskell).

@raduom
Contributor

raduom commented Dec 13, 2021

If you want to reduce the CPU usage you can specify the following runtime options for your Haskell binary: +RTS -qg -I0. The option -qg will turn off the parallel garbage collector (reducing the CPU usage to at most 100%) and -I0 turns off idle GC, which will make the CPU usage go down when fully synchronised.
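
(For illustration, one way to pass those flags, assuming you run the examples through cabal and that the binary is linked with -rtsopts; the executable name and its other arguments here are placeholders, not something confirmed in this thread:)

  cabal run plutus-pab-examples -- <your-usual-arguments> +RTS -qg -I0 -RTS

If the binary was not built with -rtsopts, GHC will reject most RTS flags at startup; rebuilding with ghc-options: -rtsopts (or baking the flags in via -with-rtsopts) is the usual way around that.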

@volodyad
Author

volodyad commented Dec 13, 2021

Hello @raduom, two weeks ago sync took about 20 minutes; were there any changes that caused the sync time to increase?

@luigy
Contributor

luigy commented Dec 13, 2021

@volodyad reverting #174 brought the time to sync back down to numbers I was seeing 2 weeks ago

I am working on offering the possibility to synchronise the PAB starting with a given blockid.

@raduom Nice! This will certainly save me time :). Currently I have to wait 8 to 20 minutes to test my changes.

@raduom
Contributor

raduom commented Dec 13, 2021

Honestly, the 20 minutes for full synchronisation is a problem that we intend to solve by providing some sort of checkpointing (saving state as we process the chain). However, that is not really on the current sprint and we don't know exactly when we will be able to schedule it. We hope that the option to synchronise from a given block id will be an acceptable workaround until we get to implement persistent state for the PAB.

@volodyad
Author

volodyad commented Dec 13, 2021

@luigy , it helped, thank you

@raduom the issue is that on the main branch it does not start at all; I have been waiting for hours.

@mikekeke
Contributor

I see the same issue on this commit while trying to start the PAB for my contract. I didn't notice high CPU usage, but I did notice high RAM usage (over 7 GB) - I had to extend swap so the system wouldn't kill the PAB. After the swap was extended, I waited for an hour to sync, but the console log stayed at: Starting PAB backend server on port 9080

@mikekeke
Contributor

@luigy Can you tell whether that issue affects only chain-index or plutus-pab too? That is, can I use the latest commit for the PAB executable and build just chain-index from an older one?

@raduom
Contributor

raduom commented Dec 14, 2021

What network are you trying to synchronise with?

@mikekeke
Contributor

I was trying the public testnet (magic 1097911063).

@vlasin

vlasin commented Dec 15, 2021

I have a very similar experience to what was described above (Ubuntu 21.10). For me, the PAB integration test (and my own PAB program) breaks exactly at the December 7 commit (46527f2). I have tried 3 scenarios:

  1. Use the previous commit for both plutus-chain-index and plutus-pab-examples -> the test runs successfully.
  2. Use this commit for plutus-chain-index and the previous one for plutus-pab-examples -> the synchronization takes 5-10 minutes as usual, but the PAB is killed due to out-of-memory error (16 GB RAM).
  3. Use this commit for both plutus-chain-index and plutus-pab-examples -> I start getting synchronization messages after an hour+ but then an out-of-memory error before fully synchronizing. Also tried this on another PC (Ubuntu 20.04, 32 GB RAM) with the same result.

In all tests, I used cardano-node v1.30.1 and cardano-wallet v2021-11-11.

Probably, something needs adjusting in the PAB code after this particular commit?

@raduom
Contributor

raduom commented Dec 15, 2021

Fixing the memory leaks caused the RAM usage to go up (now you no longer have unevaluated thunks, you have the real deal). I tested the PAB with the shelley_qa testnet and it used around 8 GB to synchronise. I would really advise you to use that testnet.

I considered not fixing this issue (since I knew that it would temporarily increase RAM usage); however, certain queries would cause the unevaluated thunks to partially evaluate, which caused a much worse memory usage explosion, making the PAB's memory consumption unpredictable - which in my opinion is worse.

That said, I am working on a PR that will land today or tomorrow that allows you to specify on the command line (for the PAB) how much history you want to retain. If you don't want to test rollbacks then you can set the history to 1 block, which should help with the memory usage (and since the CPU usage is connected to the memory usage, it should help with that too).

If you are in a rush, you can temporarily change the 500 value to 1 here: https://github.com/input-output-hk/plutus-apps/blob/835ce24b7c89fa0d49e985029defc15b6732785e/plutus-chain-index-core/src/Plutus/ChainIndex/UtxoState.hs#L132
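
(For illustration only, the idea behind that constant is roughly as follows. This is a simplified sketch with made-up names, not the actual UtxoState code; it models the index as a plain list rather than the FingerTree the chain-index really uses:)

  -- Simplified sketch of depth-based trimming of the in-memory UTXO index.
  newtype UtxoIndex s = UtxoIndex [s]    -- newest block state first

  -- How many recent block states are kept around to support rollbacks; the
  -- constant referenced above plays this role (500 by default).
  maxRollbackDepth :: Int
  maxRollbackDepth = 1                   -- 1 if you never test rollbacks

  -- After inserting a new block state, drop everything older than the allowed depth.
  trimIndex :: UtxoIndex s -> UtxoIndex s
  trimIndex (UtxoIndex states) = UtxoIndex (take maxRollbackDepth states)

  insertBlock :: s -> UtxoIndex s -> UtxoIndex s
  insertBlock st (UtxoIndex states) = trimIndex (UtxoIndex (st : states))

Keeping that depth small bounds how much history stays live on the heap, which is why lowering the value reduces memory pressure (and, indirectly, GC/CPU load).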

There was another question about sync time increasing (@luigy). The memory leaks were due to unevaluated thunks, which we now fully evaluate. That is probably why it takes a bit longer.

I would also suspect that the huge sync times / failures are due to GHC's parallel GC going berserk under pressure. You can test that by turning off the parallel GC using the +RTS -qg -I0 arguments for the PAB binary.

@volodyad
Author

volodyad commented Dec 15, 2021

I have used +RTS -qg -I0
and changed the log settings to show more:

  if (s `mod` 10_000 == 0 && s > 0) || (s >= recentSlot)

In UtxoState I reverted the state to the previous version, without trimIndex:

  let (before, after) = FT.split ((s <=) . snd) ix

It got stuck around here; sync is very slow, roughly a minute per 10,000 slots:

Current block: 121107. Current slot: 28070000
Current block: 125566. Current slot: 28250000
Current block: 127509. Current slot: 28330000

At this slot cardano-node's CPU is no longer being used; on previous steps it was around 90%.
The Plutus app's CPU utilization is around 100%.
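
(Side note on the FT.split line quoted above: Data.FingerTree's split divides the structure at the point where a predicate on the accumulated measure first becomes true. A minimal, self-contained sketch of the same shape, using a toy Max-of-slot measure instead of the real UtxoState measure:)

  {-# LANGUAGE MultiParamTypeClasses, FlexibleInstances #-}
  import           Data.FingerTree (FingerTree, Measured (..), split)
  import qualified Data.FingerTree as FT
  import           Data.Semigroup  (Max (..))

  -- A toy per-block entry, measured by its slot number.
  newtype Block = Block { blockSlot :: Int } deriving Show

  instance Measured (Max Int) Block where
    measure = Max . blockSlot

  -- Split the index into blocks strictly before slot s and blocks from slot s
  -- onwards, mirroring the shape of the FT.split ((s <=) . snd) ix call above.
  splitAtSlot :: Int -> FingerTree (Max Int) Block
              -> (FingerTree (Max Int) Block, FingerTree (Max Int) Block)
  splitAtSlot s = split (\(Max slot) -> s <= slot)

  example :: (FingerTree (Max Int) Block, FingerTree (Max Int) Block)
  example = splitAtSlot 3 (FT.fromList [Block 1, Block 2, Block 3, Block 4])
  -- fst example holds Block 1 and Block 2; snd example holds Block 3 onwards.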

@raduom
Contributor

raduom commented Dec 15, 2021

There is little chance of good things happening without trimIndex. Give me a bit of time and I will have a PR addressing these memory issues (I suspect that the slowness is due to increased pressure on the GC, but we'll see more when I get there).

@volodyad
Author

It worked faster before #174; what else could affect this?

@raduom
Contributor

raduom commented Dec 15, 2021

Please read my previous answer. Specifically:

There was another question about sync time increasing (@luigy). The memory leaks were due to unevaluated thunks, which we now fully evaluate. That is probably why it takes a bit longer.

@raduom
Contributor

raduom commented Dec 15, 2021

The code here should work fine if you want to test it: #191
I managed to synchronise with shelley_qa (43kk blocks) while using only 6 GB of RAM. I still need to test it on the testnet, but the current changes look promising.

It still takes quite a bit of time though.

@vlasin

vlasin commented Dec 16, 2021

Thanks, @raduom. After changing the trimIndex parameter, it worked for me. And the memory usage seems more predictable indeed, although it took a few hours to synchronize. I will be waiting for more synchronization options for the PAB.

@raduom
Contributor

raduom commented Dec 16, 2021

I would run it with --rollback-history 1 if you don't need any rollbacks. Both memory usage and synchronisation speed are improved.
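
(For example, something along these lines, where the executable name and its other arguments are placeholders rather than anything confirmed in this thread:)

  cabal run plutus-pab-examples -- <your-usual-arguments> --rollback-history 1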

@volodyad
Author

volodyad commented Dec 17, 2021

Tried --rollback-history 1. After 20 minutes it slowed down at: Current block: 17822. Current slot: 6980000

Actually, what kind of data is loaded? As a workaround, can we set a start slot from which to begin syncing?
When just testing with a new wallet and a new contract, and not planning to use older UTXOs, can I just skip loading everything?

@raduom
Contributor

raduom commented Dec 17, 2021

There is a bug in the rollback-history code that I will get to fix today, probably.

Actually what kind of data is loaded? As a workardound can we setup start slot to start sync?

Yes. Probably at the beginning of next week.

@volodyad
Author

There is a bug in the rollback-history code that I will get to fix today, probably.

Please let me know if there is any success, so I can try it.

@raduom
Contributor

raduom commented Dec 20, 2021

You can try out: #210

@raduom raduom self-assigned this Dec 22, 2021
@raduom
Contributor

raduom commented Dec 22, 2021

If there is no further feedback on this issue, I would like to close it.

@raduom raduom closed this as completed Jan 18, 2022