Testnet Integration test Plutus APP does not start #189
Comments
I am working on offering the possibility to synchronise the PAB starting from a given block id. Unless your contract needs access to historical data, that should help a lot with the time required for synchronisation.
There will also be a fix that will allow you to specify the maximum rollback that you want to support, which will help with the memory consumption and CPU usage (which is due to some problem with the parallel GC in Haskell).
If you want to reduce the CPU usage you can specify the following runtime options for your Haskell binary:
Hello @raduom, two weeks ago sync took about 20 minutes. Were there any changes that caused the sync time to increase?
@volodyad reverting #174 brought the time to sync back down to numbers I was seeing 2 weeks ago
@raduom Nice! This will certainly save me time :). Currently I have to wait 8 to 20 minutes to test my changes.
Honestly the 20 minutes for full synchronisation is a problem that we intend to solve by providing some sort of checkpointing (saving state as we process the chain). However, that is not really on the current sprint and we don't know exactly when we will be able to schedule it. We hope that the option to synchronise given a certain block id will be an acceptable workaround until we get to implement persistent state for the PAB.
I see the same issue for this commit while trying to start the PAB for my contract. I didn't notice high CPU usage, but RAM usage was high (over 7 GB); I had to extend swap so the system wouldn't kill the PAB. After extending swap, I waited an hour for the sync, but the console log stayed at
@luigy Can you tell if that issue affects only chain-index or plutus-pab too? For example, can I use the latest commit for the PAB executable and build just chain-index from an older one?
What network are you trying to synchronise with?
I was trying the public testnet (magic 1097911063)
I have a very similar experience to what was described above (Ubuntu 21.10). For me, the PAB integration test (and my own PAB program) breaks exactly at the December 7 commit (46527f2). I have tried 3 scenarios:
In all tests, I used cardano-node v1.30.1 and cardano-wallet v2021-11-11. Perhaps something needs adjusting in the PAB code after this particular commit?
Fixing the memory leaks caused the RAM usage to go up (now you no longer have unevaluated thunks, you have the real deal). I tested the PAB with the I considered not fixing this issue (since I knew that it would temporarily increase RAM usage). However, certain queries would cause the unevaluated thunks to partially evaluate, which caused a much worse memory usage explosion and made the PAB's memory consumption unpredictable, which in my opinion is worse. That said, I am working on a PR that will land today or tomorrow that allows you to specify on the command line (for the PAB) how much history you want to retain. If you don't want to test rollbacks then you can set the history to 1 block, which should help with the memory usage (and since the CPU usage is connected to the memory usage it should help with that too). If you are in a rush you can temporarily change the 500 value to 1 here: https://github.com/input-output-hk/plutus-apps/blob/835ce24b7c89fa0d49e985029defc15b6732785e/plutus-chain-index-core/src/Plutus/ChainIndex/UtxoState.hs#L132 There was another question about sync time increasing (@luigi). The memory leaks were due to unevaluated thunks, which we now fully evaluate. That is probably why it takes a bit longer. I would also suspect that the huge sync times / failures are due to GHC's parallel GC going berserk under pressure. You can test that by turning off the parallel GC using the
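For readers unfamiliar with the "unevaluated thunks" mentioned above, here is a minimal, generic sketch of the effect. It is not taken from the PAB or chain-index code; `countLazy` and `countStrict` are hypothetical names, and the only assumption is the standard `containers` package. With the lazy map, each update stores a suspended `(+)` computation instead of a number, so the heap holds a growing chain of thunks until something forces the value. With the strict map, each value is evaluated as it is inserted.

```haskell
import Data.List (foldl')
import qualified Data.Map.Lazy as Lazy
import qualified Data.Map.Strict as Strict

-- Lazy variant: insertWith does not evaluate the combined value, so
-- every repeated key adds one more suspended (+) to a thunk chain.
countLazy :: [Int] -> Lazy.Map Int Int
countLazy = foldl' (\m k -> Lazy.insertWith (+) k 1 m) Lazy.empty

-- Strict variant: insertWith evaluates the combined value to weak head
-- normal form before storing it, so each entry holds a plain Int.
countStrict :: [Int] -> Strict.Map Int Int
countStrict = foldl' (\m k -> Strict.insertWith (+) k 1 m) Strict.empty
```

Both functions return the same map; the difference is only in when the additions run and what sits on the heap in the meantime. A query that forces part of a lazily built structure pays for the whole pending chain at once, which is one way memory use can become hard to predict.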
I used +RTS -qg -I0. In UtxoState I reverted the state to the previous version; without trimIndex it got stuck around , and sync is very slow: every 10000 takes around a minute.
At this slot cardano-node no longer uses any CPU; in the previous steps it was around 90%.
There is little chance of good things happening without trimIndex. Give me a bit of time and I will have a PR addressing these memory issues (I suspect that the slowness is due to increased pressure on the GC, but we'll see more when I get there).
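To illustrate why trimming matters, here is a rough sketch of the general idea of keeping a bounded rollback history. It is not the actual trimIndex implementation; `appendTrimmed`, `ingest`, and `rollback` are hypothetical names, and the per-block state is reduced to an arbitrary value. The point is only that memory stays proportional to the retained depth rather than to the length of the chain.

```haskell
import Data.Foldable (foldl')
import Data.Sequence (Seq, (|>))
import qualified Data.Sequence as Seq

-- Append the newest block state, then drop the oldest entries so that
-- at most `depth` states (and always at least one) are retained.
appendTrimmed :: Int -> a -> Seq a -> Seq a
appendTrimmed depth x hist =
  let grown  = hist |> x
      excess = Seq.length grown - max 1 depth
  in Seq.drop excess grown  -- a non-positive count drops nothing

-- Process a whole list of block states with a fixed history depth.
ingest :: Int -> [a] -> Seq a
ingest depth = foldl' (\hist x -> appendTrimmed depth x hist) Seq.empty

-- Roll back n blocks; fails when the retained history is too short.
rollback :: Int -> Seq a -> Maybe (Seq a)
rollback n hist
  | n < 0 || n >= Seq.length hist = Nothing
  | otherwise = Just (Seq.take (Seq.length hist - n) hist)
```

With a depth of 1 only the latest state survives, so no rollback is possible, which matches the trade-off described in the thread: less memory in exchange for not being able to test rollbacks.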
It worked faster before #174; what else could affect this?
Please read my previous answer. Specifically:
The code here should work fine if you want to test it: #191. It still takes quite a bit of time, though.
Thanks, @raduom. After changing the trimIndex parameter, it worked for me, and the memory usage does seem more predictable, although it took a few hours to synchronize. I will be waiting for more synchronization options for the PAB.
I would run it with
Tried --rollback-history 1. Actually, what kind of data is loaded? As a workaround, can we set a start slot to begin the sync from?
There is a bug in the rollback-history code that I will get to fix today, probably.
Yes. Probably at the beginning of next week.
Please let me know if you have any success, so I can try it.
You can try out: #210
If there is no further feedback on this issue, I would like to close it.
Summary
When running the integration test, the PAB takes forever to start.
Used the latest main, 46e831e.
Steps to reproduce the behavior
Run the testnet example
Actual Result
I have been waiting around 3-4 hours and got only. Tried several times.
CPU usage is excessive
Expected Result
PAB starts successfully
Describe the approach you would take to fix this
No response
System info
macOS Big Sur
11.5.2
16 GB
2,6 GHz 6-Core Intel Core i7
46e831e