PLT-172: Marconi index transactions that have been issued to a particular script #629
I don't think you need to understand blocks from the Vasil HF to make this PR work. You can just run the indexer on Cardano mainnet (where there is no Vasil HF), but not on testnet.
instance SQL.ToField ScriptAddress where
  toField (ScriptAddress hash) = SQL.SQLBlob . toStrict . serialise $ hash
instance SQL.FromField ScriptAddress where
  fromField _ = undefined -- todo
Should we do this before merging?
Yes. What's blocking this is extracting the script hash (linked in the PR description), which I haven't worked out yet (but I'm working on it!).
open dbPath (Depth k) = do
  ix <- fromJust <$> Ix.newBoxed query store onInsert k ((k + 1) * 2) dbPath
  let c = ix ^. Ix.handle
  SQL.execute_ c "CREATE TABLE IF NOT EXISTS script_transactions (scriptAddress TEXT NOT NULL, txCbor BLOB NOT NULL)"
I think I would rather have a primary key index on `scriptAddress`. Is there a reason why we are not doing that?
I might be wrong here, but since a script can be targeted by many transactions (?), the `scriptAddress` won't be unique and so it can't be a primary key. (It can still have an index on it though, so it would be faster to search for with SQL.)
(The `scriptAddress` primary key issue is also mentioned in the PR description.)
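To make the cardinality point concrete, here is a minimal sketch that models the table with `Data.Map` instead of SQLite. The type synonyms and sample rows are hypothetical stand-ins, not the PR's types. A unique key can hold only one transaction per script address, while a non-unique index (modelled here as a map to lists) keeps all of them and still allows keyed lookup.

```haskell
import qualified Data.Map.Strict as Map

-- Hypothetical stand-ins for the PR's types.
type ScriptAddress = String
type TxCbor = String

-- A unique key on scriptAddress leaves room for one transaction per script.
-- (SQLite would reject the duplicate insert; Map.fromList overwrites instead.
-- Either way only one row per script survives.)
uniqueKeyed :: [(ScriptAddress, TxCbor)] -> Map.Map ScriptAddress TxCbor
uniqueKeyed = Map.fromList

-- A non-unique index keeps every transaction and still supports keyed lookup.
indexed :: [(ScriptAddress, TxCbor)] -> Map.Map ScriptAddress [TxCbor]
indexed rs = Map.fromListWith (flip (<>)) [(a, [t]) | (a, t) <- rs]

sampleRows :: [(ScriptAddress, TxCbor)]
sampleRows = [("script-a", "tx-1"), ("script-a", "tx-2"), ("script-b", "tx-3")]

main :: IO ()
main = do
  print (Map.lookup "script-a" (uniqueKeyed sampleRows)) -- one transaction only
  print (Map.lookup "script-a" (indexed sampleRows))     -- both transactions
```

In the real table the equivalent would presumably be a plain (non-unique) index on the `scriptAddress` column.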
store :: ScriptTxIndex -> IO ()
store ix = do
  persisted <- Ix.getEvents $ ix ^. Ix.storage
This is a bug I just found. We only want to save things in the buffer, not in the events.
type Query = ScriptAddress
type Result = [TxCbor]

data ScriptTxUpdate = ScriptTxUpdate
I think we can probably make everything strict since these will eventually be persisted to disk.
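For illustration, strict fields could look like the sketch below. The field names and types are assumptions, since the record's actual fields are not shown in this thread. Note that a bang only forces the field to weak head normal form, so the elements of a list field stay lazy unless they are forced separately.

```haskell
-- Hypothetical stand-ins for the PR's types.
newtype TxCbor = TxCbor String deriving (Eq, Show)
newtype ScriptAddress = ScriptAddress String deriving (Eq, Show)

-- Hypothetical fields; the bang (!) makes each field strict, so it is
-- evaluated to weak head normal form when the record is constructed.
data ScriptTxUpdate = ScriptTxUpdate
  { txScripts :: ![(TxCbor, [ScriptAddress])]
  , slotNo    :: !Int
  } deriving (Eq, Show)

main :: IO ()
main = print (ScriptTxUpdate [(TxCbor "tx-1", [ScriptAddress "script-a"])] 42)
```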
map fst $ filter (\(_, addrs) -> scriptAddress' `elem` addrs) update

both :: [TxCbor]
both = buffered <> map (\(SQL.Only txCbor') -> txCbor') persisted
I think an `Iso` may look better here, but this also works.
@@ -14,7 +14,7 @@ You need to download the configurations for the node, genesis blocks and the top
I am using a shell script to start the node that I will paste here:

```shell
-#!/bin/bash
+#!/usr/bin/env bash
In the future, if you see a quick fix like this, create a separate PR for it, as there are many more people who can review and accept it.
Yup, already did in #618! (As that one is now merged, I can rebase and the commit won't be part of this PR anymore.)
18abf13 to 3139436
28081a4 to 8209c09
@koslambrou @raduom Regarding adding tests: marconi is currently an executable within the plutus-chain-index package, but to add tests I think it needs to be converted into a library (+ executable), so that the tests can import functions from it.
In discussion with @raduom we found that:
Thus, would it make sense to wait for #524 to be merged and then add an integration test in a later PR/Jira to check that everything works as expected? @koslambrou
8209c09 to 793e904
Right, not in the indexer itself. The pure function that would be nice to test is
That's right, you can work on the integration test in a later PR :)
plutus-chain-index/app/Marconi.hs
Outdated
txScripts :: forall era . Tx era -> [ScriptTx.ScriptAddress]
txScripts tx = let
  Tx (body :: C.TxBody era) _ws = tx
  map' = plutusScriptsFromTxBody body :: M.Map Ledger.ScriptHash Ledger.Script
I don't suggest using `plutusScriptsFromTxBody`. Marconi is an off-chain component, so it should be using types for off-chain use (like `cardano-api`). Anything with `plutus-ledger-api` types is really meant for on-chain use.
I suggest reimplementing it in Marconi (something named like `getScriptHashesFromTx`) and writing tests for it.
`plutusScriptsFromTxBody` hashes the script by
- deserialising it into a plutus-ledger `Script` (defined in Plutus.V1.Ledger.Scripts)
- scriptHash serialises it again to bytes and makes a cardano-api `Script` (defined in Cardano.Api.Script; that script wraps the serialised bytes), then hashes that script

If the goal is to skip the plutus-ledger `Script`, is it OK if I wrap the received bytes into cardano-api's `Script` directly? I do this here, specifically within `mkCardanoApiScript`, which wraps the incoming bytes into a cardano-api `Script` (the previous way to get this script went through deserialisation/serialisation steps).
Otherwise, I haven't found a way to go from the bytes I receive to a cardano-api `Script` without using anything from the ledger.
Regarding tests: since it's a property test, if I generate arbitrary scripts, serialise them and get a hash, then put the scripts into a transaction and check that I get the same scripts (= same hashes) out again, what would it be testing? (They are the same by definition.) Or perhaps there are multiple locations to put those scripts, and I'd see if I lose any by not handling a branch?
Hmm... ideally we would use types from `cardano-api`, not `cardano-ledger`. I recommend not pattern matching on the body of the transaction. I recommend pattern matching with pattern TxBody :: TxBodyContent ViewTx era -> TxBody era in `Cardano.Api.TxBody`.
There are some examples in `Ledger.Tx.CardanoAPI`.
Regarding property testing, here's a starting point.
You can test the `txScripts` function as follows. Generate a random number of plutus scripts (see `genPlutusScript` from `Gen.Cardano.Api.Typed`), then generate a `TxIns` which spends funds from these scripts, then generate a transaction with this `TxIns` (based on `genTxBodyContent`), run your `txScripts` on this generated tx, and verify that the initially generated plutus scripts are part of the function's output.
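The shape of such a property can be sketched with toy types. Everything below is a hypothetical stand-in: `ToyTx`, its three script locations, and `mkTx` are not cardano-api types, and a real test would use Hedgehog generators rather than a fixed list of inputs. The sketch also shows what the property can catch: if `txScripts` forgot one of the locations, some generated scripts would be missing from its output.

```haskell
import qualified Data.Set as Set

-- Toy stand-in for a script hash.
newtype ScriptHash = ScriptHash Int deriving (Eq, Ord, Show)

-- Toy transaction with several places where a script can live.
data ToyTx = ToyTx
  { witnessScripts   :: [ScriptHash] -- scripts run to validate spent inputs
  , referenceScripts :: [ScriptHash] -- scripts attached to outputs
  , auxScripts       :: [ScriptHash] -- scripts in auxiliary data
  } deriving Show

-- Function under test: collect hashes from every location.
txScripts :: ToyTx -> [ScriptHash]
txScripts tx = witnessScripts tx <> referenceScripts tx <> auxScripts tx

-- Build a transaction that spreads the scripts over all three locations.
mkTx :: [ScriptHash] -> ToyTx
mkTx hs = ToyTx (pick 0) (pick 1) (pick 2)
  where
    pick r = [h | (i, h) <- zip [0 :: Int ..] hs, i `mod` 3 == r]

-- Property: every script put into the transaction is found again.
propAllScriptsFound :: [ScriptHash] -> Bool
propAllScriptsFound hs =
  Set.fromList hs `Set.isSubsetOf` Set.fromList (txScripts (mkTx hs))

main :: IO ()
main = print (all propAllScriptsFound [map ScriptHash [1 .. n] | n <- [0 .. 20]])
```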
It seems that the scripts I'm picking out and using are not included in the pattern TxBody: https://github.com/input-output-hk/cardano-node/blob/master/cardano-api/src/Cardano/Api/TxBody.hs#L2070
The question is whether the scripts that I pick out are the right ones. I also found that inside TxBody the Script type exists under:
- TxBodyContent{txOuts} -> TxOut -> ReferenceScript -> ScriptInAnyLang
- TxBodyContent{txReturnCollateral} -> TxReturnCollateral -> TxOut
- TxBodyContent{txAuxScripts} -> TxAuxScripts -> ScriptInEra
I would guess the scripts at the inputs (the ones that I'm using already and that come in as serialised) are the important ones, as those are the ones that are run.
Another idea is that maybe the plutus-streaming library (the one that provides `withChainSyncEventStream`, which provides the blocks) can convert scripts to the correct type, so that using only cardano-api downstream is sufficient.
@koslambrou After much confusion, I managed to boil it down to two questions:
> generate an TxIns which spends funds from these scripts

When a `TxIn` spends funds from a script, does that mean that its `TxId` field must be set to the script's hash? (`TxIn` is defined as `data TxIn = TxIn TxId TxIx` and the only other field in it, `TxIx`, is a newtype for a `Word`, so it can't possibly refer to a script.)
> generate a transaction with this TxIns (based on genTxBodyContent),

Does this imply using `makeTransactionBody`, which will create a "ledger transaction" out of the cardano-api's TxBodyContent?
@koslambrou I have managed to come up with this property test:
https://github.com/input-output-hk/plutus-apps/blob/edc8da432275c017ceba3a3aea24bbe9cb8e5e4b/plutus-chain-index-core/test/MarconiSpec.hs#L26-L66
There is still a piece missing: what does it mean to "spend a script that I created"? In the above I've tried to generate the same number of TxIns as there are scripts and put the scripts' hashes into the `TxId` field of `TxIn` -- is this the correct way to think about "spending a script"?
2581d7c to 79d644d
b050fef to edc8da4
Pushed another iteration where I create a first tx in which the scripts I've generated are part of the outputs, then create a second transaction whose inputs are the outputs of the first one, so that I can "spend the scripts". It's still very much WIP: the test can't possibly work, because running makeTransactionBody on the second transaction doesn't have access to the generated scripts nor the hashes, because the inputs ( Thus I wonder: is it implied that I need to run a local testnet in this property test too, so that when running the second transaction the TxIns could at least be looked up and the script field thus populated?
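A toy model of the UTxO relationship may help with the "spending a script" question. In the UTxO model a `TxIn` names an earlier output by the id of the transaction that created it plus an output index; the script hash lives in that output's address, not in the `TxId`. The types below are simplified, hypothetical stand-ins (not cardano-api types), and the sketch ignores that a real spending transaction also carries the script among its witnesses.

```haskell
newtype TxId = TxId String deriving (Eq, Show)
newtype ScriptHash = ScriptHash String deriving (Eq, Show)

-- Mirrors the shape of `data TxIn = TxIn TxId TxIx`, with Int for the index.
data TxIn = TxIn TxId Int deriving (Eq, Show)

-- Toy output locked by a script: the address carries the script hash.
data TxOut = TxOut { outAddress :: ScriptHash, outValue :: Int }

data Tx = Tx { txId :: TxId, txIns :: [TxIn], txOuts :: [TxOut] }

-- Inputs that spend every output of an earlier transaction.
spendAll :: Tx -> [TxIn]
spendAll tx = [TxIn (txId tx) ix | (ix, _) <- zip [0 ..] (txOuts tx)]

-- Resolving a TxIn to its locking script requires the earlier transactions.
scriptOf :: [Tx] -> TxIn -> Maybe ScriptHash
scriptOf ledger (TxIn tid ix) =
  case [o | tx <- ledger, txId tx == tid, (i, o) <- zip [0 ..] (txOuts tx), i == ix] of
    (o : _) -> Just (outAddress o)
    []      -> Nothing

-- First transaction: pays to two script addresses.
tx1 :: Tx
tx1 = Tx (TxId "tx-1") [] [TxOut (ScriptHash "script-a") 10, TxOut (ScriptHash "script-b") 5]

main :: IO ()
main = do
  let ins = spendAll tx1
  print ins                        -- the inputs mention only "tx-1" and an index
  print (map (scriptOf [tx1]) ins) -- the script hashes come from tx1's outputs
```

Looking at the inputs alone, there is no script hash to be found, which matches the observation that the second transaction by itself cannot recover the generated scripts.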
ac12373 to d253cfb
Ready for review; I now index all types of scripts and also test them, and the tests pass.
-- * Copy-paste
--
-- | TODO: Remove when the following function is exported from Cardano.Api.Script
Point the link to the PR here.
LGTM!
7913b1d to 59b2d09
59b2d09 to efc9834
f604ffb to ca502a5
I think there may be some bugs in the indexer.
  where
    txScripts' = map (\tx -> (TxCbor $ C.serialiseToCBOR tx, getTxScripts tx)) txs

getTxBodyScripts :: forall era . C.TxBody era -> [ScriptAddress]
I think this may be useful to someone else. Should we not move this to something like `CardanoAPI`?
Yes! For now, I'd say we keep it here until we decide how to structure cardano-api related code. For example, it might make sense to create a dedicated `cardano-api-extended` package containing a bunch of these functions, which can then be pushed upstream once stabilized and tested.
Maybe this could be done in another PR?
But I guess yes, if the function just returned `Shelley.ScriptHash` (`ScriptAddress` is a newtype around it), then I guess Cardano.Api.Shelley could be a good location for it.
A related thought: if we get a hoogle, then writing this function could be done in two steps: one would be `getTxBodyScriptHashes :: forall era . C.TxBody era -> [ScriptHash]` and the other would be the function we currently have, which just wraps `ScriptHash` into a newtype. This way hooglers can find the function and, if it's useful enough, it can be moved into a more suitable module.
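A minimal sketch of that two-step split, using toy types: `TxBody` here is a hypothetical stand-in that simply carries its script hashes, not the cardano-api `TxBody`. Because `ScriptAddress` is a newtype around `ScriptHash`, the wrapping step can be a zero-cost `coerce`.

```haskell
import Data.Coerce (coerce)

newtype ScriptHash = ScriptHash String deriving (Eq, Show)

-- ScriptAddress is a newtype around ScriptHash, as described in the thread.
newtype ScriptAddress = ScriptAddress ScriptHash deriving (Eq, Show)

-- Hypothetical stand-in for a transaction body.
newtype TxBody = TxBody [ScriptHash]

-- Step 1: the general-purpose function that others could discover and reuse.
getTxBodyScriptHashes :: TxBody -> [ScriptHash]
getTxBodyScriptHashes (TxBody hs) = hs

-- Step 2: the indexer-specific wrapper, which only changes the type.
getTxBodyScripts :: TxBody -> [ScriptAddress]
getTxBodyScripts = coerce . getTxBodyScriptHashes

main :: IO ()
main = print (getTxBodyScripts (TxBody [ScriptHash "h1", ScriptHash "h2"]))
```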
I think we can leave it for another PR.
store :: ScriptTxIndex -> IO ()
store ix = do
  persisted <- Ix.getEvents $ ix ^. Ix.storage
We should not store the events, only what we have in the buffer.
I now only store the buffered events here (83ac994). I think there was the same bug in the Utxo indexer as well; does it look OK there now too?
"SELECT txCbor FROM utxos WHERE scriptAddress = ?" (SQL.Only scriptAddress')

let
  buffered :: [TxCbor]
When querying we also need to account for `buffered` events.
Here I get the transaction cbors for a script address from the sqlite database:
https://github.com/input-output-hk/plutus-apps/blob/83ac994f4f7b457f6d5517adca3d590a070fbc09/plutus-chain-index-core/src/Marconi/Index/ScriptTx.hs#L121
and here I get transaction cbors from the buffered events' sequence:
https://github.com/input-output-hk/plutus-apps/blob/83ac994f4f7b457f6d5517adca3d590a070fbc09/plutus-chain-index-core/src/Marconi/Index/ScriptTx.hs#L127
I.e. I do query both and return them from the function (?).
The `updates` parameter only contains the events though. So basically you need to do what you did before for `store`. The way this works right now is that we keep K events in memory plus some buffered events so we can batch updates to the database (which makes inserting rows much more efficient than doing it one by one).
So when we store things on disk, we only store the things that are buffered and leave the K blocks in memory (since they can still be rolled back); but when we query we want to account for all events, including the ones that are currently being buffered.
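The store/query asymmetry can be sketched with a small pure model. This is not the indexer library's API: the `Index` record, the list-based storage and the `insert` helper are assumptions made for illustration, with a `persisted` list standing in for the SQLite table. It assumes that events older than the K most recent ones move into the buffer.

```haskell
-- Pure toy model of the index state (hypothetical, oldest events first).
data Index e = Index
  { events    :: [e] -- the K most recent events; these may still be rolled back
  , buffer    :: [e] -- older events waiting to be written in one batch
  , persisted :: [e] -- stands in for rows already written to the database
  } deriving (Eq, Show)

-- Insert a new event; anything beyond the K most recent moves to the buffer.
insert :: Int -> e -> Index e -> Index e
insert k e ix =
  let es = events ix <> [e]
      (old, recent) = splitAt (length es - k) es
  in ix { events = recent, buffer = buffer ix <> old }

-- Storing flushes only the buffer; the K in-memory events are left alone.
store :: Index e -> Index e
store ix = ix { buffer = [], persisted = persisted ix <> buffer ix }

-- Querying looks at everything: persisted rows, buffered and in-memory events.
query :: (e -> Bool) -> Index e -> [e]
query p ix = filter p (persisted ix <> buffer ix <> events ix)

main :: IO ()
main = do
  let ix1 = foldl (flip (insert 2)) (Index [] [] []) [1 .. 5 :: Int]
      ix2 = store ix1
  print ix1
  print ix2
  print (query even ix1 == query even ix2) -- storing must not change query results
```

The check at the end states the invariant under discussion: flushing the buffer to disk should never change what a query returns.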
let hashesFound = map coerce $ ScriptTx.getTxBodyScripts txBody :: [ScriptHash]
assert $ S.fromList scriptHashes == S.fromList hashesFound

genTxBodyWithTxIns
Doesn't the node export generators for these data structures already?
Not exactly. They are very limited and would really need to be reworked. Again, it would make sense to put these generators in a common package so that they can be pushed upstream.
My concern is that generators are pretty difficult to get right. You need to think about efficient ways of shrinking them, and what statistical distribution you need to test your scenarios, and how to check that the distribution you selected is actually testing those scenarios. I am not sure we are the ones that should be doing this.
I agree. The problem is that this is not a priority for the node team. We can probably do what the Hydra team did: use the generators from `cardano-ledger` and convert the resulting datatype to a cardano-api type.
Ultimately, cardano-api would need to have good generators in them, and I think this is a cross-team effort (Hydra, Djed, Plutus Tools). So I think it's okay for us to start contributing in that direction IMO.
d40d7e0 to 800d8f9
800d8f9 to 6a56e37
@koslambrou This MR is now ready to merge!
This is a draft PR for the jira issue marconi indexer: transactions that have been issued to a particular script. (This PR is also wrongly based off of the `next-node` branch because it can understand the blocks since the Vasil hard-fork -- will rebase once I manage to resolve the issues below.)
A concrete code thing I need help with is extracting script hashes from a transaction here: how/where would I find the type(s)?
Another question is that the Jira issue says to set script address as the primary key, but if this is an index from transaction to script, then this is a 1:n relationship because (as far as I understand) a single script will possibly be targeted/run by many transactions.
I've also read the indexer's code (the hysterical-screams) but can't say it's fully clear yet :), so would be good to see if the onInsert/store/query triple is correct.
Any other comments welcome as well!
(The main commits are 9a2e24e and 79a9d7f, the two previous ones already exist as separate PRs.)
Pre-submit checklist: