Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Manually test that the Zebra and zcashd mempools are similar #5935

Closed
1 of 3 tasks
Tracked by #5937 ...
teor2345 opened this issue Jan 9, 2023 · 14 comments
Closed
1 of 3 tasks
Tracked by #5937 ...

Manually test that the Zebra and zcashd mempools are similar #5935

teor2345 opened this issue Jan 9, 2023 · 14 comments
Assignees
Labels
A-compatibility Area: Compatibility with other nodes or wallets, or standard rules A-consensus Area: Consensus rule updates A-rpc Area: Remote Procedure Call interfaces C-testing Category: These are tests

Comments

@teor2345
Copy link
Contributor

teor2345 commented Jan 9, 2023

Motivation

Zebra and zcashd should have similar transactions in their mempools:

  • Zebra must only put valid transactions into mined blocks
  • Users expect Zebra to relay and mine their transactions if zcashd does

Specifications

Mempools are governed by two different kinds of rules:

  • Consensus rules must be the same in Zebra and zcashd:
    • Zebra's getblocktemplate RPC must create valid blocks
    • Zebra must not accept transactions into its mempool that zcashd says are invalid when mined into blocks
  • Standard rules can be different between node implementations, but:
    • Users will report bugs if their transactions don't get mined by Zebra
    • If Zebra becomes a large part of the network, users will report bugs if their transactions don't get relayed by Zebra

Complex Code or Requirements

If Zebra's mempool accepts a transaction that zcashd rejects in mined blocks, this is a high-priority bug that we must fix. These bugs are blockers for activating the mining RPCs in production.

If Zebra's mempool rejects a transaction that zcashd's mempool accepts, this is a medium-priority bug that we need to document or fix.

Testing

I think we can use zcash-rpc-diff with the getrawmempool RPC to test this. We might need to match zcashd's transaction order to minimise the diff.

If it's the same order as the getblocktemplate RPC, we can do a similar change to PR #5867.

Please check your box when you have successfully run the zcash-rpc-diff script with getrawmempool (see #5994 for running the test continuously until there's a diff to examine):

Related Work

@teor2345 teor2345 added A-consensus Area: Consensus rule updates S-needs-triage Status: A bug report needs triage P-Medium ⚡ C-testing Category: These are tests A-rpc Area: Remote Procedure Call interfaces A-compatibility Area: Compatibility with other nodes or wallets, or standard rules labels Jan 9, 2023
@teor2345
Copy link
Contributor Author

Hey team! Please add your planning poker estimate with Zenhub @arya2 @conradoplg @dconnolly @oxarbitrage

@arya2 arya2 self-assigned this Jan 16, 2023
@arya2
Copy link
Contributor

arya2 commented Jan 17, 2023

I ran zcash-rpc-diff about 10-15 times where the Zebra and zcashd mempools contained the same transactions in all instances:

➜  zebra git:(main) ZCASH_CLI=../zcash/src/zcash-cli ./zebra-utils/zcash-rpc-diff 3000 getrawmempool
Checking first node release info...
Checking second node release info...
Connected to zebrad (port 3000) and zcashd (../zcash/src/zcash-cli zcash.conf port).

Checking zebrad network and tip height...
Checking zcashd network and tip height...

Request:
getrawmempool

Querying zebrad main chain at height >=1952062...

real    0m0.004s
user    0m0.004s
sys     0m0.000s

Querying zcashd main chain at height >=1952062...

real    0m0.005s
user    0m0.000s
sys     0m0.005s


Response diff between zebrad and zcashd:
RPC responses were identical

/tmp/tmp.VSTLfzYXeB.rpc-diff/zebrad-main-1952062-getrawmempool.json:
[
  "2708e06750a43fd7067dfd50fc551b2ad284774f531e5be2a86b7457bce2abf1",
  "76293fcea388f6fcc1c2c4c33e3bfb50bbabc8b7727b154369bfaa1064803bd3",
  "b22f81923775370d77e64dd9c546d9895e0dec413b10bcf80c73648a28a01cd0",
  "b3421236898018882904ed7ee815c769f9e9f8e74d4d8b161a755ca2f3d0caea"
]
➜  zebra git:(main) ZCASH_CLI=../zcash/src/zcash-cli ./zebra-utils/zcash-rpc-diff 3000 getrawmempool
Checking first node release info...
Checking second node release info...
Connected to zebrad (port 3000) and zcashd (../zcash/src/zcash-cli zcash.conf port).

Checking zebrad network and tip height...
Checking zcashd network and tip height...

Request:
getrawmempool

Querying zebrad main chain at height >=1952062...

real    0m0.005s
user    0m0.004s
sys     0m0.000s

Querying zcashd main chain at height >=1952062...

real    0m0.005s
user    0m0.000s
sys     0m0.005s


Response diff between zebrad and zcashd:
--- /tmp/tmp.SaJ4BZ7XKp.rpc-diff/zebrad-main-1952062-getrawmempool.json 2023-01-17 16:00:36.589263600 -0500
+++ /tmp/tmp.SaJ4BZ7XKp.rpc-diff/zcashd-main-1952062-getrawmempool.json 2023-01-17 16:00:36.599263600 -0500
@@ -1,7 +1,7 @@
 [
-  "1b4e852c5df34faa737478f11e2e114a41ff4d03643eaa263eb4e42f686a882f",
   "2708e06750a43fd7067dfd50fc551b2ad284774f531e5be2a86b7457bce2abf1",
   "76293fcea388f6fcc1c2c4c33e3bfb50bbabc8b7727b154369bfaa1064803bd3",
+  "1b4e852c5df34faa737478f11e2e114a41ff4d03643eaa263eb4e42f686a882f",
   "b22f81923775370d77e64dd9c546d9895e0dec413b10bcf80c73648a28a01cd0",
   "b3421236898018882904ed7ee815c769f9e9f8e74d4d8b161a755ca2f3d0caea"
 ]

➜  zebra git:(main) ZCASH_CLI=../zcash/src/zcash-cli ./zebra-utils/zcash-rpc-diff 3000 getrawmempool
Checking first node release info...
Checking second node release info...
Connected to zebrad (port 3000) and zcashd (../zcash/src/zcash-cli zcash.conf port).

Checking zebrad network and tip height...
Checking zcashd network and tip height...

Request:
getrawmempool

Querying zebrad main chain at height >=1952062...

real    0m0.005s
user    0m0.005s
sys     0m0.000s

Querying zcashd main chain at height >=1952062...

real    0m0.006s
user    0m0.005s
sys     0m0.000s


Response diff between zebrad and zcashd:
--- /tmp/tmp.9TvHXZ6ANs.rpc-diff/zebrad-main-1952062-getrawmempool.json 2023-01-17 16:00:51.549263841 -0500
+++ /tmp/tmp.9TvHXZ6ANs.rpc-diff/zcashd-main-1952062-getrawmempool.json 2023-01-17 16:00:51.549263841 -0500
@@ -1,8 +1,8 @@
 [
-  "1818cb2013a18d0329edb85442570430bf9ae65cedde73a8a76e14c0df8cdf9d",
-  "1b4e852c5df34faa737478f11e2e114a41ff4d03643eaa263eb4e42f686a882f",
   "2708e06750a43fd7067dfd50fc551b2ad284774f531e5be2a86b7457bce2abf1",
   "76293fcea388f6fcc1c2c4c33e3bfb50bbabc8b7727b154369bfaa1064803bd3",
+  "1b4e852c5df34faa737478f11e2e114a41ff4d03643eaa263eb4e42f686a882f",
   "b22f81923775370d77e64dd9c546d9895e0dec413b10bcf80c73648a28a01cd0",
+  "1818cb2013a18d0329edb85442570430bf9ae65cedde73a8a76e14c0df8cdf9d",
   "b3421236898018882904ed7ee815c769f9e9f8e74d4d8b161a755ca2f3d0caea"
 ]

More:

➜  zebra git:(main) ZCASH_CLI=../zcash/src/zcash-cli ./zebra-utils/zcash-rpc-diff 3000 getrawmempool
Checking first node release info...
Checking second node release info...
Connected to zebrad (port 3000) and zcashd (../zcash/src/zcash-cli zcash.conf port).

Checking zebrad network and tip height...
Checking zcashd network and tip height...

Request:
getrawmempool

Querying zebrad main chain at height >=1952063...

real    0m0.005s
user    0m0.004s
sys     0m0.000s

Querying zcashd main chain at height >=1952063...

real    0m0.005s
user    0m0.004s
sys     0m0.000s


Response diff between zebrad and zcashd:
--- /tmp/tmp.vDLkCTchmm.rpc-diff/zebrad-main-1952063-getrawmempool.json 2023-01-17 16:03:19.169266223 -0500
+++ /tmp/tmp.vDLkCTchmm.rpc-diff/zcashd-main-1952063-getrawmempool.json 2023-01-17 16:03:19.169266223 -0500
@@ -1,13 +1,13 @@
 [
   "0fa43579dcfb8ac9796c19ef03e0e01e69d11bcc782876b86e30e6548dfba4b0",
+  "c2a841c70a41ff464a4785e0ccf23b074cb10079dfabf1b047c0023d47d2c71f",
+  "d62fae7ed8dfe890a1c831b98a33cb0aa337a40e3c2acd01b3e47665c69658d4",
   "280229a13b08bb0a2101c4d29f01bbc31e168092fd2ba16766d5dea8d3e1a6f9",
-  "3a224119d8c17313b59bb9df30d1db5b11934f2379b5a734e8cccf8d1e30dd5e",
+  "5b2087be86606ca075dc63152164444e996c29bbae03a6cef00e0d39726a4079",
   "48f8724363fcaaa41667fef3239904cb03b91eb84cce13efa6a24b316d75c26c",
+  "3a224119d8c17313b59bb9df30d1db5b11934f2379b5a734e8cccf8d1e30dd5e",
   "546fcc40a76f31a9f5e77ab5799b090cf969744de12f0a9848a1fa3709798e29",
-  "5b2087be86606ca075dc63152164444e996c29bbae03a6cef00e0d39726a4079",
-  "86a72b6a0cad53f394e926efe331c0dbff637ae896457b924e4a4f66c7f8e43d",
-  "b3421236898018882904ed7ee815c769f9e9f8e74d4d8b161a755ca2f3d0caea",
-  "c2a841c70a41ff464a4785e0ccf23b074cb10079dfabf1b047c0023d47d2c71f",
   "cb35a0e6ed6f4bd2a797f272c7761fac05b752b5083561203ac46f5be3f3b3d9",
-  "d62fae7ed8dfe890a1c831b98a33cb0aa337a40e3c2acd01b3e47665c69658d4"
+  "86a72b6a0cad53f394e926efe331c0dbff637ae896457b924e4a4f66c7f8e43d",
+  "b3421236898018882904ed7ee815c769f9e9f8e74d4d8b161a755ca2f3d0caea"
 ]

Checking first node release info...
Checking second node release info...
Connected to zebrad (port 3000) and zcashd (../zcash/src/zcash-cli zcash.conf port).

Checking zebrad network and tip height...
Checking zcashd network and tip height...

Request:
getrawmempool

Querying zebrad main chain at height >=1952102...

real    0m0.006s
user    0m0.000s
sys     0m0.005s

Querying zcashd main chain at height >=1952102...

real    0m0.014s
user    0m0.001s
sys     0m0.012s


Response diff between zebrad and zcashd:
RPC responses were identical

/tmp/tmp.6mTKh9m21n.rpc-diff/zebrad-main-1952102-getrawmempool.json:
[
  "ca92d27981f51e840d30c696274cfcd20e57daf6dca9e3bc8c1d15e78ff26daf",
  "cc446fb6c28c5ea83a7c5501c913a515c8ac5de958ce61cb74e96c7cc13c2eca"
]

@teor2345
Copy link
Contributor Author

I ran zcash-rpc-diff about 10-15 times where the Zebra and zcashd mempools contained the same transactions in all instances:

Thanks!

How many developers do you think should do this manual test to be sure?
Can you add some checkboxes to this ticket for everyone working on Zebra right now?

@arya2
Copy link
Contributor

arya2 commented Jan 17, 2023

How many developers do you think should do this manual test to be sure?

The more the better. I wonder if we'd like to write an automated test that checks every 5-10 seconds and ask that it be left running for a few hours?

Can you add some checkboxes to this ticket for everyone working on Zebra right now?

Yep!

@teor2345
Copy link
Contributor Author

How many developers do you think should do this manual test to be sure?

The more the better. I wonder if we'd like to write an automated test that checks every 5-10 seconds and ask that it be left running for a few hours?

Here are the scripts I have been using for continuous proposal tests:
#5803 (comment)

@arya2
Copy link
Contributor

arya2 commented Jan 18, 2023

How many developers do you think should do this manual test to be sure?

The more the better. I wonder if we'd like to write an automated test that checks every 5-10 seconds and ask that it be left running for a few hours?

Here are the scripts I have been using for continuous proposal tests: #5803 (comment)

I could sort the output then so zcash-rpc-diff can fail when the diffs don't match up exactly (zcashd sorts by fee and size) but I've observed Zebra sometimes returning a transaction hash a second earlier than zcashd so that might still be problematic.

@teor2345
Copy link
Contributor Author

How many developers do you think should do this manual test to be sure?

The more the better. I wonder if we'd like to write an automated test that checks every 5-10 seconds and ask that it be left running for a few hours?

Here are the scripts I have been using for continuous proposal tests: #5803 (comment)

I could sort the output then so zcash-rpc-diff can fail when the diffs don't match up exactly (zcashd sorts by fee and size) but I've observed Zebra sometimes returning a transaction hash a second earlier than zcashd so that might still be problematic.

Feel free to change the RPC Rust code, no-one should be depending on the order of the output.

I think it's ok to run manual tests for a while, then manually check any discrepancies. There's not much we can do if one of the nodes is a bit faster to verify new transactions.

@mpguerra mpguerra moved this to 🛑 Won't Fix in Zebra Jan 19, 2023
@mpguerra mpguerra added this to Zebra Jan 19, 2023
@mpguerra mpguerra moved this from 🛑 Won't Fix to 📋 Sprint Backlog in Zebra Jan 19, 2023
@mpguerra mpguerra moved this from 📋 Sprint Backlog to 🏗 In progress in Zebra Jan 19, 2023
@oxarbitrage
Copy link
Contributor

We want to get additional data from the differences that the zcashd and zebrad mempools can have at different times. We know this happens. Some of them could be normal and others could be bugs.

So, in addition to just compare the differences of the getrawmempool rpc calls in the two implementations, we want to get additional transaction data and blockchain height . This can be done by calling an extra rpc method getrawtransaction after getrawmempool using the hash we got from getrawmempool.

@teor2345
Copy link
Contributor Author

We also need the tip block hash, in case there is a chain fork 🙂

@teor2345
Copy link
Contributor Author

@oxarbitrage just checking if you are working on these script changes?

@oxarbitrage
Copy link
Contributor

@oxarbitrage just checking if you are working on these script changes?

Yes, i am still working on it. The problem is that we have the hash (transaction id) from the getrawmempool call and then we can call getrawtransaction with it. However, in the current zebra implementation we just get the raw hex of the transaction. This will not be enough for our purposes so we need to call rust for deserialization, which is a bit more work.

Other ideas are welcome.

@teor2345
Copy link
Contributor Author

@oxarbitrage just checking if you are working on these script changes?

Yes, i am still working on it. The problem is that we have the hash (transaction id) from the getrawmempool call and then we can call getrawtransaction with it. However, in the current zebra implementation we just get the raw hex of the transaction. This will not be enough for our purposes so we need to call rust for deserialization, which is a bit more work.

Other ideas are welcome.

We could call the decoderawtransaction RPC from zcashd on the transaction data?
https://zcash.github.io/rpc/decoderawtransaction.html

That's what I have been using to diagnose getblocktemplate RPC differences and consensus bugs.

@mpguerra
Copy link
Contributor

mpguerra commented Feb 2, 2023

I think we're done here 🎉

@mpguerra mpguerra closed this as completed Feb 2, 2023
@github-project-automation github-project-automation bot moved this from 🏗 In progress to ✅ Done in Zebra Feb 2, 2023
@teor2345
Copy link
Contributor Author

teor2345 commented Feb 3, 2023

If anyone is looking, we did a lot of testing and diagnosing differences in this PR's comments:
#6035 (comment)

The most frequent difference was Zebra verifying transactions faster than zcashd, which is expected because we do it in parallel.

@mpguerra mpguerra removed the S-needs-triage Status: A bug report needs triage label Mar 16, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-compatibility Area: Compatibility with other nodes or wallets, or standard rules A-consensus Area: Consensus rule updates A-rpc Area: Remote Procedure Call interfaces C-testing Category: These are tests
Projects
Archived in project
Development

No branches or pull requests

4 participants