This repository has been archived by the owner on Jan 22, 2025. It is now read-only.

getRecentBlockhash JSON RPC API returns different values from different nodes in testnet-beta #3442

Closed
mvines opened this issue Mar 22, 2019 · 12 comments · Fixed by #3749

@mvines
Contributor

mvines commented Mar 22, 2019

Steps to reproduce:

  1. Find the IP addresses of the individual testnet nodes from the latest "testnet-beta" run at https://buildkite.com/solana-labs/testnet-management/builds?branch=v0.12
  2. Run the following against a couple of nodes, setting rpc to each node's address in turn:
$ rpc=52.53.220.212:8899
$ while sleep 1; do curl -X POST -H 'Content-Type: application/json' -d '{"jsonrpc":"2.0","id":1, "method":"getRecentBlockhash"}' "$rpc"; done
  3. Observe non-overlapping getRecentBlockhash values.

It appears that the validator nodes are generating different blockhash values, which means any transaction created via their RPC API will not be accepted by the leader.
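To make the values from step 2 easier to compare by eye, the raw JSON can be reduced to just the hash. This is a minimal sketch, not part of the original report: `extract_blockhash` is a hypothetical helper name, and it assumes the response is a single line with no whitespace inside `"result":"..."`, as in the responses seen on these nodes.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read a JSON-RPC response on stdin and print only the
# string value of its "result" field. Prints nothing if there is no such field.
# Assumes the compact form {"jsonrpc":"2.0","result":"<hash>","id":1}.
extract_blockhash() {
  sed -n 's/.*"result":"\([^"]*\)".*/\1/p'
}

# Example with a canned response, so no network access is needed:
echo '{"jsonrpc":"2.0","result":"9njeRGoCoCsjCo5easkjFKqnPHS3dsDmaptGpgVJNKaw","id":1}' \
  | extract_blockhash
```

Against a live node the helper would sit at the end of the curl pipeline, with `-s` added to curl so that its progress meter stays out of the piped output.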

@mvines
Contributor Author

mvines commented Mar 22, 2019

I've confirmed that v0.11 doesn't have this behaviour (totally expected but a nice sanity check regardless)

@mvines
Contributor Author

mvines commented Mar 22, 2019

Two v0.12 nodes run locally on the same machine with multinode-demo don't exhibit this issue initially either. It's unclear exactly how testnet-beta is getting into a bad state; issues like #3451 may be related.

#3455 also removes testnet-beta node updating, so the nodes are always recreated instead; that may help work around the root issue here.

@mvines
Contributor Author

mvines commented Mar 22, 2019

Here's a little script I wrote to help monitor getRecentBlockhash across a bunch of nodes:

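#!/usr/bin/env bash
# Usage: blockhash-monitor.sh <rpc address 1> <rpc address 2> ... <rpc address N>
# Polls getRecentBlockhash on every given node in a loop, pausing 0.5s between rounds.
while true; do
  echo -  # separator line between polling rounds
  for rpc in "$@"; do
    printf "%-20s | " "$rpc"  # left-aligned node address column
    curl -X POST -H 'Content-Type: application/json' \
      -d '{"jsonrpc":"2.0","id":1, "method":"getRecentBlockhash"}' "$rpc"
  done
  sleep 0.5
done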

Sample output when things go wrong:

$ ./blockhash-monitor.sh 54.67.24.173:8899  52.53.142.153:8899 13.56.193.176:8899 18.144.34.252:8899 54.183.167.57:8899
-
54.67.24.173:8899    | {"jsonrpc":"2.0","result":"9njeRGoCoCsjCo5easkjFKqnPHS3dsDmaptGpgVJNKaw","id":1}
52.53.142.153:8899   | {"jsonrpc":"2.0","result":"GtGZmQZKovUuAPqEfesVzvhqUWDZuwega6jB7LErUoKk","id":1}
13.56.193.176:8899   | {"jsonrpc":"2.0","result":"233z1RUnJh22TkberBZW4AirrUhyaFSc9GKYrFHttXGa","id":1}
18.144.34.252:8899   | {"jsonrpc":"2.0","result":"Dow3k5ubv6HTALZohQNWzdA1Zr9m1dvrGhL7gtRJE8HH","id":1}
54.183.167.57:8899   | {"jsonrpc":"2.0","result":"5624aJFTVERgaZcJhuFgMJz9dRPBGJaPs8F7hgcyBfy2","id":1}
-
54.67.24.173:8899    | {"jsonrpc":"2.0","result":"Euz2vBtrbhsbQ5MfBaadmB4T4uWWPgNJqcVvTXoMXXrn","id":1}
52.53.142.153:8899   | {"jsonrpc":"2.0","result":"GtGZmQZKovUuAPqEfesVzvhqUWDZuwega6jB7LErUoKk","id":1}
13.56.193.176:8899   | {"jsonrpc":"2.0","result":"233z1RUnJh22TkberBZW4AirrUhyaFSc9GKYrFHttXGa","id":1}
18.144.34.252:8899   | {"jsonrpc":"2.0","result":"Dow3k5ubv6HTALZohQNWzdA1Zr9m1dvrGhL7gtRJE8HH","id":1}
54.183.167.57:8899   | {"jsonrpc":"2.0","result":"5624aJFTVERgaZcJhuFgMJz9dRPBGJaPs8F7hgcyBfy2","id":1}
-
54.67.24.173:8899    | {"jsonrpc":"2.0","result":"4MarmKcyfTF7APd8B7pAcvSZPsnPNYH6CuphxDeJsM4P","id":1}
52.53.142.153:8899   | {"jsonrpc":"2.0","result":"AzPRigjdVaqzCaCFbcLh7PwuXzKZhnTauDKhTo5WJcU4","id":1}
13.56.193.176:8899   | {"jsonrpc":"2.0","result":"233z1RUnJh22TkberBZW4AirrUhyaFSc9GKYrFHttXGa","id":1}
18.144.34.252:8899   | {"jsonrpc":"2.0","result":"Dow3k5ubv6HTALZohQNWzdA1Zr9m1dvrGhL7gtRJE8HH","id":1}
54.183.167.57:8899   | {"jsonrpc":"2.0","result":"5624aJFTVERgaZcJhuFgMJz9dRPBGJaPs8F7hgcyBfy2","id":1}
-
54.67.24.173:8899    | {"jsonrpc":"2.0","result":"9pxA9XLaHePSmZwnCVswVEfCYCCsBq6k75dyKMTMofRb","id":1}
52.53.142.153:8899   | {"jsonrpc":"2.0","result":"AzPRigjdVaqzCaCFbcLh7PwuXzKZhnTauDKhTo5WJcU4","id":1}
13.56.193.176:8899   | {"jsonrpc":"2.0","result":"HdukU6NXcAzAV44CxyizLieo29zKQi1ixe5EhmsstQtr","id":1}
18.144.34.252:8899   | {"jsonrpc":"2.0","result":"FoJWqhZ3AwoEuj3rRsAAZ4NyEzh65PBKgXdgPkRq1ydR","id":1}
54.183.167.57:8899   | {"jsonrpc":"2.0","result":"G6VFQPrRuLWgNhoEUtn1XJRfbtYesFobwaUf5MrBsoKg","id":1}
^C
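
With more than a handful of nodes, spotting disagreement in output like the above gets tedious. As a rough sketch (not part of the original script), the monitor output can be piped through a small awk filter that counts distinct hashes per round. The function name `distinct_per_round` is hypothetical, and it assumes the exact output layout shown above: a lone `-` line starts each round and every response carries a compact `"result":"..."` field.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read blockhash-monitor.sh output on stdin and, for each
# round (started by a lone "-" line), print how many distinct hashes were seen
# out of how many responses. Lines without a "result" field are ignored.
distinct_per_round() {
  awk '
    function report() {
      if (n > 0) print "round " round ": " distinct " distinct of " n
    }
    /^-$/ { report(); round++; n = 0; distinct = 0; split("", seen); next }
    match($0, /"result":"[^"]*"/) {
      # Skip the 10-character prefix "result":" and drop the closing quote.
      h = substr($0, RSTART + 10, RLENGTH - 11)
      n++
      if (!(h in seen)) { seen[h] = 1; distinct++ }
    }
    END { report() }
  '
}
```

It would be used as `./blockhash-monitor.sh <addresses> | distinct_per_round`, with `-s` added to the curl call so the progress meter does not appear once stdout is a pipe. A healthy cluster should mostly report 1 distinct hash per round; brief disagreement can also occur simply because the nodes are polled at slightly different times.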

@carllin
Contributor

carllin commented Mar 25, 2019

hmmm strange, seems like the nodes ended up on different forks. Strange because I think with our long epochs, it should be almost an hour before real leader rotation starts (the first epochs just schedule the bootstrap leader)

@aeyakovenko
Member

#3474

@pgarg66
Contributor

pgarg66 commented Mar 29, 2019

@aeyakovenko , @carllin
do you know if this is still an issue?

@aeyakovenko
Member

@pgarg66 sort of. The bank returns 2 hashes for the same slot

@mvines
Contributor Author

mvines commented Apr 10, 2019

I retried the script at #3442 (comment) against the latest edge testnet and the situation has improved.

getRecentBlockhash now returns a consistent value across all the nodes except the blockstreamer node, which always returns a different block hash. Perhaps this is due to the blockstreamer node not being staked somehow?
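
A single odd node like this can be picked out mechanically. The following is a sketch only: `outliers` is a hypothetical helper that takes the lines of one monitor round (`<address> | <json>`) on stdin and prints the addresses whose hash differs from the most common one. It assumes the compact `"result":"..."` response form, and if two hashes tie for most common, which one counts as the majority is arbitrary.

```shell
#!/usr/bin/env bash
# Hypothetical helper: given one round of monitor output on stdin, print the
# address of every node whose blockhash differs from the most common hash.
outliers() {
  awk '
    match($0, /"result":"[^"]*"/) {
      # Skip the 10-character prefix "result":" and drop the closing quote.
      h = substr($0, RSTART + 10, RLENGTH - 11)
      addr[++n] = $1      # first whitespace-separated field is the node address
      hash[n] = h
      count[h]++
    }
    END {
      # Find the hash reported by the most nodes.
      for (h in count) if (count[h] > best) { best = count[h]; major = h }
      for (i = 1; i <= n; i++) if (hash[i] != major) print addr[i]
    }
  '
}
```

This only shows which node disagrees, not why; it cannot tell whether the odd hash is still acceptable to the rest of the cluster.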

@mvines
Contributor Author

mvines commented Apr 10, 2019

An RPC API that lets the caller check whether a given blockhash has expired could be useful for debugging this difference. If the blockstreamer node's blockhash is still considered valid by the other cluster nodes, then this issue can be closed.

@mvines
Contributor Author

mvines commented Apr 12, 2019

This seems to be related to the blockstreamer node not voting. @sagar-solana was looking at this aspect yesterday, so I'm assigning it to him for now.

@aeyakovenko
Member

@mvines, it should return the root blockhash for clients

@mvines
Contributor Author

mvines commented Apr 13, 2019

@sagar-solana - I've confirmed that your patch fixed the issue

behzadnouri pushed a commit to behzadnouri/solana that referenced this issue Nov 2, 2024
…ana-labs#3442)

* streamer: put testing_utilities.rs behind dev-context-only-utils

* fix feature activation in tests