Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

New subscription and wallet sync protocol #17340

Merged
merged 36 commits into from
Mar 29, 2024
Merged

Conversation

Rigidity
Copy link
Contributor

@Rigidity Rigidity commented Jan 16, 2024

Purpose:

The wallet protocol is lacking some essential functionality for syncing wallets and requesting coin data. This improves it with new messages for fetching coin state and subscribing.

Existing Limitations:

  • You cannot unsubscribe from puzzle hashes or coin ids without disconnecting from the peer. Once you reach the limit you have to reset the connection or connect to additional peers.
  • If a puzzle hash has too many coins to fit in one message, the response will be truncated. This is not a reasonable outcome for many use cases.
  • You cannot request coin state states without subscribing for future updates, which is unnecessary and means the subscription limit will eventually be reached.

New Features

  • Removing subscriptions.
  • Requesting coin state without subscribing.
  • Filtering coin states by amount, hint, and spent status.
  • Pagination when syncing coin state for puzzle hashes.

New Protocol

Remove Puzzle Subscriptions

class RequestRemovePuzzleSubscriptions:
    puzzle_hashes: Optional[List[bytes32]]

class RespondRemovePuzzleSubscriptions:
    puzzle_hashes: List[bytes32]

Removes puzzle hashes from the subscription list (or all of them if None), returning the hashes that were actually removed.

Remove Coin Subscriptions

class RequestRemoveCoinSubscriptions:
    coin_ids: Optional[List[bytes32]]

class RespondRemoveCoinSubscriptions:
    coin_ids: List[bytes32]

Removes coin ids from the subscription list (or all of them if None), returning the ids that were actually removed.

Request Puzzle State

class RequestPuzzleState:
    puzzle_hashes: List[bytes32]
    previous_height: Optional[uint32]
    header_hash: bytes32
    filters: CoinStateFilters
    subscribe_when_finished: bool

class RespondPuzzleState:
    puzzle_hashes: List[bytes32]
    height: uint32
    header_hash: bytes32
    is_finished: bool
    coin_states: List[CoinState]

class RejectPuzzleState:
    reason: uint8 # RejectStateReason

class CoinStateFilters:
    include_spent: bool
    include_unspent: bool
    include_hinted: bool
    min_amount: uint64

Requests coin states that match the given puzzle hashes (or hints).

Unlike RegisterForPhUpdates, this does not add subscriptions for the puzzle hashes automatically.
When subscribe_when_finished is set to True, it will add subscriptions, but only once the last batch has been requested.

As well as this, previously it was impossible to get all coin records if the number of items exceeded the limit.
This implementation allows you to continue where you left off with previous_height and header_hash.

If a reorg of relevant blocks occurs while doing so, previous_height will no longer match header_hash.
This can be handled by a wallet by simply backtracking a bit, or restarting the sync from genesis. It could be inconvenient, but at least you can detect it.
In the event that a reorg is detected by a node, RejectPuzzleState will be returned. This is the only scenario it will be rejected directly like this.

Additionally, it is now possible to filter out spent, unspent, or hinted coins, as well as coins below a minimum amount.
This can reduce the risk of spamming or DoS of a wallet in some cases, and improve performance.

If previous_height is None, you are syncing from genesis. The header_hash should match the genesis challenge of the network you are connected to.

Request Coin State

class RequestCoinState:
    coin_ids: List[bytes32]
    previous_height: Optional[uint32]
    header_hash: bytes32
    subscribe: bool

class RespondCoinState:
    coin_ids: List[bytes32]
    coin_states: List[CoinState]

class RejectCoinState:
    reason: uint8 # RejectStateReason

Request coin states that match the given coin ids.

Unlike RegisterForCoinUpdates, this does not add subscriptions for the coin ids automatically.
When subscribe is set to True, it will add and return as many coin ids to the subscriptions list as possible.

Unlike the new RequestPuzzleState message, this does not implement batching for simplicity. The order is also not guaranteed.
However, you can still specify the previous_height and header_hash to start from.

If a reorg of relevant blocks has occurred, previous_height will no longer match header_hash.
This can be handled by a wallet depending on the use case (for example by restarting from zero). It could be inconvenient, but at least you can detect it.
In the event that a reorg is detected by a node, RejectCoinState will be returned. This is the only scenario it will be rejected directly like this.

If previous_height is None, you are syncing from genesis. The header_hash should match the genesis challenge of the network you are connected to.

Reject State Reason

class RejectStateReason(IntEnum):
    REORG = 0
    EXCEEDED_SUBSCRIPTION_LIMIT = 1

Concerns

The data which a full node gives you cannot be trusted unless it is your own node.
These additions to the protocol make no effort to make untrusted data any easier to validate.
The best method I am aware of is still just to cross reference with other peers and ban outliers.

Checklist

  • Waiting for Add way to get coin states in batches #17300 to be merged.
  • Design and document the new protocol messages.
  • Implement the protocol messages in the full node.
  • Write unit tests for these subscription and coin state protocol messages.
  • Update serialization and regression tests with these messages.
  • Write integration test for syncing a wallet from zero using this new sync protocol.
  • Write integration test for syncing from the last known height, and recovering when a reorg occurs.
  • Write integration test for a reorg occurring during a wallet's sync.
  • Ensure protocol version is correct, and doesn't need to be bumped again.
  • Implement protocol messages in Rust Add UnfinishedBlock2 and released CHIP-0026 protocol messages chia_rs#447 to keep parity.

@Rigidity Rigidity added full_node Added Required label for PR that categorizes merge commit message as "Added" for changelog labels Jan 16, 2024
@Rigidity Rigidity closed this Jan 18, 2024
@Rigidity Rigidity reopened this Jan 18, 2024

This comment was marked as outdated.

@github-actions github-actions bot added the merge_conflict Branch has conflicts that prevent merge to main label Jan 23, 2024
@Rigidity Rigidity closed this Jan 24, 2024
@Rigidity Rigidity reopened this Jan 24, 2024
@github-actions github-actions bot removed the merge_conflict Branch has conflicts that prevent merge to main label Jan 24, 2024
@Rigidity Rigidity changed the title New wallet protocol and wallet sync changes Wallet protocol update Jan 24, 2024
@github-actions github-actions bot added the merge_conflict Branch has conflicts that prevent merge to main label Jan 26, 2024
@github-actions github-actions bot removed the merge_conflict Branch has conflicts that prevent merge to main label Feb 1, 2024
Copy link
Contributor

github-actions bot commented Feb 1, 2024

Conflicts have been resolved. A maintainer will review the pull request shortly.

@Chia-Network Chia-Network deleted a comment from github-actions bot Feb 1, 2024
@Chia-Network Chia-Network deleted a comment from github-actions bot Feb 1, 2024
@Chia-Network Chia-Network deleted a comment from github-actions bot Feb 1, 2024
@richardkiss
Copy link
Contributor

In general, there are opportunities to improve the protocol design as new messages are added. Subscriptions essentially open a "channel" multiplexed over the message stream, so we could introduce an explicit channel concept - just an incrementing nonce namespaced on each websocket.

The request could allow specifying a start block height, so that we don't sync from genesis each time. Wallets can track a "current height" per puzzle hash.

Perhaps something like:

# non-message struct
class PuzzleSubscriptionRequest:  
  puzzle_hash: bytes32
  height: uint32

# message
class RequestAddPuzzleSubscriptions:
  requests: List[PuzzleSubscriptionRequest]

# non-message struct
class RespondSubscriptionRequest:
  channel: uint32
  puzzle_hash: bytes32  
  height: uint32
  header_hash: bytes32

# message
class RespondAddPuzzleSubscriptions:
  requests: List[RespondSubscriptionRequest]

This makes unsubscribe simpler:

class RequestRemoveSubscriptions:
  channel: List[uint32]

class RespondRemoveSubscriptions:
  channel: List[uint32]

Update messages might look something like:

class SubscriptionUpdate:
  channel: uint32
  updates: list[(info about coin spend)] 
  completed_updated_to_height: uint32

The completed_updated_to_height field is a height that we have completely exhausted all updates for. This might be much larger than the block of the last update (if we see no further spends for example) or ONE LESS than the block the last update occurs in (if there are further updates in the same block that will not fit into this message). Essentially, it's the value the wallet can later subscribe to and be assured to have not missed an update.

@richardkiss
Copy link
Contributor

Generally speaking, having multiple potential responses to a particular request is annoying. It means when a client makes a request, it has multiple message IDs it needs to wait on to find a response.

The serialization protocol doesn't support enum types, and there's no obvious way to see how to do it. However, we could simply have each different potential response be Optional. As an example, rather than having two messages each with their own id:

class RespondPuzzleState:
    puzzle_hashes: List[bytes32]
    next_height: Optional[uint32]
    next_header_hash: Optional[bytes32]
    coin_states: List[CoinState]

class RejectPuzzleState:
    header_hash: Optional[bytes32]

we could merge them:

# used in `RespondPuzzleState`
class ResponsePuzzleStateOK:
    puzzle_hashes: List[bytes32]
    next_height: Optional[uint32]
    next_header_hash: Optional[bytes32]
    coin_states: List[CoinState]

# used in `RespondPuzzleState`
class ResponsePuzzleStateFail:
    header_hash: Optional[bytes32]

# message
class RespondPuzzleState:
    ok: Optional[ResponsePuzzleStateOK]
    fail: Optional[ResponsePuzzleStateFail]

Here, exactly one of ok and fail should be not None. We can't enforce this through type-checking, but we can enforce at runtime, and simply fail on messages that don't comply like we would any other ill-formed message (for example, disconnect from or otherwise punish the remote).

@Rigidity
Copy link
Contributor Author

Rigidity commented Feb 3, 2024

In general, there are opportunities to improve the protocol design as new messages are added. Subscriptions essentially open a "channel" multiplexed over the message stream, so we could introduce an explicit channel concept - just an incrementing nonce namespaced on each websocket.

The request could allow specifying a start block height, so that we don't sync from genesis each time. Wallets can track a "current height" per puzzle hash.

Perhaps something like:

# non-message struct
class PuzzleSubscriptionRequest:  
  puzzle_hash: bytes32
  height: uint32

# message
class RequestAddPuzzleSubscriptions:
  requests: List[PuzzleSubscriptionRequest]

# non-message struct
class RespondSubscriptionRequest:
  channel: uint32
  puzzle_hash: bytes32  
  height: uint32
  header_hash: bytes32

# message
class RespondAddPuzzleSubscriptions:
  requests: List[RespondSubscriptionRequest]

This makes unsubscribe simpler:

class RequestRemoveSubscriptions:
  channel: List[uint32]

class RespondRemoveSubscriptions:
  channel: List[uint32]

Update messages might look something like:

class SubscriptionUpdate:
  channel: uint32
  updates: list[(info about coin spend)] 
  completed_updated_to_height: uint32

The completed_updated_to_height field is a height that we have completely exhausted all updates for. This might be much larger than the block of the last update (if we see no further spends for example) or ONE LESS than the block the last update occurs in (if there are further updates in the same block that will not fit into this message). Essentially, it's the value the wallet can later subscribe to and be assured to have not missed an update.

I don't understand the benefit of this, it seems like a more complicated API to do roughly the same thing. But maybe I'm misunderstanding something? Subscribing doesn't actually return any coin states, it just registers that you're interested in a given set of puzzle hashes going forward whenever coin states are updated.

The RequestPuzzleState message allows you to specify the height and header hash to start from, like you're describing, and you keep requesting more coin states until you've reached the peak or you decide to stop.

@Rigidity
Copy link
Contributor Author

Rigidity commented Feb 3, 2024

Generally speaking, having multiple potential responses to a particular request is annoying. It means when a client makes a request, it has multiple message IDs it needs to wait on to find a response.

The serialization protocol doesn't support enum types, and there's no obvious way to see how to do it. However, we could simply have each different potential response be Optional. As an example, rather than having two messages each with their own id:

class RespondPuzzleState:
    puzzle_hashes: List[bytes32]
    next_height: Optional[uint32]
    next_header_hash: Optional[bytes32]
    coin_states: List[CoinState]

class RejectPuzzleState:
    header_hash: Optional[bytes32]

we could merge them:

# used in `RespondPuzzleState`
class ResponsePuzzleStateOK:
    puzzle_hashes: List[bytes32]
    next_height: Optional[uint32]
    next_header_hash: Optional[bytes32]
    coin_states: List[CoinState]

# used in `RespondPuzzleState`
class ResponsePuzzleStateFail:
    header_hash: Optional[bytes32]

# message
class RespondPuzzleState:
    ok: Optional[ResponsePuzzleStateOK]
    fail: Optional[ResponsePuzzleStateFail]

Here, exactly one of ok and fail should be not None. We can't enforce this through type-checking, but we can enforce at runtime, and simply fail on messages that don't comply like we would any other ill-formed message (for example, disconnect from or otherwise punish the remote).

We don't have any precedent for doing it this way, and this increases the size of the protocol messages and requires differentiating between the response types just the same. Message responses are identified by an id, so rather than listening for a given response type you just wait until you receive a message with the same nonce as was passed in the request.

It's pretty simple to do, here is an example I have (a bit outdated, but it shows the idea):
https://github.com/Chia-Network/chia_rs/pull/369/files#diff-7f65462d1a9e187413a5013428d70e00a201bec3b2d7e0b26ce7192cbc281462R275

@Rigidity
Copy link
Contributor Author

Rigidity commented Feb 3, 2024

You will usually use RequestPuzzleState and RequestCoinState as an entrypoint for subscribing, since it's more resilient to reorgs and more concise. There's a case to be made for removing the AddSubscriptions messages entirely if they don't have a use-case.

Copy link
Contributor

@arvidn arvidn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

seems reasonable

chia/full_node/full_node.py Outdated Show resolved Hide resolved
chia/full_node/full_node.py Outdated Show resolved Hide resolved
chia/full_node/full_node.py Outdated Show resolved Hide resolved
@richardkiss
Copy link
Contributor

We don't have any precedent for doing it this way, and this increases the size of the protocol messages and requires differentiating between the response types just the same. Message responses are identified by an id, so rather than listening for a given response type you just wait until you receive a message with the same nonce as was passed in the request.

I did notice a Optional[uint16] in the wrapper Message type. I thought I tried using that and it was None, but I will try again. It's definitely the best way I can think of to route messages, by mapping the responses to their requests with a nonce.

@richardkiss
Copy link
Contributor

I don't understand the benefit of this, it seems like a more complicated API to do roughly the same thing. But maybe I'm misunderstanding something? Subscribing doesn't actually return any coin states, it just registers that you're interested in a given set of puzzle hashes going forward whenever coin states are updated.

For request/responses, I see the signature conceptually as

class SomeRemoteObject:
    async def some_request(self, foo: SerializableType1, bar: SerializableType2) -> SerializableType3:
        ...

For subscriptions, I see it as

class SomeRemoteObject:
    async def subscribe_to_something(self, identifier: SerializableIdentifier) -> AsyncIterator[SubscriptionUpdate]:
        ...

Internal to this, you'd use a nonce to identify subscriptions. Remote messages containing subscription updates could use that subscription nonce to route it to the correct async iterator, and also use that nonce to inform the remote to cancel the subscription.

This makes local use really easy. You get the async iterator with the subscription api, then simply await to get items out.

[As I write this, I realize you probably actually want to return an async context manager so you can more affirmatively cancel the subscription with a remote message no matter how you exit locally.]

I realize these comments aren't really about the code. But I've been trying to write code to connect to remote chia nodes and I've found it to be much more complicated than it should be, and how certain changes could allow for better abstractions and smaller code. But maybe you don't want to tackle this now.

@Rigidity
Copy link
Contributor Author

Rigidity commented Feb 5, 2024

We don't have any precedent for doing it this way, and this increases the size of the protocol messages and requires differentiating between the response types just the same. Message responses are identified by an id, so rather than listening for a given response type you just wait until you receive a message with the same nonce as was passed in the request.

I did notice a Optional[uint16] in the wrapper Message type. I thought I tried using that and it was None, but I will try again. It's definitely the best way I can think of to route messages, by mapping the responses to their requests with a nonce.

It should be the same value as the request id, if that's left as None the response will be too. If it's an update (ie CoinStateUpdate), it will also be None.

Btw I wrote the chia-client crate in this repo which helps connect to (trusted) full node peers and routes messages. It's not perfect (lacking proper error handling, disconnect handling, and validation) but it's a start.

@arvidn
Copy link
Contributor

arvidn commented Feb 6, 2024

Generally speaking, having multiple potential responses to a particular request is annoying. It means when a client makes a request, it has multiple message IDs it needs to wait on to find a response.

my recollection of how our protocol driver works is that it's not waiting for message IDs, it's demultiplexing based on transaction IDs (or request IDs). So, any message can (in theory) be a response to any other. The current protocols usually have one success message and a separate failure message too.

@github-actions github-actions bot removed the merge_conflict Branch has conflicts that prevent merge to main label Mar 29, 2024
Copy link
Contributor

Conflicts have been resolved. A maintainer will review the pull request shortly.

@arvidn
Copy link
Contributor

arvidn commented Mar 29, 2024

I think it's reasonable to not cover the checks for race condition

Copy link
Contributor

File Coverage Missing Lines
chia/full_node/full_node_api.py 98.2% lines 1824, 1881
Total Missing Coverage
818 lines 2 lines 99%

@arvidn arvidn added ready_to_merge Submitter and reviewers think this is ready and removed coverage-diff labels Mar 29, 2024
@pmaslana pmaslana merged commit 23e8863 into main Mar 29, 2024
311 of 312 checks passed
@pmaslana pmaslana deleted the new-wallet-sync-protocol branch March 29, 2024 19:56
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Added Required label for PR that categorizes merge commit message as "Added" for changelog full_node ready_to_merge Submitter and reviewers think this is ready
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants