
Practical Detached State Persistence with Trees and Oracles #769

Closed
igormcoelho opened this issue May 21, 2019 · 8 comments

Comments

igormcoelho (Contributor) opened this issue May 21, 2019

Currently, state is not persisted on the chain, but calculated locally on each node. A discussion was opened about having detached state and block persistence: #302

One follow-up is to develop a state tree (MPT/SPT/...) and put its hash in the block header: #385

This effectively resolves state persistence; however, it strongly affects consensus performance, since the state must be calculated before the block proposal. This proposal aims to restore the original intent of having detached state and block persistence, without affecting performance, while also considering Neo 3.0 oracles.

Consensus nodes have a demanding job, and we need to keep it as simple as possible.
Their most important task is to order valid transactions, calculate fees, and create the block.
This task is performed by the speaker, and the other consensus nodes sign/validate the proposed block after three phases. Since nodes only learn the transaction order once the speaker proposes the block, state computation will be delayed on these terms as well.

**Compute states only after the block is persisted locally**
After the block is relayed and persisted locally, state computation can start safely (because the tx order is known). As it finishes, the state tree hash is updated accordingly, and since all operations are deterministic, every node on the network will have the same state hashes.
During consensus, nodes can send the available state hash on PrepareRequest/PrepareResponse, and also attach a commit signature according to the currently available state height (hash), even if this value is delayed by a few blocks. This adds 32 bytes (the state hash) per signature on block relay.
If this extra space becomes harmful to the network as more consensus nodes join (or if we adopt Schnorr signatures), this doesn't need to be done every block but only every few blocks (intermediate computation for missing values can be done as explained later here, even on light nodes). If hashes do not match past expected values (on PrepareRequest/Response), the consensus node will be considered faulty.
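The idea of attaching the latest *available* state root to a commit, even when it lags the current block height, can be sketched as follows. This is a minimal illustration; `StateTracker`, `make_commit`, and the message fields are hypothetical names, not Neo APIs.

```python
import hashlib

class StateTracker:
    """Tracks state roots as they are computed asynchronously, after persist."""

    def __init__(self):
        self.roots = {}  # height -> 32-byte state root

    def record_root(self, height, root):
        self.roots[height] = root

    def latest_available(self):
        """Highest height whose state root has already been computed."""
        if not self.roots:
            return None, None
        h = max(self.roots)
        return h, self.roots[h]

def make_commit(tracker, block_height, signature):
    # The commit for block N may carry the root of some height M <= N,
    # since state computation runs behind block persistence.
    state_height, state_root = tracker.latest_available()
    return {
        "block_height": block_height,
        "state_height": state_height,  # may lag block_height by a few blocks
        "state_root": state_root,      # the extra 32 bytes on the wire
        "signature": signature,
    }

tracker = StateTracker()
tracker.record_root(100, hashlib.sha256(b"state@100").digest())
commit = make_commit(tracker, block_height=103, signature=b"sig")
```

Here the commit for block 103 carries the root for height 100, matching the "delayed a few blocks" behavior described above.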

**Computation on light nodes without whole state information**
If light nodes want to be certain that a computation is correct, they can take the last state hash signed by the consensus nodes (which are trusted) as a checkpoint hash C. After this checkpoint, a light node locally re-executes transactions (so it needs the full blocks after that point), with the RPC providing only the specific storage states at C (which are trusted) that are necessary for those computations. This way, the node can be certain of any state it doesn't hold, although it must run a simplified version of NeoVM/ApplicationEngine. Note that it doesn't need to execute all blocks, only the transactions that affect the information it needs (for example, checking a user's balance of a specific token).
This may not even be used in practice if wallets have a trusted RPC endpoint (that does this job), but at least it's possible to obtain safe information without any trusted agent.
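A minimal sketch of this replay idea, assuming a drastically simplified transfer format instead of real NeoVM execution (all names and the data layout are illustrative):

```python
def replay_balance(checkpoint_balances, blocks_after_checkpoint, account, token):
    """Recompute one account's token balance starting from trusted
    storage values at checkpoint C, replaying only relevant transactions."""
    balance = checkpoint_balances.get((account, token), 0)
    for block in blocks_after_checkpoint:
        for tx in block:
            # Only transactions touching this token matter; everything
            # else can be skipped entirely by the light node.
            if tx["token"] != token:
                continue
            if tx["to"] == account:
                balance += tx["amount"]
            if tx["from"] == account:
                balance -= tx["amount"]
    return balance

# Trusted storage state at checkpoint C, provided by RPC: bob holds 10.
checkpoint = {("bob", "NEP5X"): 10}
# Full blocks after the checkpoint (hypothetical minimal tx format).
blocks = [
    [{"token": "NEP5X", "from": "alice", "to": "bob", "amount": 5}],
    [{"token": "OTHER", "from": "bob", "to": "carol", "amount": 9}],
]
balance = replay_balance(checkpoint, blocks, "bob", "NEP5X")
```

The second block is skipped because it does not touch the token of interest, illustrating why the light node need not execute every transaction.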

**The oracles**
Oracles introduce amazing capabilities to the chain, like accessing external entities (even as active agents), so they are a fundamental change in Neo 3.0. They may also introduce overhead if oracles are required to run in advance.
One proposal is to require users to attach oracle intents in a TransactionAttributeUsage.Url attribute, to be used later. This is similar to the UTXO two-stage cash-out performed by current Neo contract tokens (CGAS/CNEO). The contract first stages an oracle operation via the attribute and, depending on the logic, could freeze assets or any contract storage state markers (imagine a bet, for example, that depends on an external random agent). In a future step, it can access the oracle result (stored in chain state), keyed by the tx id that submitted it in the first place. This way, nothing needs to be pre-calculated in advance.
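The two-stage flow might look like the sketch below. This is a hedged illustration; the attribute layout, function names, and storage shape are hypothetical, not the actual Neo 3.0 API.

```python
# Chain state mapping: requesting tx id -> fetched oracle result.
ORACLE_RESULTS = {}

def stage_oracle_request(tx_id, url):
    """Stage 1: the tx carries a Url attribute declaring the oracle intent;
    the contract may freeze assets or storage markers at this point."""
    return {"tx_id": tx_id, "attribute": {"usage": "Url", "data": url}}

def persist_oracle_result(request, result):
    """Later, the oracle result is written to chain state, keyed by the
    tx id that staged the request."""
    ORACLE_RESULTS[request["tx_id"]] = result

def consume_oracle_result(tx_id):
    """Stage 2: a follow-up invocation reads the stored result for the
    originally staged request."""
    return ORACLE_RESULTS.get(tx_id)

req = stage_oracle_request("0xabc", "https://example.com/price")
persist_oracle_result(req, b'{"price": 42}')
```

Because the result is read from chain state in a later step, no oracle pre-calculation is needed before the block that includes the requesting transaction.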


erikzhang (Member) commented May 21, 2019

For oracles, we can read the URLs from Transaction.Attributes and write the results to Block.ConsensusData. This doesn't require executing the contracts in the block.

For MPT, I think we can store it outside the block header, because if we have bugs in the code, the states may be wrong; if we compute the tree from the wrong states, we will never have a chance to correct it.

igormcoelho (Contributor) commented May 21, 2019

At some point, we need the state to enter the block header (perhaps not immediately); otherwise we will need to create an alternative chain to handle it, which is more overhead. But we can version the tree, which is nice since technology changes: MPT is outdated today and SMT is more promising, but who knows about tomorrow. If we version it, we have the opportunity to fix it, and also to evolve it over time.

For oracle requests, perhaps they are not fast enough to keep a single-block response, Erik. How long does it take to parse a single external page anywhere in the world (and how many bytes are allowed at maximum)? Now imagine many pages (every tx can request some of them...). It could take milliseconds or seconds, which is enough to slow down block generation.
Let's think about leaving oracle responses detached, for a few blocks if necessary. We can use Block.ConsensusData or any storage state field to hold them; it doesn't matter, as long as they are kept for the future.

igormcoelho (Contributor) commented May 21, 2019

Another option I see now for oracles is to process them as soon as they enter the consensus node's mempool. I hadn't thought of that before, but it is enough to guarantee the result enters Block.ConsensusData (a tx is only considered included once its field is fully processed).

However, two execution modes could be allowed for oracles: (I) execute trusting a single consensus node; (II) require the other consensus nodes to agree on the data.
Some external data may be non-deterministic (random.org), so a single node will need to be trusted (if it's byzantine, the contract will need to handle it somehow). However, for external deterministic data, it would be nice to have all nodes validating it during consensus (perhaps via a byte field on the Url attribute, marking it as deterministic or non-deterministic).
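A sketch of the two validation modes described above, under the assumption of a simple per-result comparison; the mode byte values and the function are illustrative, not part of any Neo specification:

```python
# Hypothetical mode byte on the Url attribute.
NON_DETERMINISTIC = 0x00  # trust the single assigned consensus node
DETERMINISTIC = 0x01      # all consensus nodes must fetch and agree

def validate_oracle_result(mode, fetched_by_nodes):
    """fetched_by_nodes: one fetched result per consensus node."""
    if mode == NON_DETERMINISTIC:
        # Trust whatever the single node fetched (e.g. random.org).
        # If that node is byzantine, the contract must handle it.
        return fetched_by_nodes[0]
    # Deterministic external data: every node's fetch must match.
    if all(r == fetched_by_nodes[0] for r in fetched_by_nodes):
        return fetched_by_nodes[0]
    raise ValueError("consensus nodes disagree on deterministic oracle data")

price = validate_oracle_result(DETERMINISTIC, [b"1.23", b"1.23", b"1.23"])
rand = validate_oracle_result(NON_DETERMINISTIC, [b"0x9f"])
```

Mode (I) keeps latency low for inherently unrepeatable fetches, while mode (II) trades extra fetches for byzantine fault detection on repeatable data.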

erikzhang (Member) commented May 21, 2019

For example:

```cs
int SomeMethod()
{
    int x = 0x7fffffff;
    int y = 1;
    return x + y;
}
```

The result in NeoVM should be 0x80000000, because in NeoVM numbers are converted to BigInteger.

But if we mistakenly calculate the result of this contract as -2147483648 (by not converting the int to BigInteger) and store it in storage, then the states and MPT we get are wrong. We can easily correct the states by fixing the bug and resyncing the blocks, but there is no way to correct the MPT root in the header. It's not a versioning problem.
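The difference can be demonstrated in Python, whose integers are arbitrary-precision like BigInteger; with 32-bit two's-complement wraparound, 0x7FFFFFFF + 1 comes out as -2147483648 instead of 2147483648:

```python
def add_biginteger(x, y):
    # Python ints are unbounded, mirroring NeoVM's BigInteger semantics.
    return x + y

def add_int32_wrapping(x, y):
    # The buggy behavior: 32-bit two's-complement wraparound, as if the
    # implementation forgot to widen int to BigInteger before adding.
    r = (x + y) & 0xFFFFFFFF
    return r - 0x100000000 if r >= 0x80000000 else r

correct = add_biginteger(0x7FFFFFFF, 1)       # 2147483648 == 0x80000000
buggy = add_int32_wrapping(0x7FFFFFFF, 1)     # -2147483648
```

Both paths store *some* value in storage, which is exactly why a wrong result silently poisons the state tree rather than failing loudly.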

igormcoelho (Contributor) commented May 21, 2019

Just to understand: you mean Native Contract implementations, right? One possibility could be for them to stay outside the state hash, as long as no "in-state" contract can access this information either (otherwise a cascade effect would corrupt its storage hash too).
We could have a verified state, useful and necessary for NEP-5 and deployed contracts, and an unverified state for native contracts, which may change due to bug fixes. Do you think this is interesting?

erikzhang (Member) commented:

Native contracts and other contracts can call each other, so there is no way to separate their states. We can store the MPT root outside the block header.

igormcoelho (Contributor) commented May 21, 2019

No problem for me to store them outside the headers, as long as they are accessible to the nodes (for external verification). We can create a separate file to load them (so nodes can verify them from time to time, if desired), and NGD can distribute it.
Since each contract will have a specific state hash (and these hashes will then be merged into a final hash), it will be easy to track bugfixes that may affect state. Anyway, I don't think this will happen on a stable Neo 3.0; we will end up having a unique state hash, outside the headers, which is fine too.
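A sketch of merging per-contract state hashes into one final hash; the sorted-concatenation scheme below is illustrative only, not a Neo specification:

```python
import hashlib

def contract_state_hash(storage):
    """Hash one contract's storage (sorted keys for determinism)."""
    h = hashlib.sha256()
    for key in sorted(storage):
        h.update(key)
        h.update(storage[key])
    return h.digest()

def final_state_hash(per_contract_hashes):
    """Merge per-contract hashes into the single final state hash."""
    h = hashlib.sha256()
    for contract_id in sorted(per_contract_hashes):
        h.update(per_contract_hashes[contract_id])
    return h.digest()

hashes = {
    "contract_a": contract_state_hash({b"k1": b"v1"}),
    "contract_b": contract_state_hash({b"k2": b"v2"}),
}
root = final_state_hash(hashes)
# A bugfix that changes one contract's storage changes only that
# contract's hash (and the root), making the affected contract easy to spot.
```

This is the tracking property mentioned above: a state-affecting bugfix is localized to one per-contract hash rather than scrambling an undifferentiated global state.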

Thacryba pushed a commit to simplitech/neo that referenced this issue Feb 17, 2020