This repository has been archived by the owner on Nov 15, 2023. It is now read-only.

Polkadot service #82

Merged · rphmeier merged 21 commits into master from ark-consensus-network on Mar 30, 2018

Conversation

arkpar (Member) commented Feb 25, 2018

This PR integrates the client, consensus, networking and transaction pool into a Service struct (a rough sketch of that shape follows the TODO list below).
Still TODO:

  • Integrate keystore
  • CLI options for network configuration (at least bootnodes)
  • Gossip protocol for BFT
  • BFT message accumulation and sorting
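
A rough sketch of what bundling these components into a Service struct can look like; the types here are empty stand-ins, this is illustrative only and not the PR's actual definition:

```rust
use std::sync::Arc;
use std::thread;

// Empty stand-ins for the real client, network, transaction pool and
// consensus handles.
struct Client;
struct NetworkService;
struct TransactionPool;
struct ConsensusService;

pub struct Service {
    client: Arc<Client>,
    network: Arc<NetworkService>,
    transaction_pool: Arc<TransactionPool>,
    consensus: Option<ConsensusService>,
    // Background thread driving the event loop; joined when the service drops.
    thread: Option<thread::JoinHandle<()>>,
}

impl Service {
    pub fn new() -> Service {
        // The real service wires the components together, starts networking
        // and consensus, and spawns the reactor thread here.
        Service {
            client: Arc::new(Client),
            network: Arc::new(NetworkService),
            transaction_pool: Arc::new(TransactionPool),
            consensus: Some(ConsensusService),
            thread: None,
        }
    }
}

fn main() {
    let _service = Service::new();
}
```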

arkpar added the A3-in_progress (Pull request is in progress. No review needed at this stage.) label on Feb 25, 2018
trace!(target:"sync", "Error broadcasting statement notification: {:?}", e);
}
} else {
trace!(target:"sync", "Ignored statement from unregistered peer {}", peer_id);
Member:

extra tab

}
}

fn run_bft<C, P>(bft_service: &BftService<P, C>, network: Arc<net::ConsensusService>, client: &C, handle: reactor::Handle, header: &Header) -> Result<(), <P as bft::ProposerFactory>::Error> where
Contributor:

nit: more idiomatic as a free function

bft_service.build_upon(&header, input, output, handle.clone())
}

fn process_message(msg: net::BftMessage, authorities: &[AuthorityId], parent_hash: HeaderHash) -> Option<bft::Communication> {
Contributor:

this too

impl Drop for Service {
fn drop(&mut self) {
if let Some(thread) = self.thread.take() {
thread.join().expect("The service thread has panicked");
Contributor:

i worry that this could cause shutdown to deadlock

Member Author:

The service now explicitly shuts down the network and client event stream, which should cancel all futures and stop the event loop.
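
A minimal sketch of that shutdown ordering, under assumed names (the real service signals its network and client event stream rather than a plain channel): tell the background loop to stop first, then join, so the join in Drop cannot deadlock.

```rust
use std::sync::mpsc;
use std::thread;

struct Service {
    // Used purely as a shutdown signal: dropping the sender ends the loop on
    // the other side.
    exit: Option<mpsc::Sender<()>>,
    thread: Option<thread::JoinHandle<()>>,
}

impl Drop for Service {
    fn drop(&mut self) {
        // Signal the event loop to stop first (here: by dropping the sender)...
        self.exit.take();
        // ...then join; the thread is already on its way out, so this cannot
        // wait forever.
        if let Some(thread) = self.thread.take() {
            thread.join().expect("The service thread has panicked");
        }
    }
}

fn main() {
    let (tx, rx) = mpsc::channel::<()>();
    let handle = thread::spawn(move || {
        // Run until the sender is dropped (recv returns Err).
        while rx.recv().is_ok() {}
    });
    let _service = Service { exit: Some(tx), thread: Some(handle) };
    // Dropping `_service` at the end of main shuts down and joins cleanly.
}
```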


impl<E> Sink for BftSink<E> {
type SinkItem = bft::Communication;
type SinkError = E;
Contributor:

could we leave a TODO to replace this with the ! type when that's stabilized?
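
For what it's worth, the same "this error can never happen" intent can be expressed on stable Rust with an uninhabited type such as std::convert::Infallible until `!` is stabilized; a minimal, illustrative sketch (not the PR's actual sink):

```rust
use std::convert::Infallible;

// A fallible-looking signature whose error type has no values, so the Err arm
// is statically impossible -- the role `!` would play once stabilized.
fn always_ok() -> Result<u32, Infallible> {
    Ok(42)
}

fn main() {
    match always_ok() {
        Ok(v) => println!("got {}", v),
        // Infallible is uninhabited; an empty match proves to the compiler
        // that this arm can never run.
        Err(never) => match never {},
    }
}
```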

arkpar changed the title from "Consensus network messages" to "Polkadot service" on Mar 20, 2018
debug!("Error starting BFT agreement: {:?}", e);
}
}).map_err(|e| debug!("BFT agreement error: {:?}", e));
if let Err(_e) = core.run(start_bft.into_future()) {
Contributor:

instead of doing this in a loop, couldn't we do

let bft_starter = client.import_notification_stream().for_each(|notification| ...) and core.run(bft_starter)?
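
A minimal sketch of the suggested pattern with futures 0.1 and tokio-core (assuming tokio-core's Core, given the reactor::Handle above), with a ready-made stream standing in for client.import_notification_stream():

```rust
extern crate futures;
extern crate tokio_core;

use futures::{stream, Stream};
use tokio_core::reactor::Core;

fn main() {
    let mut core = Core::new().expect("failed to create reactor");

    // Stand-in for the block-import notification stream.
    let notifications = stream::iter_ok::<_, ()>(vec![1u64, 2, 3]);

    // `for_each` turns the whole stream into a single future, so no outer
    // loop around `core.run` (or repeated `into_future()` calls) is needed.
    let bft_starter = notifications.for_each(|notification| {
        println!("starting BFT agreement on block {}", notification);
        Ok(())
    });

    core.run(bft_starter).expect("event loop error");
}
```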

let mut storage = Default::default();
let key: AccountId = Keyring::One.into();

let genesis_config = GenesisConfig {
Contributor:

isn't this duplicated from the CLI?

Member Author:

moved, not duplicated

let events = thread_client.import_notification_stream().map(|notification| {
thread_network.on_block_imported(&notification.header);
});
if let Err(_) = core.run(events.into_future()) {
Contributor:

same, i think that this is a misuse of into_future() and the loop isn't strictly required

/// Signal that a valid block with the given header has been imported.
///
/// If the local signing key is an authority, this will begin the consensus process to build a
/// block on top of it. If the executor fails to run the future, an error will be returned.
pub fn build_upon(&self, header: &Header) -> Result<(), P::Error> {
pub fn build_upon<InStream, OutSink, E>(&self, header: &Header, input: InStream, output: OutSink, executor: E) -> Result<(), P::Error> where
Contributor:

may as well skip the executor as a parameter if it isn't owned by the service anymore.

Member Author:

Skip the executor as a parameter of what? This function?

Contributor:

yeah. this function takes an executor as a parameter just so it can call executor.execute at the end. might as well just do that at the call sites and save some complexity
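
A hypothetical sketch of that suggestion with futures 0.1: build_upon returns the agreement future instead of taking an executor, and each call site decides how to run it (the types below are stand-ins, not the real BftService API):

```rust
extern crate futures;

use futures::{future, Future};

// Stand-in for the proposer error type.
type ProposerError = String;

// Returns the agreement future rather than spawning it internally.
fn build_upon(header: u64) -> Result<Box<dyn Future<Item = (), Error = ()>>, ProposerError> {
    // ... build input/output sinks and the agreement future here ...
    println!("building agreement on top of block {}", header);
    Ok(Box::new(future::ok::<(), ()>(())))
}

fn main() {
    // The call site owns the decision of how to execute the future; here we
    // just block on it, but a service could hand it to its reactor instead.
    let agreement = build_upon(1).expect("proposer error");
    agreement.wait().expect("agreement failed");
}
```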

/// Block was broadcasted on the network.
NetworkBroadcast,
/// Block that was collated by this node.
Own,
Contributor:

what does this mean in the context of the 2-phase commit?

Member Author:

This means that the block body was generated on this machine and was not received from the network.

Contributor:

the block body is almost always generated on another machine, but any block bodies coming directly from the consensus process have already been evaluated

Member Author:

Added a variant for that.

/// Block is part of the initial sync with the network.
NetworkInitialSync,
/// Block was broadcasted on the network.
NetworkBroadcast,
Contributor:

aren't all blocks broadcast over the network in some sense? doesn't it make more sense to provide hints about whether a block is the head of the chain or not?

Member Author:

The head hint is provided as a separate parameter to the import function. This is for something else, mainly for logging and possibly skipping certain checks for local blocks.
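
Putting the two threads above together, an illustrative origin-hint enum along these lines (variant names are assumptions, not necessarily the PR's exact definition):

```rust
/// Where a block body came from; whether the block is the new chain head is
/// passed to the import function separately.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum BlockOrigin {
    /// Block is part of the initial sync with the network.
    NetworkInitialSync,
    /// Block was broadcast on the network outside of initial sync.
    NetworkBroadcast,
    /// Block body came out of the local consensus process and has already
    /// been evaluated (the variant added in response to this thread).
    ConsensusBroadcast,
    /// Block was collated/authored by this node.
    Own,
}

fn main() {
    // Mainly useful for logging and for deciding which checks can be skipped.
    let origin = BlockOrigin::Own;
    println!("importing block with origin {:?}", origin);
}
```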

header: header,
is_new_best: is_new_best,
};
if let Err(e) = self.import_notification_sink.lock().try_send(notification) {
Contributor:

this fails if the buffer is full; seems dangerous to discard the notification in this case. maybe use the Sink::send api and wait for the future to resolve?
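
For reference, a minimal sketch of the difference being discussed, using a futures 0.1-style bounded channel (buffer size and item type are illustrative): try_send fails, losing the item, when the buffer is full, while the Sink::send future applies back-pressure and waits for space.

```rust
extern crate futures;

use futures::sync::mpsc;
use futures::{Future, Sink, Stream};

fn main() {
    let (tx, rx) = mpsc::channel::<u32>(1);

    // Non-blocking: returns an error (losing the notification) if the
    // channel's buffer happens to be full.
    let mut tx_try = tx.clone();
    if let Err(e) = tx_try.try_send(1) {
        println!("notification dropped: {:?}", e);
    }

    // Back-pressure: `send` consumes the sender and yields a future that only
    // resolves once the item has been accepted into the channel.
    let tx = tx.send(2).wait().expect("receiver dropped");

    // Drop the senders so the receiver's stream ends.
    drop(tx);
    drop(tx_try);
    let received: Vec<u32> = rx.collect().wait().expect("stream error");
    println!("received: {:?}", received);
}
```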

Member Author:

Import is more important and should not block on the notification handler. Normally this should never happen, but if it does, skipping a notification is fine since there will be a new block eventually anyway. The handlers are not supposed to expect notifications for every block. It is quite possible that the node goes to sleep or loses connectivity for a while, and when it comes back online there won't be any notifications for the blocks that arrived in the meantime.

rphmeier (Contributor) Mar 20, 2018:

"skipping a notification is fine since there will be a new block eventually anyway"

This is reasoning about "eventually" a little too long-term for me to be comfortable using it in practice.
There won't be a new block if not enough authorities start the agreement process (e.g. if 1/3 are offline and one of the "honest" nodes skips this notification, the agreement would only complete once that node restarted).

Member Author:

Increased the buffer to 65536 for now. Unfortunately it is preallocated.

rphmeier (Contributor) Mar 26, 2018:

still dangerous but probably workable for now. we should write a fix after this PR -- maybe a 50ms timer where the same chain head being observed twice in a row starts consensus?

Member Author:

I'll address that in the following PR, probably by replacing multiqueue with a vector of mpsc channels.

if self.peers.contains_key(&peer_id) {
// TODO: validate signature?
if let Err(e) = self.bft_message_sink.try_send(message) {
trace!(target:"sync", "Error broadcasting BFT message notification: {:?}", e);
Contributor:

dropping messages when the buffer is full will almost certainly cause consensus to fail

arkpar (Member Author) Mar 20, 2018:

The network handler should not block. If the buffer is full, it means the system simply lacks the performance to handle the messages and consensus will fail anyway. Blocking here would just move the congestion to a lower level.

Contributor:

I disagree -- we could get 65 messages all at once and then one will be dropped. But if those are the last messages we were going to receive then it's fine to let the agreement logic process them slowly. It isn't fine to drop the last commit message, for example.

Member Author:

Increased the buffer to 65536 for now. Unfortunately it is preallocated.

Contributor:

worrying but workable for PoC-1. This is something that should fail on OOM and not before

Member Author:

I'll address that in the following PR, probably by replacing multiqueue with a vector of mpsc channels.

@@ -23,7 +25,7 @@ error_chain! {
}

links {
Client(client::error::Error, client::error::ErrorKind);
Client(client::error::Error, client::error::ErrorKind) #[doc="Blockchain error"];
rphmeier (Contributor) Mar 20, 2018:

is it always a blockchain error?

arkpar added the A0-please_review (Pull request needs code review.) label and removed the A3-in_progress label on Mar 21, 2018
gavofyork (Member):

looks reasonable to me

@@ -140,7 +140,7 @@ pub trait PolkadotApi {

/// A checked block ID used for the substrate-client implementation of CheckedBlockId;
#[derive(Debug, Clone, Copy)]
pub struct CheckedId(BlockId);
pub struct CheckedId(pub BlockId);
Contributor:

making this pub kind of defeats the purpose of the interface
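
A minimal sketch of the alternative implied here: keep the inner BlockId private so a CheckedId can only come out of the checking constructor, and expose it through an accessor rather than a pub field (the constructor body and the BlockId definition below are assumptions):

```rust
// Stand-in for the real BlockId.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct BlockId(pub u64);

/// A checked block ID; the field stays private so values can only be
/// produced through `check`.
#[derive(Debug, Clone, Copy)]
pub struct CheckedId(BlockId);

impl CheckedId {
    /// The only way to obtain a CheckedId: validate the block ID first.
    pub fn check(id: BlockId) -> Option<CheckedId> {
        // Real validation would consult the client; assume even IDs exist.
        if id.0 % 2 == 0 { Some(CheckedId(id)) } else { None }
    }

    /// Read-only access to the underlying, already-checked BlockId.
    pub fn block_id(&self) -> &BlockId {
        &self.0
    }
}

fn main() {
    let checked = CheckedId::check(BlockId(42)).expect("unknown block");
    println!("checked block id: {:?}", checked.block_id());
}
```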


pub fn on_statement(&mut self, peer_id: PeerId, statement: message::Statement) {
if let Some(ref mut peer) = self.peers.get_mut(&peer_id) {
// TODO: validate signature?
rphmeier (Contributor) Mar 26, 2018:

also TODO (but beyond this PR): handle only current parent-hash statements

Member Author:

This will be addressed in an upcoming PR


pub fn on_bft_message(&mut self, peer_id: PeerId, message: message::BftMessage) {
if self.peers.contains_key(&peer_id) {
// TODO: validate signature?
Contributor:

aren't the signatures being checked? I thought you mentioned that to me

Contributor:

(it's done in process_message on the other side of this stream)

arkpar (Member Author) Mar 26, 2018:

Yeah, but eventually there has to be a check here as well to avoid propagating invalid messages. Or it should be moved here.

rphmeier added the A5-grumble label and removed the A0-please_review label on Mar 26, 2018
arkpar added the A0-please_review label and removed the A5-grumble label on Mar 26, 2018
rphmeier merged commit 8561d59 into master on Mar 30, 2018
rphmeier deleted the ark-consensus-network branch on Mar 30, 2018, 11:54
JoshOrndorff added a commit to moonbeam-foundation/substrate that referenced this pull request Apr 21, 2021
* Fix and re-enable ts-block tests

* Update frontier checkout

* Runtime impl_version bump

* cargo update

* Remove comments

Co-authored-by: Joshy Orndorff <[email protected]>
iStrike7 referenced this pull request in gasp-xyz/substrate Aug 3, 2021
liuchengxu pushed a commit to chainx-org/substrate that referenced this pull request Aug 23, 2021
liuchengxu pushed a commit to chainx-org/substrate that referenced this pull request Aug 23, 2021
* refactor for multisig

remove doublemap for multisig

* Feature/linked node (paritytech#79)

provide linked node data struct for runtime module storage
this linked node support option data or no option data

* provide linkednode struct in cxsupport
* refactor linked node
* add option for LinkedNode
* refactor linkednode
      remove template mode, use associate type to replace it

* Fix static net address (paritytech#80)

* Feature/linked_node provide multinodexindex (paritytech#82)

provide multinodexindex

* fix bug for linked_node

when add same index node, do nothing for this node

* refactor financialrecords to support withdraw cache

refactor financialrecords to support withdraw cache and remove deposit fee

* btc bridge

* rename num/hash relationship data (NumberForHash/HashsForNumber)
* let HashsForNumber map to a vec to get all forked block
* add blocknumber in BlockHeaderFor

* tokenbalances

refactor tokenbalances to support issue token in genesis

* reject issue chainx token and provide u32, usize as for tokenbalance

* Perfect deposit  (paritytech#83)

* Add deposit cache

* Perfect deposit

* Perfect withdraw (paritytech#84)

* Perfect withdraw

* add network check in btc bridge

* when meet testnet, jump header bit check
* check the bit change block in genesis

* Fix test build

* Feature/refactor match (paritytech#86)

*  matchorder and pendingorders

* Fix op_return account id parse

* tokenbalances: provide reserved type for reservedtoken

* Fix merge error

* update genesis_config

* Update genesis_config

* x-btc

* provide  codec for btreemap

due to orphan for mod, use local struct named `CodecBTreeMap`

* Update latest bitcoin-rust dependeces

* Implement initial vote weight via coin age (paritytech#87)

* Use struct instead of map as much as possible

* Unchecked initial version

* All intentions except these validators are candidates

* Add harsh test

* Put candidates into stats

* Rename unstake to deactive

* Revert StakeWeight

* Remove useless code

* Remove MAX_INTENTIONS

* Refactor with btreemap (paritytech#88)

* Refactor NominationRecordsOf to NominationRecords using BTreeMap

* Remove candidate_count

* Rename deactive to deactivate

* optimization match (paritytech#89)

* remove ensureaccountliquid in tokenbalances and support

for ensureaccountliquied has changed in staking module

* Hotfix/fix stakeweight type (paritytech#90)

* Revert StakeWeight type

* Change Balance to u64

* Fix total_minted and remove valid_cands_weight

* Fix insert registration information failure (paritytech#91)

* Change receive_address type

* Update exterbn

* update secp256k1 dependency (paritytech#94)

* Fix receive_address bug

* Support new substrate in cxrml

* update rust-secp256k1 dependeces

* Runtime build ok

* Build ok

* New runtime interface

* Update all runtime module

* Runtime build ok

* All build ok

* Add node runtime to support chainx runtime

* Update new runtime
liuchengxu pushed a commit to autonomys/substrate that referenced this pull request Jun 3, 2022
Labels: A0-please_review (Pull request needs code review.)
Projects: None yet
3 participants