Skip to content
This repository has been archived by the owner on Nov 15, 2023. It is now read-only.

Update to work with async keystore – Companion PR for #7000 #1740

Merged
55 commits merged into from
Oct 9, 2020

Conversation

rakanalh
Copy link
Contributor

@rakanalh rakanalh commented Sep 21, 2020

Companion for paritytech/substrate#7000

Burnin PR: #1746

@rakanalh rakanalh added A3-in_progress Pull request is in progress. No review needed at this stage. A4-companion A PR that needs a companion PR to merge in parallel for one of its downstream dependencies. labels Sep 21, 2020
@rakanalh rakanalh self-assigned this Sep 21, 2020
@rakanalh rakanalh requested a review from mxinden September 23, 2020 02:39
Copy link
Contributor

@mxinden mxinden left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think I am the right person to approve anything touching Polkadot as I have not been involved in this part of the code base enough to do so.

node/core/provisioner/Cargo.toml Outdated Show resolved Hide resolved
}

#[test]
fn not_more_than_one_per_validator() {
let keystore_path = tempfile::tempdir().expect("Creates keystore path");
let keystore : CryptoStorePtr = Arc::new(LocalKeystore::open(keystore_path.path(), None)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Likely a dump question: Wouldn't a in memory keystore suffice here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good question. However, generating keys without seeds would skip adding those keys to additional (cache) and go straight to writing them on the filesystem. Without configuring the keystore to use the filesystem, adding any key would generate the key and then lose sight of it. This is why i opted for configuring the keystore to write on the filesystem.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Without configuring the keystore to use the filesystem, adding any key would generate the key and then lose sight of it

Could you elaborate? We are storing the pubkeys in the VALIDATORS hashmap though, why do we need a seed? Are you saying that if we replace LocalKeystore::open with LocalKeystore::in_memory the tests will fail?

If so, this warrants a comment in code.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure VALIDATORS here is needed anymore but didn't want to remove it. However, to answer your question, and according to substrate's keystore impl, generating a key without a seed would trigger this function which requires that the keystore's path is configured rather than using the in-memory storage.

I will add a comment about this.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comment added. VALIDATORS removed.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the clarification.
It's still can be avoided maybe by e.g. using dump seeds. But it's only used in tests anyway, so probably not worth time spent.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure VALIDATORS here is needed anymore but didn't want to remove it.

I think it's still needed (we're expecting the same key for the given ValidatorIndex), cc @coriolinus

node/network/bitfield-distribution/src/lib.rs Outdated Show resolved Hide resolved
runtime/parachains/src/inclusion.rs Outdated Show resolved Hide resolved
Copy link
Member

@ordian ordian left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks reasonable.

@ghost
Copy link

ghost commented Oct 8, 2020

Waiting for commit status.

node/core/backing/Cargo.toml Outdated Show resolved Hide resolved
node/core/backing/src/lib.rs Outdated Show resolved Hide resolved
node/core/backing/src/lib.rs Outdated Show resolved Hide resolved
@ghost
Copy link

ghost commented Oct 8, 2020

Head SHA changed; merge aborted.

} else {
None
for (idx, validator) in validators.iter().enumerate() {
if CryptoStore::has_keys(&*keystore, &[(validator.to_raw_vec(), PARACHAIN_KEY_TYPE_ID)]).await {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will be absolutely slow in the moment we have remote signing. Sounds like we need a function that returns all the indices of the keys that it has.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So we only make one remote call.

Copy link
Contributor Author

@rakanalh rakanalh Oct 8, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This needs a change in the way that has_keys works where it should return the index of the items found from the array sent as an argument instead of a bool. This could be implemented as a follow-up PR to both substrate and polkadot. What do you think?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, the idea was to fix this in a follow up pr.

.find_map(|v| keystore.key_pair::<ValidatorPair>(&v).ok())
pub async fn signing_key(validators: &[ValidatorId], keystore: SyncCryptoStorePtr) -> Option<ValidatorId> {
for v in validators.iter() {
if CryptoStore::has_keys(&*keystore, &[(v.to_raw_vec(), ValidatorId::ID)]).await {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as above.

primitives/src/v0.rs Outdated Show resolved Hide resolved
Co-authored-by: Bastian Köcher <[email protected]>
@tomaka tomaka added B0-silent Changes should not be mentioned in any release notes C1-low PR touches the given topic and has a low impact on builders. labels Oct 9, 2020
@gnunicorn gnunicorn added B1-releasenotes and removed B0-silent Changes should not be mentioned in any release notes labels Oct 9, 2020
@gnunicorn gnunicorn changed the title Companion PR for #7000 Update to work with async keystore – Companion PR for #7000 Oct 9, 2020
@rakanalh
Copy link
Contributor Author

rakanalh commented Oct 9, 2020

bot merge

@ghost
Copy link

ghost commented Oct 9, 2020

Trying merge.

@ghost ghost merged commit bc7d132 into paritytech:master Oct 9, 2020
@rakanalh rakanalh deleted the keystore-fixes branch October 9, 2020 10:56
@tomaka
Copy link
Contributor

tomaka commented Oct 9, 2020

gnunicorn added the B1-releasenotes label 3 minutes ago

I don't know what is the right thing, but fwiw I put silent because the Substrate PR already has the appropriate labels.

coriolinus added a commit that referenced this pull request Oct 9, 2020
ghost pushed a commit that referenced this pull request Oct 28, 2020
* start working on building the real overseer

Unfortunately, this fails to compile right now due to an upstream
failure to compile which is probably brought on by a recent upgrade
to rustc v1.47.

* fill in AllSubsystems internal constructors

* replace fn make_metrics with Metrics::attempt_to_register

* update to account for #1740

* remove Metrics::register, rename Metrics::attempt_to_register

* add 'static bounds to real_overseer type params

* pass authority_discovery and network_service to real_overseer

It's not straightforwardly obvious that this is the best way to handle
the case when there is no authority discovery service, but it seems
to be the best option available at the moment.

* select a proper database configuration for the availability store db

* use subdirectory for av-store database path

* apply Basti's patch which avoids needing to parameterize everything on Block

* simplify path extraction

* get all tests to compile

* Fix Prometheus double-registry error

for debugging purposes, added this to node/subsystem-util/src/lib.rs:472-476:

```rust
Some(registry) => Self::try_register(registry).map_err(|err| {
	eprintln!("PrometheusError calling {}::register: {:?}", std::any::type_name::<Self>(), err);
	err
}),
```

That pointed out where the registration was failing, which led to
this fix. The test still doesn't pass, but it now fails in a new
and different way!

* authorities must have authority discovery, but not necessarily overseer handlers

* fix broken SpawnedSubsystem impls

detailed logging determined that using the `Box::new` style of
future generation, the `self.run` method was never being called,
leading to dropped receivers / closed senders for those subsystems,
causing the overseer to shut down immediately.

This is not the final fix needed to get things working properly,
but it's a good start.

* use prometheus properly

Prometheus lets us register simple counters, which aren't very
interesting. It also allows us to register CounterVecs, which are.
With a CounterVec, you can provide a set of labels, which can
later be used to filter the counts.

We were using them wrong, though. This pattern was repeated in a
variety of places in the code:

```rust
// panics with an cardinality mismatch
let my_counter = register(CounterVec::new(opts, &["succeeded", "failed"])?, registry)?;
my_counter.with_label_values(&["succeeded"]).inc()
```

The problem is that the labels provided in the constructor are not
the set of legal values which can be annotated, but a set of individual
label names which can have individual, arbitrary values.

This commit fixes that.

* get av-store subsystem to actually run properly and not die on first signal

* typo fix: incomming -> incoming

* don't disable authority discovery in test nodes

* Fix rococo-v1 missing session keys

* Update node/core/av-store/Cargo.toml

* try dummying out av-store on non-full-nodes

* overseer and subsystems are required only for full nodes

* Reduce the amount of warnings on browser target

* Fix two more warnings

* InclusionInherent should actually have an Inherent module on rococo

* Ancestry: don't return genesis' parent hash

* Update Cargo.lock

* fix broken test

* update test script: specify chainspec as script argument

* Apply suggestions from code review

Co-authored-by: Bastian Köcher <[email protected]>

* Update node/service/src/lib.rs

Co-authored-by: Bastian Köcher <[email protected]>

* node/service/src/lib: Return error via ? operator

* post-merge blues

* add is_collator flag

* prevent occasional av-store test panic

* simplify fix; expand application

* run authority_discovery in Role::Discover when collating

* distinguish between proposer closed channel errors

* add IsCollator enum, remove is_collator CLI flag

* improve formatting

* remove nop loop

* Fix some stuff

Co-authored-by: Andronik Ordian <[email protected]>
Co-authored-by: Bastian Köcher <[email protected]>
Co-authored-by: Fedor Sakharov <[email protected]>
Co-authored-by: Robert Habermeier <[email protected]>
Co-authored-by: Bastian Köcher <[email protected]>
Co-authored-by: Max Inden <[email protected]>
ghost pushed a commit that referenced this pull request Oct 31, 2020
* start working on building the real overseer

Unfortunately, this fails to compile right now due to an upstream
failure to compile which is probably brought on by a recent upgrade
to rustc v1.47.

* fill in AllSubsystems internal constructors

* replace fn make_metrics with Metrics::attempt_to_register

* update to account for #1740

* remove Metrics::register, rename Metrics::attempt_to_register

* add 'static bounds to real_overseer type params

* pass authority_discovery and network_service to real_overseer

It's not straightforwardly obvious that this is the best way to handle
the case when there is no authority discovery service, but it seems
to be the best option available at the moment.

* select a proper database configuration for the availability store db

* use subdirectory for av-store database path

* apply Basti's patch which avoids needing to parameterize everything on Block

* simplify path extraction

* get all tests to compile

* Fix Prometheus double-registry error

for debugging purposes, added this to node/subsystem-util/src/lib.rs:472-476:

```rust
Some(registry) => Self::try_register(registry).map_err(|err| {
	eprintln!("PrometheusError calling {}::register: {:?}", std::any::type_name::<Self>(), err);
	err
}),
```

That pointed out where the registration was failing, which led to
this fix. The test still doesn't pass, but it now fails in a new
and different way!

* authorities must have authority discovery, but not necessarily overseer handlers

* fix broken SpawnedSubsystem impls

detailed logging determined that using the `Box::new` style of
future generation, the `self.run` method was never being called,
leading to dropped receivers / closed senders for those subsystems,
causing the overseer to shut down immediately.

This is not the final fix needed to get things working properly,
but it's a good start.

* use prometheus properly

Prometheus lets us register simple counters, which aren't very
interesting. It also allows us to register CounterVecs, which are.
With a CounterVec, you can provide a set of labels, which can
later be used to filter the counts.

We were using them wrong, though. This pattern was repeated in a
variety of places in the code:

```rust
// panics with an cardinality mismatch
let my_counter = register(CounterVec::new(opts, &["succeeded", "failed"])?, registry)?;
my_counter.with_label_values(&["succeeded"]).inc()
```

The problem is that the labels provided in the constructor are not
the set of legal values which can be annotated, but a set of individual
label names which can have individual, arbitrary values.

This commit fixes that.

* get av-store subsystem to actually run properly and not die on first signal

* typo fix: incomming -> incoming

* don't disable authority discovery in test nodes

* Fix rococo-v1 missing session keys

* Update node/core/av-store/Cargo.toml

* try dummying out av-store on non-full-nodes

* overseer and subsystems are required only for full nodes

* Reduce the amount of warnings on browser target

* Fix two more warnings

* InclusionInherent should actually have an Inherent module on rococo

* Ancestry: don't return genesis' parent hash

* Update Cargo.lock

* fix broken test

* update test script: specify chainspec as script argument

* Apply suggestions from code review

Co-authored-by: Bastian Köcher <[email protected]>

* Update node/service/src/lib.rs

Co-authored-by: Bastian Köcher <[email protected]>

* node/service/src/lib: Return error via ? operator

* post-merge blues

* add is_collator flag

* prevent occasional av-store test panic

* simplify fix; expand application

* run authority_discovery in Role::Discover when collating

* distinguish between proposer closed channel errors

* add IsCollator enum, remove is_collator CLI flag

* improve formatting

* remove nop loop

* Fix some stuff

* Adds test parachain adder collator

* Add sudo to Rococo, change session length to 30 seconds and some renaming

* Update to the latest changes on master

* Some fixes

* Fix compilation

* Update parachain/test-parachains/adder/collator/src/lib.rs

Co-authored-by: Sergei Shulepov <[email protected]>

* Review comments

* Downgrade transaction version

* Fixes

* MOARE

* Register notification protocols

* utils: remove unused error

* av-store: more resilient to some errors

* address review nits

* address more review nits

Co-authored-by: Peter Goodspeed-Niklaus <[email protected]>
Co-authored-by: Andronik Ordian <[email protected]>
Co-authored-by: Fedor Sakharov <[email protected]>
Co-authored-by: Robert Habermeier <[email protected]>
Co-authored-by: Peter Goodspeed-Niklaus <[email protected]>
Co-authored-by: Max Inden <[email protected]>
Co-authored-by: Sergey Shulepov <[email protected]>
Co-authored-by: Sergei Shulepov <[email protected]>
This pull request was closed.
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
A3-in_progress Pull request is in progress. No review needed at this stage. A4-companion A PR that needs a companion PR to merge in parallel for one of its downstream dependencies. C1-low PR touches the given topic and has a low impact on builders.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants