Skip to content
This repository has been archived by the owner on Nov 6, 2020. It is now read-only.

Fix compiler warning (that will become an error) #10683

Merged
merged 6 commits into from
May 23, 2019

Conversation

dvdplm
Copy link
Collaborator

@dvdplm dvdplm commented May 21, 2019

Fixes #10648

Depends on #10689

I'm not sure this fix is Good™ but rustc seems happy enough.

There's a deeper issue which may or may not be related to this: the Engine is not shutdown properly and the StepService thread keeps running indefinitely after Ctrl-C (so update_sealing() is called repeatedly for 300sec). I don't think this is related to Clique as I've seen this happen on mainnet as well, but I wonder if the effects of it are worse for a PoA network where the node can create new blocks all on its own?

dvdplm added 2 commits May 21, 2019 11:57
Fixes #10648

I'm not sure this fix is Good™ but rustc seems happy enough.

There's a deeper issue which may or may not be related to this: the Engine is not shutdown properly and the `StepService` thread keeps running indefinitely after Ctrl-C (so `update_sealing()` is called repeatedly for 300sec). I don't think this is related to Clique as I've seen this happen on mainnet as well, but I wonder if the effects of it are worse for a PoA network where the node can create new blocks all on its own?
@dvdplm dvdplm changed the title Thank you for your Pull Request! Fix compiler warning (that will become an error) May 21, 2019
@dvdplm dvdplm requested review from niklasad1 and tomusdrw May 21, 2019 20:47
@dvdplm dvdplm self-assigned this May 21, 2019
@dvdplm dvdplm added the A0-pleasereview 🤓 Pull request needs code review. label May 21, 2019
engine.step_service = Some(StepService::start(Arc::downgrade(&res) as Weak<Engine<_>>));
let engine_weak = Arc::downgrade(&engine.clone());
let step_service = StepService::start(engine_weak as Weak<Engine<_>>);
Arc::get_mut(&mut engine).map(|e| e.step_service = Some(step_service));
Copy link
Collaborator

@niklasad1 niklasad1 May 22, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this will ever work because we got two references to the same value, https://doc.rust-lang.org/std/sync/struct.Arc.html#method.get_mut

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, it compiles with no warning and runs (seemingly) ok! It feels iffy for sure though.

Do you think it works because of the 5sec delay in the thread started by StepService?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, indeed this shouldn't work - can we replace map with expect to make sure it's always there (cause that's what we assume).
Maybe let's separate creation of StepService from actually starting the inner thread? So that we can do:

let mut engine = ...;
let step_service = StepService::new();
engine.step_service = step_service;
let engine = Arc::new(engine);
let weak = Arc::downgrade(&engine);
engine.step_service.start(weak);

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tomusdrw excellent idea, I'll see where it takes me. Thanks!

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

engine.step_service.start(weak); does not work as long as step_service.start() takes &mut self (can't move out of an Arc) so need some tinkering. Hmm.

Copy link
Collaborator

@tomusdrw tomusdrw May 22, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dvdplm because you can't store the thread handle? Maybe put it to Mutex<Option<JoinHandle>>, since it's not hot code anyway, and we just need to store it. That way you will avoid mutability requirement.

Copy link
Collaborator Author

@dvdplm dvdplm May 22, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

because you can't store the thread handle?

Precisely. Using a Mutex most likely works; possibly reducing StepService to an Arc<AtomicBool> (i.e. not storing the thread handle at all) and let the thread run wild and free could work.

I am however pausing work on this for the moment because shutdown is currently broken elsewhere, and that stops the shutdown code in Clique to ever run, so I need to fix that before coming back to this. :/

Copy link
Collaborator

@niklasad1 niklasad1 May 22, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, it could potentially be this https://github.com/paritytech/parity-ethereum/blob/master/ethcore/src/engines/clique/mod.rs#L749?

I expect the Client to aliased there, have you seen the warn print?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@niklasad1 with the other shutdown bug I do see it. Without it, that code is never called.

Copy link
Collaborator

@tomusdrw tomusdrw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good apart from the dArc magic.


if our_params.period > 0 {
engine.step_service = Some(StepService::start(Arc::downgrade(&res) as Weak<Engine<_>>));
let engine_weak = Arc::downgrade(&engine.clone());
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

clone() doesn't seem to be needed here

Suggested change
let engine_weak = Arc::downgrade(&engine.clone());
let engine_weak = Arc::downgrade(&engine);

engine.step_service = Some(StepService::start(Arc::downgrade(&res) as Weak<Engine<_>>));
let engine_weak = Arc::downgrade(&engine.clone());
let step_service = StepService::start(engine_weak as Weak<Engine<_>>);
Arc::get_mut(&mut engine).map(|e| e.step_service = Some(step_service));
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, indeed this shouldn't work - can we replace map with expect to make sure it's always there (cause that's what we assume).
Maybe let's separate creation of StepService from actually starting the inner thread? So that we can do:

let mut engine = ...;
let step_service = StepService::new();
engine.step_service = step_service;
let engine = Arc::new(engine);
let weak = Arc::downgrade(&engine);
engine.step_service.start(weak);

ethcore/src/engines/clique/mod.rs Outdated Show resolved Hide resolved
@tomusdrw tomusdrw added A5-grumble 🔥 Pull request has minor issues that must be addressed before merging. and removed A0-pleasereview 🤓 Pull request needs code review. labels May 22, 2019
@dvdplm dvdplm added A3-inprogress ⏳ Pull request is in progress. No review needed at this stage. and removed A5-grumble 🔥 Pull request has minor issues that must be addressed before merging. labels May 22, 2019
dvdplm added 2 commits May 23, 2019 07:54
* master:
  add_sync_notifier in EthPubSubClient holds on to a Client for too long (#10689)
  Don't panic if extra_data is longer than VANITY_LENGTH (#10682)
  docs: evmbin - Update Rust docs (#10658)
  Remove annoying compiler warnings (#10679)
  Reset blockchain properly (#10669)
The idea here is to avoid using `Arc::get_mut()` (which does not work: fails every time here) and instead trust `drop()` to do the right thing.

This is a conservative change. I think this can be reformed further, e.g. by `impl Drop for StepService` and halt the thread there, or even skip `join()`ing the thread entirely and trust the `AtomicBool` to signal shutdown. I also have doubts abut the `Option<StepService>`: seems a bit much to have an `Option` there and it makes things cumbersome.
@dvdplm
Copy link
Collaborator Author

dvdplm commented May 23, 2019

@tomusdrw @niklasad1 hoisting the commit message here so it's easier to read:

Fix warning, second attempt

The idea here is to avoid using Arc::get_mut() (which does not work: fails every time here) and instead trust drop() to do the right thing.

This is a conservative change. I think this can be reformed further, e.g. by impl Drop for StepService and halt the thread there, or even skip join()ing the thread entirely and trust the AtomicBool to signal shutdown. I also have doubts abut the Option<StepService>: seems a bit much to have an Option there and it makes things cumbersome.

@dvdplm
Copy link
Collaborator Author

dvdplm commented May 23, 2019

To add to the above, I don't think it's valid to run görli with period set to 0 so I think it's best to remove the conditional code that makes the code a bit convoluted here. I'll have a stab at that in a follow-up PR.

@dvdplm dvdplm added A0-pleasereview 🤓 Pull request needs code review. and removed A3-inprogress ⏳ Pull request is in progress. No review needed at this stage. labels May 23, 2019
impl Drop for Client {
fn drop(&mut self) {
if let Some(c) = Arc::get_mut(&mut self.engine) {
c.stop()
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AFAICT Clique is the only engine that actually impls stop(). And the Arc::get_mut here fails every time.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you change https://github.com/paritytech/parity-ethereum/blob/master/ethcore/src/engines/mod.rs#L429 to fn stop(&self) it is broken, because Engine is stored as Arc<Engine> and doesn't support mutation.

I'm sorry that I didn't catch when reviewing it :(

@dvdplm dvdplm requested review from tomusdrw and niklasad1 May 23, 2019 12:27
Copy link
Collaborator

@tomusdrw tomusdrw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good!

@@ -168,7 +168,7 @@ pub struct Clique {
block_state_by_hash: RwLock<LruCache<H256, CliqueBlockState>>,
proposals: RwLock<HashMap<Address, VoteType>>,
signer: RwLock<Option<Box<EngineSigner>>>,
step_service: Option<Arc<StepService>>,
step_service: Option<StepService>,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we disallow period=0 we could get rid of the Option here as well.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that's planned for the follow up PR. :)

@@ -744,14 +756,6 @@ impl Engine<EthereumMachine> for Clique {
}
}

fn stop(&mut self) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we actually need stop on the Engine trait then? Is it used anywhere?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you change https://github.com/paritytech/parity-ethereum/blob/master/ethcore/src/engines/mod.rs#L429 to fn stop(&self) it is broken, because Engine is stored as Arc and doesn't support mutation.

It is not used and it is broken. Either remove it or change it to fn stop(&self)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that's planned for the follow up PR. :)

(What, you a mind reader @tomusdrw ???)

@dvdplm dvdplm merged commit 752031a into master May 23, 2019
@niklasad1 niklasad1 deleted the dp/fix/move-step-service-into-clique branch May 24, 2019 07:53
ordian added a commit that referenced this pull request May 28, 2019
* master:
  Upgrade to parity-crypto 0.4 (#10650)
  new image (#10673)
  Add SealingState; don't prepare block when not ready. (#10529)
  Fix compiler warning (that will become an error) (#10683)
@niklasad1
Copy link
Collaborator

@dvdplm @s3krit I would really want this and https://github.com/paritytech/parity-ethereum/pull/10691/files backported, does it make sense?

@dvdplm
Copy link
Collaborator Author

dvdplm commented Oct 8, 2019

Wow, so they are missing on 2.5? That is unfortunate. :/

@dvdplm
Copy link
Collaborator Author

dvdplm commented Oct 8, 2019

Now I’m curious: how did you notice? And yes, ofc we should back port them. I wonder if adding the labels is enough or...?

@niklasad1
Copy link
Collaborator

niklasad1 commented Oct 8, 2019

@dvdplm just by coincidence, I tried to build the stable branch with the nightly toolchain which failed because of the compiler warning (error now) that you fixed in this PR

EDIT: backported to beta but not stable

niklasad1 pushed a commit that referenced this pull request Oct 8, 2019
* Remove annoying compiler warnings

* Fix compiler warning (that will become an error)

Fixes #10648

I'm not sure this fix is Good™ but rustc seems happy enough.

There's a deeper issue which may or may not be related to this: the Engine is not shutdown properly and the `StepService` thread keeps running indefinitely after Ctrl-C (so `update_sealing()` is called repeatedly for 300sec). I don't think this is related to Clique as I've seen this happen on mainnet as well, but I wonder if the effects of it are worse for a PoA network where the node can create new blocks all on its own?

* Use saturating_sub

* WIP

* Fix warning, second attempt

The idea here is to avoid using `Arc::get_mut()` (which does not work: fails every time here) and instead trust `drop()` to do the right thing.

This is a conservative change. I think this can be reformed further, e.g. by `impl Drop for StepService` and halt the thread there, or even skip `join()`ing the thread entirely and trust the `AtomicBool` to signal shutdown. I also have doubts abut the `Option<StepService>`: seems a bit much to have an `Option` there and it makes things cumbersome.
niklasad1 added a commit that referenced this pull request Oct 23, 2019
* Fix compiler warning (that will become an error) (#10683)

* Remove annoying compiler warnings

* Fix compiler warning (that will become an error)

Fixes #10648

I'm not sure this fix is Good™ but rustc seems happy enough.

There's a deeper issue which may or may not be related to this: the Engine is not shutdown properly and the `StepService` thread keeps running indefinitely after Ctrl-C (so `update_sealing()` is called repeatedly for 300sec). I don't think this is related to Clique as I've seen this happen on mainnet as well, but I wonder if the effects of it are worse for a PoA network where the node can create new blocks all on its own?

* Use saturating_sub

* WIP

* Fix warning, second attempt

The idea here is to avoid using `Arc::get_mut()` (which does not work: fails every time here) and instead trust `drop()` to do the right thing.

This is a conservative change. I think this can be reformed further, e.g. by `impl Drop for StepService` and halt the thread there, or even skip `join()`ing the thread entirely and trust the `AtomicBool` to signal shutdown. I also have doubts abut the `Option<StepService>`: seems a bit much to have an `Option` there and it makes things cumbersome.

* Refactor Clique stepping (#10691)

* Use Drop to shutdown stepper thread
Make period == 0 an error and remove the Option from step_service

* Remove StepService

Remove StepService and spawn the stepping thread in `Clique::new()`. Don't store the thread handle and instead trust the `AtomicBool` to signal shutdown time.
Don't check for `period > 0`: we assume a valid chainspec file.

* Don't shutdown the stepper thread at all, just let it run until exit
Also: fix a few warnings and tests

* Put kvdb_memorydb back

* Warn&exit when engine is dropped
Don't sleep too long!

* Don't delay stepping thread

* Better formatting
@soc1c soc1c mentioned this pull request Nov 5, 2019
36 tasks
s3krit pushed a commit that referenced this pull request Nov 11, 2019
* ropsten #6631425 foundation #8798209 (#11201)
* [stable] builtin, istanbul and mordor testnet backports (#11234)
  * ethcore-builtin (#10850)
  * [builtin]: support `multiple prices and activations` in chain spec (#11039)
  * [chain specs]: activate `Istanbul` on mainnet (#11228)
  * ethcore/res: add mordor testnet configuration (#11200)
* Update list of bootnodes for xDai chain (#11236)
* ethcore: remove `test-helper feat` from build (#11047)
* Secret store: fix Instant::now() related race in net_keep_alive (#11155) (#11159)
* [stable]: backport #10691 and #10683 (#11143)
  * Fix compiler warning (that will become an error) (#10683)
  * Refactor Clique stepping (#10691)
* Add Constantinople eips to the dev (instant_seal) config (#10809)
* Add cargo-remote dir to .gitignore (?)
* Insert explicit warning into the panic hook (#11225)
* Fix docker centos build (#11226)
* Update MIX bootnodes. (#11203)
* Use provided usd-per-eth value if an endpoint is specified (#11209)
* Add new line after writing block to hex file. (#10984)
* Type annotation for next_key() matching of json filter options (#11192) (but no `FilterOption` in 2.5 so…)
* Upgrade jsonrpc to latest (#11206)
* [CI] check evmbin build (#11096)
* Correct EIP-712 encoding (#11092)
* [client]: Fix for incorrectly dropped consensus messages (#11086)
* Fix block detail updating (#11015)
* Switching sccache from local to Redis (#10971)
* Made ecrecover implementation trait public (#11188)
* [dependencies]: jsonrpc `14.0.1` (#11183)
* [receipt]: add `sender` & `receiver` to `RichReceipts` (#11179)
* [ethcore/builtin]: do not panic in blake2pricer on short input (#11180)
* util Host: fix a double Read Lock bug in fn Host::session_readable() (#11175)
* ethcore client: fix a double Read Lock bug in fn Client::logs() (#11172)
* Change how RPCs eth_call and eth_estimateGas handle "Pending" (#11127)
* Cleanup stratum a bit (#11161)
* Upgrade to jsonrpc v14 (#11151)
* SecretStore: expose restore_key_public in HTTP API (#10241)
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
A0-pleasereview 🤓 Pull request needs code review.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

refactor Clique and StepService
3 participants