Skip to content
This repository has been archived by the owner on Nov 15, 2023. It is now read-only.

PVF host prechecking support v2 #4123

Merged
merged 24 commits into from
Nov 13, 2021
Merged

Conversation

slumber
Copy link
Contributor

@slumber slumber commented Oct 21, 2021

Make every prepare request single-threaded

Resolves #3857
Based on #3907

NOTE(@pepyakin): this PR disables parallel compilation for preparation. I conducted a couple of tests on different machines. It takes ~0.7-0.8ms to compile the code with parallel_compilation on and around 3.5s without it.

@slumber slumber requested a review from pepyakin October 21, 2021 18:10
@slumber slumber changed the base branch from master to slumber-move-artifact-status-into-memory October 21, 2021 18:11
@github-actions github-actions bot added the A0-please_review Pull request needs code review. label Oct 21, 2021
@slumber slumber added B0-silent Changes should not be mentioned in any release notes C1-low PR touches the given topic and has a low impact on builders. D3-trivial 🧸 PR contains trivial changes in a runtime directory that do not require an audit. labels Oct 21, 2021
Copy link
Contributor

@pepyakin pepyakin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good! However, it could use some tests.

node/core/pvf/src/prepare/mod.rs Outdated Show resolved Hide resolved
node/core/pvf/src/prepare/pool.rs Outdated Show resolved Hide resolved
node/core/pvf/src/prepare/worker.rs Show resolved Hide resolved
node/core/pvf/src/prepare/worker.rs Show resolved Hide resolved
@pepyakin
Copy link
Contributor

/cla run-cla-check

@cla-bot-2021
Copy link

cla-bot-2021 bot commented Oct 22, 2021

Queueing command execution: run-cla-check

@cla-bot-2021
Copy link

cla-bot-2021 bot commented Oct 22, 2021

@pepyakin Command execution has finished.

Base automatically changed from slumber-move-artifact-status-into-memory to master October 22, 2021 16:38
@azerella
Copy link

/cla run-cla-check

@cla-bot-2021
Copy link

cla-bot-2021 bot commented Oct 24, 2021

Queueing command execution: run-cla-check

@cla-bot-2021
Copy link

cla-bot-2021 bot commented Oct 24, 2021

@azerella Command execution has finished.

node/core/pvf/src/host.rs Outdated Show resolved Hide resolved
node/core/pvf/src/executor_intf.rs Show resolved Hide resolved
Copy link
Contributor

@pepyakin pepyakin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good!

Due to the nature of the changes I think we might want to run a burn-in.

@pepyakin pepyakin added this to the v0.9.13 milestone Nov 12, 2021
@rphmeier
Copy link
Contributor

checks failing

@pepyakin pepyakin merged commit 903c6f2 into master Nov 13, 2021
@pepyakin pepyakin deleted the slumber-pvf-host-prechecking-2 branch November 13, 2021 16:26
drahnr pushed a commit that referenced this pull request Nov 15, 2021
* pvf host: store only compiled artifacts on disk

* Correctly handle failed artifacts

* Serialize result of PVF preparation uniquely

* Set the artifact state depending on the result

* Return the result of PVF preparation directly

* Move PrepareError to the error module

* Update doc comments

* Update misleading comment

* pvf host: turn off parallel compilation

* pvf host: implement precheck requests

* Fix warnings

* Unnecessary clone

* Add a note about timed out outcome

* Revert the pool outcome handling behavior

* Move the prepare result type into error mod

* Test prepare done

* fmt

* Add an explanation to wasmtime config

* Split pvf host test

* Add precheck to dictionary

Co-authored-by: Sergei Shulepov <[email protected]>
pepyakin added a commit that referenced this pull request Dec 14, 2021
We wanted to change niceness to accomodate the fact that some of the
preparation tasks are low priority. For example, when a node sees that
there is a new para was onboarded the node may start preparing right
away. Since all other activities are more important, such as network I/O
or validation of the backed candidates and preparation of the
immediatelly needed PVFs.

However, it turned out that this approach does not work: generally
non-root processes can only decrease niceness and they cannot increase
it to the previous value, as was assumed by the code.

Apart from that, #4123
assumes all PVFs are prepared in the same way. Specifically, that if a
PVF preparation failed before, then PVF pre-checking will also report
that it was failed, even though it could happen that preparation failed
due to being low-priority. In order to avoid such cases, we decided to
simplify the whole preparation model. Preparation under low priority
does not work well with that.

Closes #4520
pepyakin added a commit that referenced this pull request Dec 14, 2021
We wanted to change niceness to accomodate the fact that some of the
preparation tasks are low priority. For example, when a node sees that
there is a new para was onboarded the node may start preparing right
away. Since all other activities are more important, such as network I/O
or validation of the backed candidates and preparation of the
immediatelly needed PVFs.

However, it turned out that this approach does not work: generally
non-root processes can only decrease niceness and they cannot increase
it to the previous value, as was assumed by the code.

Apart from that, #4123
assumes all PVFs are prepared in the same way. Specifically, that if a
PVF preparation failed before, then PVF pre-checking will also report
that it was failed, even though it could happen that preparation failed
due to being low-priority. In order to avoid such cases, we decided to
simplify the whole preparation model. Preparation under low priority
does not work well with that.

Closes #4520
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
A0-please_review Pull request needs code review. B0-silent Changes should not be mentioned in any release notes C1-low PR touches the given topic and has a low impact on builders. D3-trivial 🧸 PR contains trivial changes in a runtime directory that do not require an audit.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Validation host support for pre-checking
4 participants