PVF: include polkadot's version into the artifact path #695

mrcnski · 2023-03-13T16:39:46Z

ISSUE

Overview

Compilation artifact filenames should contain the polkadot version. This way if there is a node upgrade, an old version can't override an artifact for the new node.

mrcnski · 2023-03-14T13:20:00Z

We may also want to include a hash per run; see here:

paritytech/polkadot#6861 (comment)

mrcnski · 2023-08-08T09:32:48Z

We may also want to include a hash per run; see here:

#6861 (comment)

Note this is only if we don't want artifacts to persist across restarts. But I think most of the issues with reusing artifacts across restarts would be addressed by this issue and by #677. See #685

eagr · 2023-10-08T04:21:34Z

I presume no one is working on this right? @s0me0ne-unkn0wn @mrcnski

I dug around the code a bit. On node startup, all existing artifacts will be nuked. Then a host will spin up a worker to compile the PVF and it writes to a tmp file on completion on success. Later, the host will be notified and rename the tmp file to the dest. But the artifact is only executed if it is marked Prepared. Please correct me if I got it wrong.

I could imagine there will be some corner cases, like a host restarting during PVF compilation and purging the caches before compilation is completed. But even that wouldn’t be a problem if the host doesn’t deem the artifact prepared, which doesn’t seem to be the case.

Please help me understand why are these needed. Like in what scenario this could otherwise be a problem? Or is this some kind of precaution?

s0me0ne-unkn0wn · 2023-10-08T10:29:59Z

@eagr, the initial idea comes from an incident we had some time ago. Worker binaries were not separate then and spawned from the main polkadot binary. There was a validator whose operator upgraded his node in the wrong way. He overwrote the polkadot binary with a new version and didn't restart the node. There was an artifact in cache generated by the old node version. The new node came with a new version of Wasmtime that introduced some breaking changes, and that new version of Wasmtime wasn't able to execute artifacts produced by older versions. So, when it came to the execution, the node spawned an execution worker from the new polkadot binary, Wasmtime refused to execute the old artifact, validation failed, and the validator started a dispute. As the dispute obviously concluded invalid, the validator might be slashed.

That's some pre-history of why we wanted to include the node version into the artifact path initially. The workers have been split out since then, and we have a version control between node and workers, so this measure does not make as much sense for that very problem now. But now there're discussions on preserving artifacts between runs. Preparation is a very resource-heavy process, and in the light of on-demand parachains upcoming, it would be good to cut on resource usage, so if artifacts are reusable, why not reuse them? But we must ensure they're really reusable, that is, produced with a proper node version.

Hope it helps, and maybe @mrcnski might add something, I'm not sure I remember all the background and reasoning behind this issue.

eagr · 2023-10-08T13:30:36Z

But now there're discussions on preserving artifacts between runs.

With that change, these additions definitely make sense now. Thanks for the elaboration! @s0me0ne-unkn0wn

mrcnski · 2023-10-09T13:12:47Z

Good question @eagr and great answer @s0me0ne-unkn0wn! Thanks!

closes #695 Could potentially be helpful to preserving caches when applicable, as discussed in #685 kusama address: FvpsvV1GQAAbwqX6oyRjemgdKV11QU5bXsMg9xsonD1FLGK

…h#699) * SNO-324: Enable multiple accounts in outbound channel contract (paritytech#685) * Track nonces by source address * Remove principal from basic outbound channel * Update relayer bindings * Track nonces by origin address * Fix outbound channel tests * SNO-325: Enable multiple accounts in inbound channel pallet (paritytech#687) * Add origin field to message decoding * Track nonces by message origin * Include origin address in basic channel message id * Update basic inbound fixture data * Add note about updating inbound channel test data * Mention message data encoding in comment * Update test data in envelope test * Remove channel id from message id * s/origin/user/g * Swap the order of source & user * Reword test data generation section * s/value/nonce * Use named fields in MessageId enum * Switch to Twox64Concat hasher This prevents accidental expensive lookups while remaining relatively fast vs Blake2_128 and resistant to attacks that cause prefix collisions, thanks to the security of keccak used to create the Ethereum address. * Remove rogue print statement * SNO-326: Forward messages by address in relayer (paritytech#695) * Remove redundant slice * Add initial address filtering * Remove extra nonce var * Clean up basic channel address config * Rename parachain relayer account config * Remove unused method * Fix up comments * Add default eth addresses for basic channel * Improve map log readability * Fix parachain writer error messages * Bump lodestar version in example command * Switch to default eth account for E2E script * Increase default timeout from 6m 40s to 25m This helps with some of the longer running tests that are timing out when given 400s. * Rename mapping to nonce * Rename origin to account in basic outbound channel * Remove principal from outbound channel contract * Rename user to account in basic inbound channel * Fix event account field rename in test

* upgrade substrate to polkadot-v0.9.22 cargo update -p sp-std --precise 616d33ea23bab86cafffaf116fc607b6790fb4eb * replace jsonrpc-* dependencies by jsonrpsee * wip * ref: migrate to jsonrpsee * migrate template to jsonrpsee * fix clippy warnings * fix rust tests * fix manual seal compilation * fmt * Add ethereum style subscription id provider * ref: not use atomic * Use ethereum style subscription id provider in manual-seal mode too * fix manual seal build * fix: subscription id should be prefixed by 0x * for some reason, the extrinsic gas limit change * pin substrate 6c14be4 * fix build

mrcnski assigned s0me0ne-unkn0wn Mar 13, 2023

mrcnski mentioned this issue Aug 24, 2023

PVF: Throw error if worker process cannot spawn necessary threads #659

Open

Sophia-Gold transferred this issue from paritytech/polkadot Aug 24, 2023

eagr mentioned this issue Oct 9, 2023

Include polkadot version in artifact path #1828

Merged

bkchr closed this as completed in #1828 Oct 15, 2023

bkchr pushed a commit that referenced this issue Oct 15, 2023

Include polkadot version in artifact path (#1828)

9e14470

closes #695 Could potentially be helpful to preserving caches when applicable, as discussed in #685 kusama address: FvpsvV1GQAAbwqX6oyRjemgdKV11QU5bXsMg9xsonD1FLGK

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

PVF: include polkadot's version into the artifact path #695

PVF: include polkadot's version into the artifact path #695

mrcnski commented Mar 13, 2023

mrcnski commented Mar 14, 2023

mrcnski commented Aug 8, 2023

eagr commented Oct 8, 2023

s0me0ne-unkn0wn commented Oct 8, 2023

eagr commented Oct 8, 2023

mrcnski commented Oct 9, 2023

PVF: include polkadot's version into the artifact path #695

PVF: include polkadot's version into the artifact path #695

Comments

mrcnski commented Mar 13, 2023

ISSUE

Overview

mrcnski commented Mar 14, 2023

mrcnski commented Aug 8, 2023

eagr commented Oct 8, 2023

s0me0ne-unkn0wn commented Oct 8, 2023

eagr commented Oct 8, 2023

mrcnski commented Oct 9, 2023