Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Add metrics to measure compiled-contract cache hits/misses for chunk production vs. chunk validation. #11178

Merged
merged 24 commits into from
May 6, 2024

Conversation

tayfunelmas
Copy link
Contributor

@tayfunelmas tayfunelmas commented Apr 29, 2024

As part of #11125, we add metrics to count compiled-contract cache hits/misses.

We want to track the use of compiled-contract cache for apply-chunk operations. With stateless validation, apply-chunk will be executed for two different purposes: (1) while updating the shards by chunk producers as part of tracking shards and (2) while validating the chunk-witness in stateless validation (they do not necessarily track the shards they validate). Since the chunk validators are stateless, they will need to recompile contracts that they see anew unless they have already cached the previous compilation results.

We want to closely monitor the trend for chunk validators, thus we need to distinguish the apply-chunk operations executed for chunk producers vs chunk validators.

In this PR, we introduce a new enum to make this distinction and propagate it through apply-chunk call stack up to the point VM is called. Then we use the Metrics structure exported by the VM.run method to report the metrics.

@tayfunelmas tayfunelmas requested a review from nagisa April 29, 2024 17:47
@tayfunelmas
Copy link
Contributor Author

Adding @nagisa to see if this approach makes sense.

@tayfunelmas tayfunelmas marked this pull request as ready for review April 29, 2024 18:52
@tayfunelmas tayfunelmas requested a review from a team as a code owner April 29, 2024 18:52
@tayfunelmas tayfunelmas deleted the compile-metric branch April 29, 2024 19:22
@tayfunelmas tayfunelmas restored the compile-metric branch April 29, 2024 19:23
@tayfunelmas tayfunelmas reopened this Apr 29, 2024
runtime/near-vm-runner/src/near_vm_runner/metrics.rs Outdated Show resolved Hide resolved
runtime/near-vm-runner/src/near_vm_runner/runner.rs Outdated Show resolved Hide resolved
runtime/near-vm-runner/src/near_vm_runner/metrics.rs Outdated Show resolved Hide resolved
r
}
res => res,
res => {
metrics.report(
Copy link
Contributor Author

@tayfunelmas tayfunelmas Apr 30, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

moved this metrics.report call (for when we can find the compiled code in cache even without the contract code) here to avoid double counting cache lookups. metrics reset when running VM.run but are reported before the reset. please check.

Copy link
Collaborator

@nagisa nagisa May 1, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We still want to report metrics even if there is a failure in execution, so something smarter needs to be done... but I'm also not-quite following the concern, zero out the values as necessary when reporting and/or returning the Metrics?

Copy link
Contributor Author

@tayfunelmas tayfunelmas May 1, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got it. Updated the code to avoid double counting and only invoking the report_metrics once. I call reset_metrics at the beginning and report_metrics at the end (to cover both calls to VM.run). PTAL.

#[derive(Default, Copy, Clone)]
pub struct Metrics {
near_vm_compilation_time: Duration,
wasmtime_compilation_time: Duration,
/// True iff the runtime checked the compiled contract cache and found already-compiled code.
compiled_contract_cache_hit: Option<bool>,
Copy link
Collaborator

@nagisa nagisa May 1, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd use a usize that counts the count. It is less error prone this way (e.g. what happens if the code is adjusted to access the cache multiple times?)

It might also solve your concerns with double counting if we make fn report operate on the value inside the TLS... So we would change the return value of run back to only return an VMResult, and then change fn report to be a static method or something that we would operate on the TLS value much like the increment functions do.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I like that. I changed report to a static function (renamed to report_metrics) and also renamed reset to reset_metrics. And calling reset_metrics and report_metrics once in the VM runner (outside of VM.run). PTAL.

Copy link
Collaborator

@nagisa nagisa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@tayfunelmas tayfunelmas enabled auto-merge May 2, 2024 01:17
@tayfunelmas tayfunelmas added this pull request to the merge queue May 6, 2024
Copy link

codecov bot commented May 6, 2024

Codecov Report

Attention: Patch coverage is 92.70833% with 7 lines in your changes are missing coverage. Please review.

Project coverage is 71.21%. Comparing base (a0c1466) to head (b129a73).

Files Patch % Lines
runtime/near-vm-runner/src/metrics.rs 90.56% 3 Missing and 2 partials ⚠️
core/primitives-core/src/apply.rs 87.50% 1 Missing ⚠️
tools/state-viewer/src/commands.rs 50.00% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master   #11178      +/-   ##
==========================================
+ Coverage   71.19%   71.21%   +0.01%     
==========================================
  Files         783      784       +1     
  Lines      155172   155243      +71     
  Branches   155172   155243      +71     
==========================================
+ Hits       110471   110549      +78     
+ Misses      39960    39955       -5     
+ Partials     4741     4739       -2     
Flag Coverage Δ
backward-compatibility 0.24% <0.00%> (-0.01%) ⬇️
db-migration 0.24% <0.00%> (-0.01%) ⬇️
genesis-check 1.41% <0.00%> (-0.01%) ⬇️
integration-tests 37.00% <83.33%> (+0.09%) ⬆️
linux 69.19% <90.62%> (+<0.01%) ⬆️
linux-nightly 70.68% <92.70%> (+0.03%) ⬆️
macos 53.12% <20.48%> (-0.02%) ⬇️
pytests 1.63% <0.00%> (-0.01%) ⬇️
sanity-checks 1.42% <0.00%> (-0.01%) ⬇️
unittests 65.95% <77.89%> (+0.01%) ⬆️
upgradability 0.29% <0.00%> (-0.01%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Merged via the queue into near:master with commit 72a159b May 6, 2024
29 checks passed
@tayfunelmas tayfunelmas deleted the compile-metric branch May 6, 2024 22:11
jancionear pushed a commit to jancionear/nearcore that referenced this pull request May 13, 2024
<p>This PR was automatically created by Snyk using the credentials of a
real user.</p><br /><h3>Snyk has created this PR to upgrade react-router
from 6.21.1 to 6.21.3.</h3>

:information_source: Keep your dependencies up-to-date. This makes it
easier to fix existing vulnerabilities and to more quickly identify and
fix newly disclosed vulnerabilities when they affect your project.
<hr/>

- The recommended version is **4 versions** ahead of your current
version.
- The recommended version was released **a month ago**, on 2024-01-18.


<details>
<summary><b>Release notes</b></summary>
<br/>
  <details>
    <summary>Package name: <b>react-router</b></summary>
    <ul>
      <li>
<b>6.21.3</b> - <a
href="https://snyk.io/redirect/github/remix-run/react-router/releases/tag/react-router-native%406.21.3">2024-01-18</a></br><p>[email protected]</p>
      </li>
      <li>
<b>6.21.3-pre.0</b> - <a
href="https://snyk.io/redirect/github/remix-run/react-router/releases/tag/react-router-native%406.21.3-pre.0">2024-01-16</a></br><p>[email protected]</p>
      </li>
      <li>
        <b>6.21.2</b> - 2024-01-11
      </li>
      <li>
        <b>6.21.2-pre.0</b> - 2024-01-09
      </li>
      <li>
        <b>6.21.1</b> - 2023-12-21
      </li>
    </ul>
from <a
href="https://snyk.io/redirect/github/remix-run/react-router/releases">react-router
GitHub release notes</a>
  </details>
</details>


<details>
  <summary><b>Commit messages</b></summary>
  </br>
  <details>
    <summary>Package name: <b>react-router</b></summary>
    <ul>
<li><a
href="https://snyk.io/redirect/github/remix-run/react-router/commit/44b391ad27dacc07bb1298bd79315c00845d5a2c">44b391a</a>
chore: Update version for release (near#11203)</li>
<li><a
href="https://snyk.io/redirect/github/remix-run/react-router/commit/82c48435350d239b376e435e8693ea51296c5848">82c4843</a>
Exit prerelease mode</li>
<li><a
href="https://snyk.io/redirect/github/remix-run/react-router/commit/1551285c2e821994d01b0a86cd2839c63898dabc">1551285</a>
Change release date</li>
<li><a
href="https://snyk.io/redirect/github/remix-run/react-router/commit/8fc701a910788d69c358565c974665384ca462fb">8fc701a</a>
Prep release notes</li>
<li><a
href="https://snyk.io/redirect/github/remix-run/react-router/commit/b4bf9271bc3e2a5a1fec73485392563cf3c8a7e2">b4bf927</a>
chore: Update version for release (pre) (near#11196)</li>
<li><a
href="https://snyk.io/redirect/github/remix-run/react-router/commit/84e5fb9f6df09e8324e43541d2ab660c383a025e">84e5fb9</a>
Enter prerelease mode</li>
<li><a
href="https://snyk.io/redirect/github/remix-run/react-router/commit/858f6f125d4dad94483f87fdfb8cc23cbbe68d59">858f6f1</a>
Merge branch &#x27;main&#x27; into release-next</li>
<li><a
href="https://snyk.io/redirect/github/remix-run/react-router/commit/b541b8dd5a582b1e95a2fd0111eeac2ab92f738b">b541b8d</a>
Fix NavLink isPending with a basename (near#11195)</li>
<li><a
href="https://snyk.io/redirect/github/remix-run/react-router/commit/552662ab867fd09e865adb33bb6631207a16ca53">552662a</a>
Remove leftover &#x60;unstable_&#x60; prefix from
&#x60;Blocker&#x60;/&#x60;BlockerFunction&#x60; types (near#11187)</li>
<li><a
href="https://snyk.io/redirect/github/remix-run/react-router/commit/04d653bcdb5b641485e4b9081fe196cfeac12716">04d653b</a>
Copy remix script to remove prereleases from changelogs (near#11183)</li>
<li><a
href="https://snyk.io/redirect/github/remix-run/react-router/commit/6837a1111a8ae1c60ab665b9ff2d9eb20b2f2f40">6837a11</a>
Fix date in changelog</li>
<li><a
href="https://snyk.io/redirect/github/remix-run/react-router/commit/c418410cb8b7277655d51e8ad2756059ad19fbeb">c418410</a>
Merge branch &#x27;release-next&#x27; into dev</li>
<li><a
href="https://snyk.io/redirect/github/remix-run/react-router/commit/274027eba64e614442106678c5a0676834202491">274027e</a>
Fix date in changelog</li>
<li><a
href="https://snyk.io/redirect/github/remix-run/react-router/commit/6be8e6c57d347c2dd975fa76b5aef6d39fd71d0e">6be8e6c</a>
Merge branch &#x27;release-next&#x27;</li>
<li><a
href="https://snyk.io/redirect/github/remix-run/react-router/commit/4e528c0bcee338c8f26d948affb3092257f6ee85">4e528c0</a>
chore: Update version for release (near#11182)</li>
<li><a
href="https://snyk.io/redirect/github/remix-run/react-router/commit/c8742d464d539bfdad452d59b7d5ce55b3628980">c8742d4</a>
Exit prerelease mode</li>
<li><a
href="https://snyk.io/redirect/github/remix-run/react-router/commit/f9780210d84257c21f628c31de165e91f3474263">f978021</a>
Prep release notes</li>
<li><a
href="https://snyk.io/redirect/github/remix-run/react-router/commit/cfce7e61655b46f7894fa8cadd336a66d58d1a00">cfce7e6</a>
Unbold date</li>
<li><a
href="https://snyk.io/redirect/github/remix-run/react-router/commit/038ad94452d81d882f1042395b28b0d80c3b4f40">038ad94</a>
Add dates to prior releases in changelog</li>
<li><a
href="https://snyk.io/redirect/github/remix-run/react-router/commit/c2c0826c6e4a9e213b4f43bc4e95d4f85bf2d96e">c2c0826</a>
chore: Update version for release (pre) (near#11178)</li>
<li><a
href="https://snyk.io/redirect/github/remix-run/react-router/commit/5cd3a45d625d947c795bd85aab8026a5e6b29614">5cd3a45</a>
Enter prerelease mode</li>
<li><a
href="https://snyk.io/redirect/github/remix-run/react-router/commit/03ce0377f5bed5fca3eb94d875c33eab9414b116">03ce037</a>
Merge branch &#x27;main&#x27; into release-next</li>
<li><a
href="https://snyk.io/redirect/github/remix-run/react-router/commit/0ef18963d274ea78592479cab9a48be2c5ec3c6a">0ef1896</a>
Do not attempt to deserialize empty json responses (near#11164)</li>
<li><a
href="https://snyk.io/redirect/github/remix-run/react-router/commit/e0d106bbe765ef12d3ad31e223127e2d08cab8e1">e0d106b</a>
Leverage useId for internal fetcher keys when available (near#11166)</li>
    </ul>

<a
href="https://snyk.io/redirect/github/remix-run/react-router/compare/08cda17f450ebe19481be3fc080d243ec5ef509f...44b391ad27dacc07bb1298bd79315c00845d5a2c">Compare</a>
  </details>
</details>
<hr/>

**Note:** *You are seeing this because you or someone else with access
to this repository has authorized Snyk to open upgrade PRs.*

For more information: <img
src="https://api.segment.io/v1/pixel/track?data=eyJ3cml0ZUtleSI6InJyWmxZcEdHY2RyTHZsb0lYd0dUcVg4WkFRTnNCOUEwIiwiYW5vbnltb3VzSWQiOiI1YjE0ODcwNy0yNzZhLTQ2M2EtOGU0Mi01YTQ2ZGRlOWViNWEiLCJldmVudCI6IlBSIHZpZXdlZCIsInByb3BlcnRpZXMiOnsicHJJZCI6IjViMTQ4NzA3LTI3NmEtNDYzYS04ZTQyLTVhNDZkZGU5ZWI1YSJ9fQ=="
width="0" height="0"/>

🧐 [View latest project
report](https://app.snyk.io/org/ecp88/project/98480bdc-d80b-4fd1-89d7-c4c56a706763?utm_source&#x3D;github&amp;utm_medium&#x3D;referral&amp;page&#x3D;upgrade-pr)

🛠 [Adjust upgrade PR
settings](https://app.snyk.io/org/ecp88/project/98480bdc-d80b-4fd1-89d7-c4c56a706763/settings/integration?utm_source&#x3D;github&amp;utm_medium&#x3D;referral&amp;page&#x3D;upgrade-pr)

🔕 [Ignore this dependency or unsubscribe from future upgrade
PRs](https://app.snyk.io/org/ecp88/project/98480bdc-d80b-4fd1-89d7-c4c56a706763/settings/integration?pkg&#x3D;react-router&amp;utm_source&#x3D;github&amp;utm_medium&#x3D;referral&amp;page&#x3D;upgrade-pr#auto-dep-upgrades)

<!---
(snyk:metadata:{"prId":"5b148707-276a-463a-8e42-5a46dde9eb5a","prPublicId":"5b148707-276a-463a-8e42-5a46dde9eb5a","dependencies":[{"name":"react-router","from":"6.21.1","to":"6.21.3"}],"packageManager":"npm","type":"auto","projectUrl":"https://app.snyk.io/org/ecp88/project/98480bdc-d80b-4fd1-89d7-c4c56a706763?utm_source=github&utm_medium=referral&page=upgrade-pr","projectPublicId":"98480bdc-d80b-4fd1-89d7-c4c56a706763","env":"prod","prType":"upgrade","vulns":[],"issuesToFix":[],"upgrade":[],"upgradeInfo":{"versionsDiff":4,"publishedDate":"2024-01-18T19:49:25.374Z"},"templateVariants":[],"hasFixes":false,"isMajorUpgrade":false,"isBreakingChange":false,"priorityScoreList":[]})
--->

Co-authored-by: snyk-bot <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants