Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(tee): add Prometheus metrics to the TEE Prover #2386

Merged
merged 10 commits into from
Jul 5, 2024

Conversation

pbeza
Copy link
Collaborator

@pbeza pbeza commented Jul 4, 2024

What ❔

This commit adds Prometheus metrics to the TEE Prover. Specifically, the following metrics were added:

  • Waiting time for a new batch to be proven
  • Proof generation time
  • Proof submitting time
  • Network error counter
  • Last block number processed

Why ❔

Setting up Prometheus metrics is a prerequisite before rolling them out to staging and testnet environments. Prometheus metrics are useful for monitoring, providing valuable insights into the running system.

Checklist

  • PR title corresponds to the body of PR (we generate changelog entries from PRs).
  • Tests for the changes have been added / updated.
  • Documentation comments have been added / updated.
  • Code has been formatted via zk fmt and zk lint.

@pbeza pbeza requested review from haraldh and popzxc July 4, 2024 16:42
@pbeza
Copy link
Collaborator Author

pbeza commented Jul 4, 2024

@popzxc @haraldh Please review. Hopefully, this should be an easy one.

I'm not sure if there's an easy way to test it before rolling it out to the staging/testnet environment. If there is, I would appreciate any guidance on how to do it. Thanks!

@pbeza pbeza marked this pull request as ready for review July 4, 2024 16:45
@pbeza pbeza requested a review from RomanBrodetski July 4, 2024 16:47
popzxc
popzxc previously approved these changes Jul 5, 2024
core/bin/zksync_tee_prover/src/api_client.rs Outdated Show resolved Hide resolved
@haraldh haraldh enabled auto-merge July 5, 2024 10:22
@haraldh haraldh added this pull request to the merge queue Jul 5, 2024
Merged via the queue into main with commit 6153e99 Jul 5, 2024
46 checks passed
@haraldh haraldh deleted the tee/tee_prover_metrics branch July 5, 2024 11:00
github-merge-queue bot pushed a commit that referenced this pull request Jul 10, 2024
🤖 I have created a release *beep* *boop*
---


##
[24.9.0](core-v24.8.0...core-v24.9.0)
(2024-07-10)


### Features

* add block timestamp to `eth_getLogs`
([#2374](#2374))
([50422b8](50422b8))
* add revert tests to zk_toolbox
([#2317](#2317))
([c9ad002](c9ad002))
* add zksync_tee_prover and container to nix
([#2403](#2403))
([e0975db](e0975db))
* Adding unstable RPC endpoint to return the execution_info
([#2332](#2332))
([3d047ea](3d047ea))
* **api:** Retry `read_value`
([#2352](#2352))
([256a43c](256a43c))
* Base Token Fundamentals
([#2204](#2204))
([39709f5](39709f5))
* **base-token:** Base token price ratio cache update frequency
configurable
([#2388](#2388))
([fb4d700](fb4d700))
* BWIP ([#2258](#2258))
([75bdfcc](75bdfcc))
* **config:** Make getaway_url optional
([#2412](#2412))
([200bc82](200bc82))
* consensus support for pruning (BFT-473)
([#2334](#2334))
([abc4256](abc4256))
* **contract-verifier:** Add file based config for contract verifier
([#2415](#2415))
([f4410e3](f4410e3))
* **en:** file based configs for en
([#2110](#2110))
([7940fa3](7940fa3))
* **en:** Unify snapshot recovery and recovery from L1
([#2256](#2256))
([e03a929](e03a929))
* **eth-sender:** Add transient ethereum gateway errors metric
([#2323](#2323))
([287958d](287958d))
* **eth-sender:** handle transactions for different operators separately
to increase throughtput
([#2341](#2341))
([0619ecc](0619ecc))
* **eth-sender:** separate gas calculations for blobs transactions
([#2247](#2247))
([627aab9](627aab9))
* **gas_adjuster:** Use eth_feeHistory for both base fee and blobs
([#2322](#2322))
([9985c26](9985c26))
* L1 batch QC database (BFT-476)
([#2340](#2340))
([5886b8d](5886b8d))
* **metadata-calculator:** option to use VM runner for protective reads
([#2318](#2318))
([c147b0c](c147b0c))
* Minimal External API Fetcher
([#2383](#2383))
([9f255c0](9f255c0))
* **node_framework:** Document implementations
([#2319](#2319))
([7b3877f](7b3877f))
* **node_framework:** Implement FromContext and IntoContext derive macro
([#2330](#2330))
([34f2a45](34f2a45))
* **node_framework:** Support shutdown hooks + more
([#2293](#2293))
([2b2c790](2b2c790))
* **node_framework:** Unify Task types + misc improvements
([#2325](#2325))
([298a97e](298a97e))
* **node-framework:** New wiring interface
([#2384](#2384))
([f2f4056](f2f4056))
* **prover:** Add prometheus port to witness generator config
([#2385](#2385))
([d0e1add](d0e1add))
* **prover:** Add prover_cli stats command
([#2362](#2362))
([fe65319](fe65319))
* **snapshots_applier:** Add a method to check whether snapshot recovery
is done ([#2338](#2338))
([610a7cf](610a7cf))
* Switch to using crates.io deps
([#2409](#2409))
([27fabaf](27fabaf))
* **tee:** add Prometheus metrics to the TEE Prover
([#2386](#2386))
([6153e99](6153e99))
* **tee:** TEE Prover Gateway
([#2333](#2333))
([f8df34d](f8df34d))
* Unify and port node storage initialization
([#2363](#2363))
([8ea9791](8ea9791))
* Validium with DA
([#2010](#2010))
([fe03d0e](fe03d0e))
* **vm-runner:** make vm runner report time taken
([#2369](#2369))
([275a333](275a333))
* **zk toolbox:** External node support
([#2287](#2287))
([6384cad](6384cad))
* **zk_toolbox:** Add prover init command
([#2298](#2298))
([159af3c](159af3c))


### Bug Fixes

* **api:** fix log timestamp format
([#2407](#2407))
([e9d63db](e9d63db))
* BWIP race condition
([#2405](#2405))
([8099ab0](8099ab0))
* **config:** Implement proper tests
([#2381](#2381))
([2ec494b](2ec494b))
* **db:** Fix / extend transaction isolation levels
([#2350](#2350))
([404ceb9](404ceb9))
* **en:** Fix panics when queuing sync actions during shutdown
([d5935c7](d5935c7))
* **erc20-test:** only approving baseToken allowance when needed
([#2379](#2379))
([087a3c4](087a3c4))
* **eth-sender:** confirm eth-txs in order of their creation
([#2310](#2310))
([31a1a04](31a1a04))
* **eth-sender:** fix query returning inflight txs
([#2404](#2404))
([6a89ca0](6a89ca0))
* **eth-sender:** missing fix in second query calculating txs unsent txs
([#2406](#2406))
([948b532](948b532))
* **eth-sender:** revert commit changing which type of txs we resend
first ([#2327](#2327))
([ef75292](ef75292))
* Fix rustls setup for jsonrpsee clients
([#2417](#2417))
([a040f09](a040f09))
* **merkle-tree:** Change `LazyAsyncTreeReader::wait()` signature
([#2314](#2314))
([408393c](408393c))
* **merkle-tree:** Fix chunk recovery reporting during tree recovery
([#2348](#2348))
([70b3a8a](70b3a8a))
* **merkle-tree:** Fix connection timeouts during tree pruning
([#2372](#2372))
([d5935c7](d5935c7))
* **object-store:** Consider some token source errors transient
([#2331](#2331))
([85386d3](85386d3))
* **tee:** Introduce a 1 second delay in the batch poll
([#2398](#2398))
([312defe](312defe))
* **vm-runner:** change `processing_started_at` column type to
`timestamp`
([#2397](#2397))
([4221155](4221155))


### Reverts

* "refactor: Rename consensus tasks and split storage (BFT-476)"
([#2364](#2364))
([e67ec5d](e67ec5d))

---
This PR was generated with [Release
Please](https://github.com/googleapis/release-please). See
[documentation](https://github.com/googleapis/release-please#release-please).

---------

Co-authored-by: Lech <[email protected]>
irnb pushed a commit to vianetwork/via-server that referenced this pull request Jul 12, 2024
## What ❔

This commit adds Prometheus metrics to the TEE Prover. Specifically, the
following metrics were added:
- Waiting time for a new batch to be proven
- Proof generation time
- Proof submitting time
- Network error counter
- Last block number processed

## Why ❔

Setting up Prometheus metrics is a prerequisite before rolling them out
to staging and testnet environments. Prometheus metrics are useful for
monitoring, providing valuable insights into the running system.

## Checklist

- [x] PR title corresponds to the body of PR (we generate changelog
entries from PRs).
- [ ] Tests for the changes have been added / updated.
- [ ] Documentation comments have been added / updated.
- [x] Code has been formatted via `zk fmt` and `zk lint`.
irnb pushed a commit to vianetwork/via-server that referenced this pull request Jul 12, 2024
🤖 I have created a release *beep* *boop*
---


##
[24.9.0](matter-labs/zksync-era@core-v24.8.0...core-v24.9.0)
(2024-07-10)


### Features

* add block timestamp to `eth_getLogs`
([matter-labs#2374](matter-labs#2374))
([50422b8](matter-labs@50422b8))
* add revert tests to zk_toolbox
([matter-labs#2317](matter-labs#2317))
([c9ad002](matter-labs@c9ad002))
* add zksync_tee_prover and container to nix
([matter-labs#2403](matter-labs#2403))
([e0975db](matter-labs@e0975db))
* Adding unstable RPC endpoint to return the execution_info
([matter-labs#2332](matter-labs#2332))
([3d047ea](matter-labs@3d047ea))
* **api:** Retry `read_value`
([matter-labs#2352](matter-labs#2352))
([256a43c](matter-labs@256a43c))
* Base Token Fundamentals
([matter-labs#2204](matter-labs#2204))
([39709f5](matter-labs@39709f5))
* **base-token:** Base token price ratio cache update frequency
configurable
([matter-labs#2388](matter-labs#2388))
([fb4d700](matter-labs@fb4d700))
* BWIP ([matter-labs#2258](matter-labs#2258))
([75bdfcc](matter-labs@75bdfcc))
* **config:** Make getaway_url optional
([matter-labs#2412](matter-labs#2412))
([200bc82](matter-labs@200bc82))
* consensus support for pruning (BFT-473)
([matter-labs#2334](matter-labs#2334))
([abc4256](matter-labs@abc4256))
* **contract-verifier:** Add file based config for contract verifier
([matter-labs#2415](matter-labs#2415))
([f4410e3](matter-labs@f4410e3))
* **en:** file based configs for en
([matter-labs#2110](matter-labs#2110))
([7940fa3](matter-labs@7940fa3))
* **en:** Unify snapshot recovery and recovery from L1
([matter-labs#2256](matter-labs#2256))
([e03a929](matter-labs@e03a929))
* **eth-sender:** Add transient ethereum gateway errors metric
([matter-labs#2323](matter-labs#2323))
([287958d](matter-labs@287958d))
* **eth-sender:** handle transactions for different operators separately
to increase throughtput
([matter-labs#2341](matter-labs#2341))
([0619ecc](matter-labs@0619ecc))
* **eth-sender:** separate gas calculations for blobs transactions
([matter-labs#2247](matter-labs#2247))
([627aab9](matter-labs@627aab9))
* **gas_adjuster:** Use eth_feeHistory for both base fee and blobs
([matter-labs#2322](matter-labs#2322))
([9985c26](matter-labs@9985c26))
* L1 batch QC database (BFT-476)
([matter-labs#2340](matter-labs#2340))
([5886b8d](matter-labs@5886b8d))
* **metadata-calculator:** option to use VM runner for protective reads
([matter-labs#2318](matter-labs#2318))
([c147b0c](matter-labs@c147b0c))
* Minimal External API Fetcher
([matter-labs#2383](matter-labs#2383))
([9f255c0](matter-labs@9f255c0))
* **node_framework:** Document implementations
([matter-labs#2319](matter-labs#2319))
([7b3877f](matter-labs@7b3877f))
* **node_framework:** Implement FromContext and IntoContext derive macro
([matter-labs#2330](matter-labs#2330))
([34f2a45](matter-labs@34f2a45))
* **node_framework:** Support shutdown hooks + more
([matter-labs#2293](matter-labs#2293))
([2b2c790](matter-labs@2b2c790))
* **node_framework:** Unify Task types + misc improvements
([matter-labs#2325](matter-labs#2325))
([298a97e](matter-labs@298a97e))
* **node-framework:** New wiring interface
([matter-labs#2384](matter-labs#2384))
([f2f4056](matter-labs@f2f4056))
* **prover:** Add prometheus port to witness generator config
([matter-labs#2385](matter-labs#2385))
([d0e1add](matter-labs@d0e1add))
* **prover:** Add prover_cli stats command
([matter-labs#2362](matter-labs#2362))
([fe65319](matter-labs@fe65319))
* **snapshots_applier:** Add a method to check whether snapshot recovery
is done ([matter-labs#2338](matter-labs#2338))
([610a7cf](matter-labs@610a7cf))
* Switch to using crates.io deps
([matter-labs#2409](matter-labs#2409))
([27fabaf](matter-labs@27fabaf))
* **tee:** add Prometheus metrics to the TEE Prover
([matter-labs#2386](matter-labs#2386))
([6153e99](matter-labs@6153e99))
* **tee:** TEE Prover Gateway
([matter-labs#2333](matter-labs#2333))
([f8df34d](matter-labs@f8df34d))
* Unify and port node storage initialization
([matter-labs#2363](matter-labs#2363))
([8ea9791](matter-labs@8ea9791))
* Validium with DA
([matter-labs#2010](matter-labs#2010))
([fe03d0e](matter-labs@fe03d0e))
* **vm-runner:** make vm runner report time taken
([matter-labs#2369](matter-labs#2369))
([275a333](matter-labs@275a333))
* **zk toolbox:** External node support
([matter-labs#2287](matter-labs#2287))
([6384cad](matter-labs@6384cad))
* **zk_toolbox:** Add prover init command
([matter-labs#2298](matter-labs#2298))
([159af3c](matter-labs@159af3c))


### Bug Fixes

* **api:** fix log timestamp format
([matter-labs#2407](matter-labs#2407))
([e9d63db](matter-labs@e9d63db))
* BWIP race condition
([matter-labs#2405](matter-labs#2405))
([8099ab0](matter-labs@8099ab0))
* **config:** Implement proper tests
([matter-labs#2381](matter-labs#2381))
([2ec494b](matter-labs@2ec494b))
* **db:** Fix / extend transaction isolation levels
([matter-labs#2350](matter-labs#2350))
([404ceb9](matter-labs@404ceb9))
* **en:** Fix panics when queuing sync actions during shutdown
([d5935c7](matter-labs@d5935c7))
* **erc20-test:** only approving baseToken allowance when needed
([matter-labs#2379](matter-labs#2379))
([087a3c4](matter-labs@087a3c4))
* **eth-sender:** confirm eth-txs in order of their creation
([matter-labs#2310](matter-labs#2310))
([31a1a04](matter-labs@31a1a04))
* **eth-sender:** fix query returning inflight txs
([matter-labs#2404](matter-labs#2404))
([6a89ca0](matter-labs@6a89ca0))
* **eth-sender:** missing fix in second query calculating txs unsent txs
([matter-labs#2406](matter-labs#2406))
([948b532](matter-labs@948b532))
* **eth-sender:** revert commit changing which type of txs we resend
first ([matter-labs#2327](matter-labs#2327))
([ef75292](matter-labs@ef75292))
* Fix rustls setup for jsonrpsee clients
([matter-labs#2417](matter-labs#2417))
([a040f09](matter-labs@a040f09))
* **merkle-tree:** Change `LazyAsyncTreeReader::wait()` signature
([matter-labs#2314](matter-labs#2314))
([408393c](matter-labs@408393c))
* **merkle-tree:** Fix chunk recovery reporting during tree recovery
([matter-labs#2348](matter-labs#2348))
([70b3a8a](matter-labs@70b3a8a))
* **merkle-tree:** Fix connection timeouts during tree pruning
([matter-labs#2372](matter-labs#2372))
([d5935c7](matter-labs@d5935c7))
* **object-store:** Consider some token source errors transient
([matter-labs#2331](matter-labs#2331))
([85386d3](matter-labs@85386d3))
* **tee:** Introduce a 1 second delay in the batch poll
([matter-labs#2398](matter-labs#2398))
([312defe](matter-labs@312defe))
* **vm-runner:** change `processing_started_at` column type to
`timestamp`
([matter-labs#2397](matter-labs#2397))
([4221155](matter-labs@4221155))


### Reverts

* "refactor: Rename consensus tasks and split storage (BFT-476)"
([matter-labs#2364](matter-labs#2364))
([e67ec5d](matter-labs@e67ec5d))

---
This PR was generated with [Release
Please](https://github.com/googleapis/release-please). See
[documentation](https://github.com/googleapis/release-please#release-please).

---------

Co-authored-by: Lech <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants