docs: add notes about metrics + error logging (#747)
Showing 1 changed file with 96 additions and 0 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,96 @@ | ||
# Metrics And Logging | ||
|
||
> :warning: NOTE: this document serves as a starting point for debugging and does not provide an exhaustive/definitive answer | ||
The relay exports metrics and chain-specific errors. This document identifies common metrics/logs and potential reasons for behavior. | ||
|
||
## Error Logging | ||
|
||
[`failed to enqeue tx for simulation`](https://github.com/smartcontractkit/chainlink-solana/blob/a2ff2b377b72d06dc85b5242d93bb2f974967145/pkg/solana/txm/txm.go#L129) | ||
|
||
* indicates slow RPCs that are not responding quickly enough | ||
|
||
[`original signature does not match retry signature`](https://github.com/smartcontractkit/chainlink-solana/blob/a2ff2b377b72d06dc85b5242d93bb2f974967145/pkg/solana/txm/txm.go#L301) | ||
|
||
* this could indicate a race condition within the relayer code (please alert developers for investigation) | ||
|
||
[`failed to find transaction within confirm timeout`](https://github.com/smartcontractkit/chainlink-solana/blob/a2ff2b377b72d06dc85b5242d93bb2f974967145/pkg/solana/txm/txm.go#L372) | ||
|
||
* indicates network congestion or poor RPC performance (tx dropped) | ||
|
||
[`simulate: unrecognized error`](https://github.com/smartcontractkit/chainlink-solana/blob/a2ff2b377b72d06dc85b5242d93bb2f974967145/pkg/solana/txm/txm.go#L494) | ||
|
||
* There is usually additional output in the result parameter of the error: | ||
  * `InsufficientFundsForRent`: sender balance too low | ||
  * `AccountNotFound`: sender or used account does not exist (if it previously existed, it could have been garbage collected) | ||
  * Additional errors + reasons can be found here: https://github.com/solana-labs/solana/blob/master/sdk/src/transaction/error.rs | ||
|
||
[`failed to enqeue tx`](https://github.com/smartcontractkit/chainlink-solana/blob/a2ff2b377b72d06dc85b5242d93bb2f974967145/pkg/solana/txm/txm.go#L528) | ||
|
||
* indicates a slow RPC that does not respond quickly enough to keep up with the incoming stream of transactions | ||
|
||
[`error in ReadAnswer: stale answer data, polling is likely experiencing errors`](https://github.com/smartcontractkit/chainlink-solana/blob/a2ff2b377b72d06dc85b5242d93bb2f974967145/pkg/solana/transmissions_cache.go#L110C21-L110C98) | ||
|
||
* indicates RPC issues (most likely down) | ||
|
||
[`error in ReadState: stale state data, polling is likely experiencing errors`](https://github.com/smartcontractkit/chainlink-solana/blob/a2ff2b377b72d06dc85b5242d93bb2f974967145/pkg/solana/state_cache.go#L114C21-L114C96) | ||
|
||
* indicates RPC issues (most likely down) | ||
|
||
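The log messages above can be collected into a small lookup for first-pass triage. The following is a minimal sketch, not part of the relay: the `triage` function and the `KNOWN_ERRORS` table are hypothetical names, and the hint strings only paraphrase this document. The substrings are copied verbatim from the log messages, including the `enqeue` spelling.

```python
from typing import Optional

# Hypothetical triage table: keys are substrings of the log messages listed
# in this document; values paraphrase the likely causes given here.
KNOWN_ERRORS = {
    "failed to enqeue tx for simulation": "slow RPC (simulation requests not answered quickly enough)",
    "original signature does not match retry signature": "possible race condition in the relayer; alert developers",
    "failed to find transaction within confirm timeout": "network congestion or poor RPC performance (tx dropped)",
    "simulate: unrecognized error": "inspect the result parameter of the error for the underlying cause",
    "failed to enqeue tx": "slow RPC (cannot keep up with incoming transactions)",
    "stale answer data": "RPC issues (most likely down)",
    "stale state data": "RPC issues (most likely down)",
}


def triage(log_line: str) -> Optional[str]:
    """Return a likely cause for a log line, or None if it is not recognized."""
    # Check longer patterns first so that "failed to enqeue tx for simulation"
    # is not shadowed by the shorter "failed to enqeue tx".
    for pattern in sorted(KNOWN_ERRORS, key=len, reverse=True):
        if pattern in log_line:
            return KNOWN_ERRORS[pattern]
    return None
```

A lookup like this is only a starting point; an unmatched line still needs manual inspection.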
## Metrics | ||
|
||
[`solana_balance`](https://github.com/smartcontractkit/chainlink-solana/blob/4ca9bcc8264d89c7527897e729281e13f37852f1/pkg/solana/monitor/prom.go#L14) | ||
|
||
* provides the SOL balance for keys in the keystore | ||
* a low SOL balance will cause the CL node to stop transmitting | ||
|
||
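As an illustration, a balance check can be run against the text a Prometheus endpoint exposes. This is a rough sketch under several assumptions: the parser handles only simple sample lines, the `account` label in the usage example is hypothetical (the real label names are not given here), and the threshold must be chosen in whatever unit the metric actually reports.

```python
import re

# Matches simple Prometheus text-format sample lines: name{labels} value
SAMPLE_RE = re.compile(
    r'^(?P<name>[a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(?P<labels>.*)\})?\s+(?P<value>\S+)'
)
LABEL_RE = re.compile(r'(\w+)="((?:[^"\\]|\\.)*)"')


def parse_samples(text, metric):
    """Return (labels, value) pairs for every sample of `metric` in `text`."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        m = SAMPLE_RE.match(line)
        if m is None or m.group("name") != metric:
            continue
        labels = dict(LABEL_RE.findall(m.group("labels") or ""))
        samples.append((labels, float(m.group("value"))))
    return samples


def low_balances(text, threshold):
    """Return the label sets of solana_balance samples below `threshold`."""
    return [
        labels
        for labels, value in parse_samples(text, "solana_balance")
        if value < threshold
    ]
```

For example, `low_balances(scraped_text, threshold=1.0)` would list the keys that need funding, where `scraped_text` and the threshold are placeholders to adapt to the deployment.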
[`solana_cache_last_update_unix`](https://github.com/smartcontractkit/chainlink-solana/blob/4ca9bcc8264d89c7527897e729281e13f37852f1/pkg/solana/monitor/prom.go#L18) | ||
|
||
* tracks last update to cached data (unix timestamp) | ||
* updates should occur at the configured rate (default: 1s); slower updates can indicate RPC latency issues | ||
|
||
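Because the metric is a unix timestamp, staleness is the difference between the current time and the sampled value. A minimal sketch follows; the function names and the `tolerance` multiplier are assumptions chosen for illustration, not values taken from the relay.

```python
import time


def cache_staleness_seconds(last_update_unix, now=None):
    """Seconds elapsed since the cache was last updated (never negative)."""
    if now is None:
        now = time.time()
    return max(0.0, now - last_update_unix)


def is_stale(last_update_unix, poll_interval_s=1.0, tolerance=5.0, now=None):
    """Flag the cache as stale once it lags by more than `tolerance` polls.

    poll_interval_s defaults to the 1s rate mentioned above; tolerance is an
    arbitrary illustrative multiplier.
    """
    lag = cache_staleness_seconds(last_update_unix, now)
    return lag > poll_interval_s * tolerance
```

Passing `now` explicitly keeps the check deterministic, which helps when replaying recorded samples.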
[`solana_client_latency_ms`](https://github.com/smartcontractkit/chainlink-solana/blob/4ca9bcc8264d89c7527897e729281e13f37852f1/pkg/solana/monitor/prom.go#L23) | ||
|
||
* tracks the duration of each RPC request, broken down by label and RPC URL | ||
* spikes in latency can indicate RPC issues | ||
|
||
[`solana_txm_tx_success`](https://github.com/smartcontractkit/chainlink-solana/blob/4ca9bcc8264d89c7527897e729281e13f37852f1/pkg/solana/txm/prom.go#L10) | ||
|
||
* total of TXs that are confirmed and successfully executed on chain | ||
* this value should consistently increase. If it does not, this could indicate RPC latency or funding issues. | ||
|
||
[`solana_txm_tx_pending`](https://github.com/smartcontractkit/chainlink-solana/blob/4ca9bcc8264d89c7527897e729281e13f37852f1/pkg/solana/txm/prom.go#L16) | ||
|
||
* number of TXs currently in flight (neither confirmed as successful nor errored) | ||
* this value should stay mostly constant - spikes could indicate lagging performance due to slow RPCs. | ||
|
||
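The two expectations above (a steadily growing success counter and a roughly flat pending gauge) can be checked over a window of scraped samples. The sketch below is one possible approach: the helper names and the spike `factor` are hypothetical, and a drop in the counter is treated as a restart from zero, similar in spirit to how Prometheus handles counter resets.

```python
import statistics


def counter_increase(samples):
    """Total increase of a counter over chronological samples.

    A drop between two samples is treated as a process restart, so the new
    value counts as growth from zero.
    """
    total = 0.0
    for prev, cur in zip(samples, samples[1:]):
        total += cur - prev if cur >= prev else cur
    return total


def pending_spike(samples, factor=3.0):
    """True if the latest pending value exceeds `factor` times the baseline.

    The baseline is the median of the earlier samples, floored at 1 so that
    an idle node does not trigger on a single pending TX.
    """
    if len(samples) < 2:
        return False
    baseline = max(statistics.median(samples[:-1]), 1.0)
    return samples[-1] > baseline * factor
```

A `counter_increase` of zero over a window in which transmissions were expected corresponds to the stalled `solana_txm_tx_success` case described above.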
[`solana_txm_tx_error`](https://github.com/smartcontractkit/chainlink-solana/blob/4ca9bcc8264d89c7527897e729281e13f37852f1/pkg/solana/txm/prom.go#L22) | ||
|
||
* sum of TXs that have errored for any reason | ||
* depending on the network configuration, this value should either be constant or increase | ||
|
||
[`solana_txm_tx_error_revert`](https://github.com/smartcontractkit/chainlink-solana/blob/4ca9bcc8264d89c7527897e729281e13f37852f1/pkg/solana/txm/prom.go#L26) | ||
|
||
* total of TXs that have been confirmed but failed with a revert | ||
* depending on the network configuration, this value should either be constant or increase | ||
|
||
[`solana_txm_tx_error_reject`](https://github.com/smartcontractkit/chainlink-solana/blob/4ca9bcc8264d89c7527897e729281e13f37852f1/pkg/solana/txm/prom.go#L30) | ||
|
||
* total of TXs that have been immediately rejected by the RPC | ||
* value should be near zero; TXs should not be immediately rejected by the RPC. This could indicate a faulty RPC or | ||
|
||
[`solana_txm_tx_error_drop`](https://github.com/smartcontractkit/chainlink-solana/blob/4ca9bcc8264d89c7527897e729281e13f37852f1/pkg/solana/txm/prom.go#L34) | ||
|
||
* total of TXs that have been broadcast to the network but were not confirmed within the configured timeout | ||
* an increasing value can indicate RPC latency issues or network congestion | ||
|
||
[`solana_txm_tx_error_sim_revert`](https://github.com/smartcontractkit/chainlink-solana/blob/4ca9bcc8264d89c7527897e729281e13f37852f1/pkg/solana/txm/prom.go#L38) | ||
|
||
* total of TXs that reverted during simulation | ||
* value should be low and should not increase rapidly; a rapid increase may indicate misconfiguration on the CL node or on chain | ||
|
||
[`solana_txm_tx_error_sim_other`](https://github.com/smartcontractkit/chainlink-solana/blob/4ca9bcc8264d89c7527897e729281e13f37852f1/pkg/solana/txm/prom.go#L38) | ||
|
||
* total of TXs that failed during simulation with an unrecognized error | ||
* value should be low and should not increase rapidly; if it does, look through the logs for the unrecognized error and diagnose further from there | ||
|
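When `solana_txm_tx_error` is rising, comparing the per-category counters over the same window shows which failure mode dominates. The following is a sketch under assumptions: `error_breakdown` and `dominant_error` are hypothetical helpers, and the input is assumed to be per-window increases already computed from scraped samples.

```python
# Error sub-counters named in this document.
ERROR_COUNTERS = [
    "solana_txm_tx_error_revert",
    "solana_txm_tx_error_reject",
    "solana_txm_tx_error_drop",
    "solana_txm_tx_error_sim_revert",
    "solana_txm_tx_error_sim_other",
]


def error_breakdown(deltas):
    """Share of each error category, given per-window increases by name."""
    total = sum(deltas.get(name, 0) for name in ERROR_COUNTERS)
    if total == 0:
        return {}
    return {
        name: deltas[name] / total
        for name in ERROR_COUNTERS
        if deltas.get(name, 0) > 0
    }


def dominant_error(deltas):
    """Name of the category with the largest share, or None if no errors."""
    shares = error_breakdown(deltas)
    return max(shares, key=shares.get) if shares else None
```

For instance, a window dominated by `solana_txm_tx_error_drop` points toward the RPC latency or network congestion causes listed for that metric.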