-
Notifications
You must be signed in to change notification settings - Fork 76
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
soroban-rpc: Check DB on health endpoint #449
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nice, this will be super helpful for clients to leverage the health endpoint.
return HealthCheckResult{Status: "healthy"} | ||
func NewHealthCheck(reader db.LedgerEntryReader) jrpc2.Handler { | ||
return handler.New(func(ctx context.Context) (HealthCheckResult, error) { | ||
if _, err := reader.GetLatestLedgerSequence(ctx); err != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sorry for the late comment but I was thinking that it might be better to declare soroban-rpc healthy if the latest ledger close time is relatively recent (e.g. within 20 seconds) because we have seen a bug where soroban-rpc is able to process the history archive snapshot and commit that to the db but then it fails to the prepare range operation on captive core. in that scenario we would have a latest ledger sequence in the db but it could be stale
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
does that threshold of 20 seconds depend on checkpoint acceleration also, suggestion sounds like a worthwhile internal enhancement if the system can determine the best threshold internally, i.e. don't need to provide externalized config parameter?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah, making the threshold configurable makes sense
What
Make sure that the DB is ready in the
getHealth()
endpointWhy
soroban-rpc
is not really ready until we reach the first checkpoint (before which the DB won't be ready, and ingestion won't be able to start)Known limitations
The
getHealth
endpoint now becomes a bit heavier