-
Notifications
You must be signed in to change notification settings - Fork 217
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add health check #2286
Add health check #2286
Conversation
@@ -351,6 +360,22 @@ serveWallet | |||
$ \db@DBLayer{..} -> do | |||
|
|||
forM_ settings $ atomically . putSettings | |||
dbSettings <- atomically readSettings | |||
-- initially verity smash server | |||
case poolMetadataSource dbSettings of |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is more or less on application start. We crash the app if the server isn't reachable.
d7944b3
to
2e3473a
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's not clear from the PR description what this health check stuff is supposed to do. From all the discussion on that jira ticket, what did you end up deciding?
Can I suggest that a more user friendly approach would be to not crash when the status check fails? Perhaps log it at warning level and carry on. The smash server may be intermittently unreachable, but that's no reason to prevent starting the wallet.
If Daedalus want to prevent users from setting an incorrect smash url, and go back to the previous value if the new one was wrong, then perhaps provide an endpoint where they can do the status check before changing the setting.
For example, a new endpoint GET /v2/network/information/smash?url=https://smash.awstest.iohkdev.io
. (If the url
query param is missing then it health checks the currently configured smash server.)
@rvl What about just requiring deadalus to do the health check on their end? It isn't complicated and they have more ways to react to incorrect input than we have. I think it's indeed better the wallet just logs this. |
It's possible. All Daedalus would need to know is whether a given smash url works - yes/no. |
2e3473a
to
5db9499
Compare
Should be all set |
cd9c195
to
4660e20
Compare
39b3cf6
to
a4a33df
Compare
manager | ||
except . eitherDecodeStrict @HealthStatusSMASH $ pl | ||
where | ||
runExceptTLog |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here's a tip.
You do not need trace message constructors for both Success and Failure.
Merge the log message constructors to MsgFetchHealthCheckResult (Either FailureType HealthCheckSuccessType)
.
All your logging code will become simpler like that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Failure here is HTTP failure, success is just "we got an answer" and doesn't mean the server is healthy. I'd like to keep those separate, because they currently don't have the same severity either.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But you are right in the sense that the response object shouldn't have information that diverges from the HTTP status code it returns.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Whatever "result" type you choose, it can be pattern matched in the getSeverityAnnotation
definition.
health <- case poolMetadataSource settings of | ||
FetchSMASH uri -> do | ||
let checkHealth _ = do | ||
r <- healthCheck (Just trFetch) (unSmashServer uri) manager |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it really necessary to run a health check before fetching data?
If the server is healthy it will return data. If it's not healthy it will return some kind of error from the fetch.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it really necessary to run a health check before fetching data?
Yes, otherwise you're fetching data from an unknown server that may respond with whatever data. SMASH is configurable and we don't control where it points to.
That's the whole point of this ticket: https://jira.iohk.io/browse/ADP-496
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually the ticket description says "Daedalus has no way to know whether such a [SMASH] url points to a valid server."
So this change should just provide a way for Daedalus to check that a SMASH URL is correct before applying the setting.
There is no real need to call the health check API before fetching data. Why does everything become more complicated when a health check function is added?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So this change should just provide a way for Daedalus to check that a SMASH URL is correct before applying the setting.
The ticket says that we want to know whether it points to a valid server, not whether the URL is valid. That's not the same thing. We are dealing with user input here. We don't want to start fetching stuff randomly and then show errors. We want to know ahead of time whether a URL points to an actual SMASH server and that server is ready to accept requests. And if it isn't, daedalus can reject the setting. If we just apply it and then error out when actual metadata is fetched, then that's poor user experience.
f9ece70
to
0280ee2
Compare
4be89fb
to
5737a43
Compare
5737a43
to
6072366
Compare
6072366
to
b35b1b5
Compare
b35b1b5
to
f546051
Compare
Points have been addressed or are now outdated.
f546051
to
3c55011
Compare
@KtorZ I rebased, maybe have a quick look again. Not the first time I messed up a rebase... |
3c55011
to
8533f1c
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
bors merge
Build succeeded: |
Issue Number
ADP-496
Details