Add health check #2286

Merged
merged 1 commit into from
Dec 1, 2020

Conversation

hasufell
Contributor

@hasufell hasufell commented Oct 30, 2020

Issue Number

ADP-496

Details

  • Verify url and SMASH health on app start
  • Verify url and SMASH health on settings change
  • Expose SMASH health check proxy endpoints

@hasufell hasufell added the IMPROVEMENT Mark a PR as an improvement, for auto-generated CHANGELOG label Oct 30, 2020
@hasufell hasufell requested a review from KtorZ October 30, 2020 16:56
@hasufell hasufell self-assigned this Oct 30, 2020
@@ -351,6 +360,22 @@ serveWallet
$ \db@DBLayer{..} -> do

forM_ settings $ atomically . putSettings
dbSettings <- atomically readSettings
-- initially verify smash server
case poolMetadataSource dbSettings of
Contributor Author

This is more or less on application start. We crash the app if the server isn't reachable.

@hasufell hasufell force-pushed the jospald/ADP-496/verify-smash-urls branch from d7944b3 to 2e3473a Compare October 30, 2020 17:10
Contributor

@rvl rvl left a comment

It's not clear from the PR description what this health check is supposed to do. After all the discussion on that Jira ticket, what did you end up deciding?

Can I suggest that a more user-friendly approach would be to not crash when the status check fails? Perhaps log it at warning level and carry on. The SMASH server may be intermittently unreachable, but that's no reason to prevent the wallet from starting.

If Daedalus want to prevent users from setting an incorrect SMASH URL, and go back to the previous value if the new one is wrong, then perhaps provide an endpoint where they can do the status check before changing the setting.

For example, a new endpoint `GET /v2/network/information/smash?url=https://smash.awstest.iohkdev.io`. (If the `url` query param is missing, then it health-checks the currently configured SMASH server.)
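
The fallback rule suggested for this endpoint (use the `url` query parameter when present, otherwise the configured server) could be sketched roughly as follows. This is only an illustration: `SmashUrl`, `FetchNone` and `healthCheckTarget` are hypothetical names, and only `FetchSMASH` and `poolMetadataSource` appear in the diff under review.

```haskell
-- Hypothetical names for illustration; not the wallet's actual API.
newtype SmashUrl = SmashUrl String
    deriving (Eq, Show)

-- | Where pool metadata comes from. 'FetchNone' is an assumed constructor
-- standing for "no SMASH server configured".
data PoolMetadataSource
    = FetchNone
    | FetchSMASH SmashUrl
    deriving (Eq, Show)

-- | Pick the server to health check: the @url@ query parameter when it is
-- present, otherwise the configured SMASH server, if there is one.
healthCheckTarget :: Maybe SmashUrl -> PoolMetadataSource -> Maybe SmashUrl
healthCheckTarget (Just url) _ = Just url
healthCheckTarget Nothing (FetchSMASH url) = Just url
healthCheckTarget Nothing FetchNone = Nothing
```

A `Nothing` result means there is nothing to check, which the endpoint would have to report as its own case rather than as a failure.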

@hasufell
Contributor Author

@rvl What about just requiring Daedalus to do the health check on their end? It isn't complicated and they have more ways to react to incorrect input than we have. I think it's indeed better that the wallet just logs this.
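
A minimal sketch of "log and carry on" at startup, assuming the probe is passed in as a plain `IO ()` action so that no particular HTTP client is implied. `StartupCheck`, `probeAtStartup` and `warnAndContinue` are hypothetical names, and writing to stderr stands in for the wallet's real tracing.

```haskell
import Control.Exception (SomeException, try)
import System.IO (hPutStrLn, stderr)

-- | Outcome of probing the SMASH server at startup (hypothetical type).
data StartupCheck = Reachable | NotReachable String
    deriving (Eq, Show)

-- | Run a probe and turn any exception into a value instead of letting it
-- abort startup. Catching 'SomeException' is a simplification for this
-- sketch; real code would likely catch only the HTTP client's exceptions.
probeAtStartup :: IO () -> IO StartupCheck
probeAtStartup probe = do
    result <- try probe :: IO (Either SomeException ())
    pure $ case result of
        Right () -> Reachable
        Left e -> NotReachable (show e)

-- | Log a warning when the server is not reachable; never throw.
warnAndContinue :: StartupCheck -> IO ()
warnAndContinue Reachable = pure ()
warnAndContinue (NotReachable why) =
    hPutStrLn stderr ("Warning: SMASH server not reachable: " <> why)
```

With this shape the wallet still starts when the server is down, and a client such as Daedalus can decide separately how to react.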

@rvl
Contributor

rvl commented Nov 11, 2020

It's possible. All Daedalus would need to know is whether a given smash url works - yes/no.
They may prefer better separation of concerns -- or may not care.
We can also basically proxy the smash health check, which is easy enough.

@hasufell hasufell force-pushed the jospald/ADP-496/verify-smash-urls branch from 2e3473a to 5db9499 Compare November 11, 2020 11:27
@hasufell
Contributor Author

Should be all set

@hasufell hasufell force-pushed the jospald/ADP-496/verify-smash-urls branch from cd9c195 to 4660e20 Compare November 12, 2020 11:25
lib/core/src/Cardano/Wallet/Api/Types.hs (outdated, resolved)
lib/core/src/Cardano/Wallet/Api/Types.hs (outdated, resolved)
lib/shelley/src/Cardano/Wallet/Shelley.hs (outdated, resolved)
lib/shelley/src/Cardano/Wallet/Shelley/Api/Server.hs (outdated, resolved)
lib/shelley/src/Cardano/Wallet/Shelley/Pools.hs (outdated, resolved)
specifications/api/swagger.yaml (outdated, resolved)
@hasufell hasufell force-pushed the jospald/ADP-496/verify-smash-urls branch from 39b3cf6 to a4a33df Compare November 13, 2020 11:54
@hasufell
Contributor Author

[image: smash_health]

rvl
rvl previously requested changes Nov 13, 2020
lib/core/src/Cardano/Pool/Metadata.hs (outdated, resolved)
lib/core/src/Cardano/Pool/Metadata.hs (resolved)
lib/core/src/Cardano/Pool/Metadata.hs (outdated, resolved)
manager
except . eitherDecodeStrict @HealthStatusSMASH $ pl
where
runExceptTLog
Contributor

Here's a tip.
You do not need trace message constructors for both Success and Failure.
Merge the log message constructors to MsgFetchHealthCheckResult (Either FailureType HealthCheckSuccessType).
All your logging code will become simpler like that.

Contributor Author

Failure here is HTTP failure, success is just "we got an answer" and doesn't mean the server is healthy. I'd like to keep those separate, because they currently don't have the same severity either.
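
A minimal sketch of that distinction, under stated assumptions: all type names below are hypothetical, and treating the status string "OK" as the healthy answer is an assumed convention rather than something confirmed in this thread.

```haskell
-- Hypothetical types for illustration; not the wallet's actual definitions.

-- | Body of a health response, reduced to a single status field.
newtype HealthStatus = HealthStatus { status :: String }
    deriving (Eq, Show)

-- | Transport-level outcome: the request failed, or we got an answer.
data FetchResult
    = HttpFailure String
    | GotAnswer HealthStatus
    deriving (Eq, Show)

-- | What a caller ultimately wants to know about the server.
data Health = Available | Unavailable | Unreachable
    deriving (Eq, Show)

-- | Getting an answer is not the same as being healthy: only an answer
-- whose status is "OK" (an assumed convention) counts as available.
classify :: FetchResult -> Health
classify (HttpFailure _) = Unreachable
classify (GotAnswer answer)
    | status answer == "OK" = Available
    | otherwise = Unavailable
```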

Contributor Author

But you are right in the sense that the response object shouldn't have information that diverges from the HTTP status code it returns.

Contributor

Whatever "result" type you choose, it can be pattern matched in the getSeverityAnnotation definition.
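
As a rough sketch of the suggestion in this thread: one log constructor carries an `Either`, and the severity is chosen by pattern matching on it, so the two outcomes can still have different severities. `Severity`, `FetchFailure`, `HealthAnswer`, `FetchLog` and `severityOf` are hypothetical stand-ins; in particular, `severityOf` only imitates what a `getSeverityAnnotation` definition might do and does not use the real logging library.

```haskell
-- Hypothetical stand-ins for illustration; the real code would use the
-- project's own severity type and trace message type.
data Severity = Debug | Info | Warning | Error
    deriving (Eq, Show)

newtype FetchFailure = FetchFailure String
    deriving (Eq, Show)

newtype HealthAnswer = HealthAnswer String
    deriving (Eq, Show)

-- | One constructor for the result instead of separate success and
-- failure constructors.
data FetchLog
    = MsgFetchHealthCheck String
    | MsgFetchHealthCheckResult (Either FetchFailure HealthAnswer)
    deriving (Eq, Show)

-- | Severity is decided by matching on the 'Either', so failure and
-- success keep different levels despite sharing a constructor.
severityOf :: FetchLog -> Severity
severityOf msg = case msg of
    MsgFetchHealthCheck _ -> Debug
    MsgFetchHealthCheckResult (Left _) -> Warning
    MsgFetchHealthCheckResult (Right _) -> Info
```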

lib/core/src/Cardano/Pool/Metadata.hs (outdated, resolved)
health <- case poolMetadataSource settings of
    FetchSMASH uri -> do
        let checkHealth _ = do
                r <- healthCheck (Just trFetch) (unSmashServer uri) manager
Contributor

Is it really necessary to run a health check before fetching data?
If the server is healthy it will return data. If it's not healthy it will return some kind of error from the fetch.

Contributor Author

@hasufell hasufell Nov 13, 2020

> Is it really necessary to run a health check before fetching data?

Yes, otherwise you're fetching data from an unknown server that may respond with arbitrary data. SMASH is configurable and we don't control where it points.

That's the whole point of this ticket: https://jira.iohk.io/browse/ADP-496

Contributor

Actually the ticket description says "Daedalus has no way to know whether such a [SMASH] url points to a valid server."
So this change should just provide a way for Daedalus to check that a SMASH URL is correct before applying the setting.

There is no real need to call the health check API before fetching data. Why does everything become more complicated when a health check function is added?

Contributor Author

> So this change should just provide a way for Daedalus to check that a SMASH URL is correct before applying the setting.

The ticket says that we want to know whether it points to a valid server, not whether the URL is valid. That's not the same thing. We are dealing with user input here. We don't want to start fetching stuff randomly and then show errors. We want to know ahead of time whether a URL points to an actual SMASH server and whether that server is ready to accept requests. If it isn't, Daedalus can reject the setting. If we just apply it and then error out when actual metadata is fetched, that's a poor user experience.
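
The "check first, then apply" flow described here might look like the sketch below. It assumes the health check and the persistence step are passed in as functions, so no real HTTP or database code is implied, and every name in it (`Health`, `SmashUrl`, `SettingsError`, `updateSmashUrl`) is hypothetical.

```haskell
-- Hypothetical names for illustration; not the wallet's actual API.
data Health = Available | Unavailable | Unreachable
    deriving (Eq, Show)

newtype SmashUrl = SmashUrl String
    deriving (Eq, Show)

-- | Why a new setting was refused.
data SettingsError = RejectedSmashUrl SmashUrl Health
    deriving (Eq, Show)

-- | Persist a new SMASH URL only if the server behind it reports
-- 'Available'; otherwise leave the old setting untouched and say why.
updateSmashUrl
    :: Monad m
    => (SmashUrl -> m Health) -- ^ health check (assumed to be supplied)
    -> (SmashUrl -> m ())     -- ^ persist the setting (assumed)
    -> SmashUrl
    -> m (Either SettingsError ())
updateSmashUrl check save url = do
    health <- check url
    case health of
        Available -> Right <$> save url
        other -> pure (Left (RejectedSmashUrl url other))
```

Because the save action only runs in the `Available` branch, a rejected URL never replaces the previous value, which is the behaviour the client needs in order to report the problem before any metadata is fetched.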

@hasufell hasufell force-pushed the jospald/ADP-496/verify-smash-urls branch 3 times, most recently from f9ece70 to 0280ee2 Compare November 13, 2020 13:43
@hasufell hasufell force-pushed the jospald/ADP-496/verify-smash-urls branch from 4be89fb to 5737a43 Compare November 19, 2020 16:29
@hasufell hasufell force-pushed the jospald/ADP-496/verify-smash-urls branch from 5737a43 to 6072366 Compare November 27, 2020 18:20
@hasufell hasufell force-pushed the jospald/ADP-496/verify-smash-urls branch from 6072366 to b35b1b5 Compare November 30, 2020 17:01
@hasufell hasufell requested a review from KtorZ November 30, 2020 17:01
@hasufell hasufell force-pushed the jospald/ADP-496/verify-smash-urls branch from b35b1b5 to f546051 Compare November 30, 2020 17:58
lib/core/test/unit/Cardano/Wallet/Api/Malformed.hs (outdated, resolved)
@KtorZ KtorZ dismissed rvl’s stale review December 1, 2020 08:49

Points have been addressed or are now outdated.

@hasufell hasufell force-pushed the jospald/ADP-496/verify-smash-urls branch from f546051 to 3c55011 Compare December 1, 2020 11:37
@hasufell
Contributor Author

hasufell commented Dec 1, 2020

@KtorZ I rebased, maybe have a quick look again. Not the first time I messed up a rebase...

@hasufell hasufell force-pushed the jospald/ADP-496/verify-smash-urls branch from 3c55011 to 8533f1c Compare December 1, 2020 11:53
Member

@KtorZ KtorZ left a comment

bors merge

@iohk-bors
Contributor

iohk-bors bot commented Dec 1, 2020

Build succeeded:

@iohk-bors iohk-bors bot merged commit 9b5e568 into master Dec 1, 2020
@iohk-bors iohk-bors bot deleted the jospald/ADP-496/verify-smash-urls branch December 1, 2020 12:42