Skip to content
This repository has been archived by the owner on Jan 15, 2024. It is now read-only.

Check if storage is working when returning the response to a faucet healthcheck request. #1189

Merged
merged 2 commits into from
Jun 30, 2022

Conversation

philippecamacho
Copy link
Collaborator

Closes #1186.

I am not sure if the storage requests are neutral.
Also I have an error when I run the tests locally. Something similar to the error described in #1179.

@CLAassistant
Copy link

CLAassistant commented Jun 29, 2022

CLA assistant check
All committers have signed the CLA.

@sveitser
Copy link
Contributor

sveitser commented Jun 30, 2022

I think this should do the job.

If only the remove steps fails (unlikely) then subsequent inserts would not actually write to disk anymore because if the key is already in the index it doesn't insert it again until it has received the grants and is removed. This could be avoided by using a random key instead.

    /// Add an element to the persistent index.
    ///
    /// Returns `true` if the element was inserted or `false` if it was already in the index.
    fn insert(&mut self, key: UserPubKey) -> Result<bool, FaucetError> {
        if self.index.contains_key(&key) {
            // If the key is already in the index, we don't have to persist anything.
            return Ok(false);
        }
...

Another concern would be the faucet becoming unresponsive if the /healthcheck endpoint gets hit often because of the lock. This could be avoided by checking if we can write to disk in some other way (for example by creating a random file). The good thing about the current implementation is however that it exercises a lot more of the machinery we use in the faucet.

Overall I think it may reduce problems or alert us earlier so I'd be happy to try like this.

I'd be curious to get @jbearer's opinion though.

@jbearer
Copy link
Member

jbearer commented Jun 30, 2022

I don't think lock contention is a big problem. The time this healthcheck endpoint spends with the lock is nothing compared to how long it takes to build a transaction. Plus the healthcheck is only called every few seconds, I believe.

If only the remove step fails, then we will actually end up sending a grant to the default address, which wastes a small amount of funds and time, but it's not too bad given that this should be extremely rare. And when we send the grant to the default address, we will remove it from the queue, so subsequent healthchecks will start doing the correct thing again.

Only suggestion is I think we should log something at ERROR level if either of these writes fails, so that if the healthcheck fails we will know why, and especially if only one of the writes fails, we will know it happened.

@philippecamacho philippecamacho force-pushed the feat/t1186-try-storage-healthcheck-faucet branch from 83f952f to 5b62743 Compare June 30, 2022 16:37
@philippecamacho
Copy link
Collaborator Author

Only suggestion is I think we should log something at ERROR level if either of these writes fails, so that if the healthcheck fails we will know why, and especially if only one of the writes fails, we will know it happened.

Right, done in 5b62743.

Copy link
Contributor

@sveitser sveitser left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@philippecamacho philippecamacho merged commit 8971a97 into main Jun 30, 2022
@philippecamacho philippecamacho deleted the feat/t1186-try-storage-healthcheck-faucet branch June 30, 2022 20:21
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Faucet: Check if storage is working when doing healthcheck
4 participants