Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implements auto-register for the watchtower-client #164

Merged
merged 2 commits into from
Dec 13, 2022

Conversation

sr-gi
Copy link
Member

@sr-gi sr-gi commented Nov 30, 2022

Description

The current version of the watchtower-plugin does not auto-renew subscriptions once they expire or get out of appointment slots. This may be annoying and some users may even be unaware that the data is not being back-up anymore due to the subscription needing a renewal.

This PR fixes this. The approach for the fix is to use the Retirer for it given the logic for unreachable towers and towers with a subscription issue was pretty similar. The way this works from now on is the following:

If an appointment is sent to the tower and there is a subscription error, the data is added to pending, the tower is flagged as subscription_error and the Retrier is notified (so far nothing changed). The Retirer will now try to renew the subscription if the state of the tower is SubscriptionError before sending any further data though. On failure, the behavior will be the same as before, just giving up, however, on success the pending data will be sent to the tower.

Notice that, from now on, if the Retrier gets a subscription error from the tower it won't give up until a re-register has been attempted, therefore this also covers subscription issues faced while retrying pending data.

Notice there is one case in which the Retrier will need an additional cycle to fix a subscription issue and that is if the tower is stopped in such a state. This is due to the tower state not being stored in the database but deduced based on what data is loaded from it on bootstrap. In the case of some pending appointments being found, the tower is flagged as TemporaryUnreachable and retried straightaway. This means that, under these conditions, the Retrier will:

  • Assume the tower is TempoaryUnreachable when it is really SubscriptionError
  • Attempt to send pending data to the tower
  • Receive a SubscriptionError from the tower
  • Backoff (wait for the next retry cycle)
  • Try to re-register
    • On success: Try to send the pending data again
    • On failure: Give up

Bug fixes

This PR is also fixing a client-side bug related to re-registering. The client was checking that, on re-register, the available slot count assigned to the new subscription was higher than the total count from the last subscription instead of the remaining count.

@sr-gi sr-gi added feature request New feature or request Seeking Code Review review me pls cln-plugin Stuff related to watchtower-plugin labels Nov 30, 2022
@sr-gi sr-gi force-pushed the 69-auto-renew-v2 branch 4 times, most recently from 3b33da8 to ab44760 Compare December 5, 2022 09:18
@mariocynicys mariocynicys self-assigned this Dec 5, 2022
@sr-gi sr-gi marked this pull request as ready for review December 5, 2022 09:28
@sr-gi sr-gi requested a review from mariocynicys December 5, 2022 09:29
@sr-gi
Copy link
Member Author

sr-gi commented Dec 5, 2022

The necessity of an additional backoff cycle to address subscription issues when restarting can be addressed by implementing cln rpc logic so the plugin can be aware of the current chain tip. Once that data is available the client will be able to do client-side subscription expiry validation, so we'll be able to properly flag towers without probing them.

I've already left a TODO comment on the code regarding this.

@sr-gi
Copy link
Member Author

sr-gi commented Dec 6, 2022

Also, something that came to mind when implementing this: maybe we should put this feature under a conf flag so it can be disabled (specially for testing). If so, I think it should be enabled by default.

@sr-gi sr-gi mentioned this pull request Dec 7, 2022
@sr-gi sr-gi force-pushed the 69-auto-renew-v2 branch 4 times, most recently from 0c271c9 to c0c49c6 Compare December 11, 2022 18:29
Copy link
Collaborator

@mariocynicys mariocynicys left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM,
but one thing that needs to be updated tho is RetrierStatus variants' docs. They regard only temporary unreachable and unreachable states, but since retriers can now work on subscription error-ed towers, this should be listed there as well.

state.set_tower_status(tower_id, TowerStatus::TemporaryUnreachable);
}
let e = to_cln_error(e);
log::info!("{}", e);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This line is redundant since to_cln_error already logs it.

Copy link
Member Author

@sr-gi sr-gi Dec 12, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, that comes from a rebase :(

Comment on lines 337 to 340
assert!(matches!(
wt_client.add_update_tower(tower_id, &updated_tower_info.net_addr, &receipt_same_slots),
Err(SubscriptionError::Slots)
));
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This assertion is repeated. Similar one is in the line above.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same, comes from a rebase, will fix it

@mariocynicys
Copy link
Collaborator

The necessity of an additional backoff cycle to address subscription issues when restarting can be addressed by implementing cln rpc logic so the plugin can be aware of the current chain tip. Once that data is available the client will be able to do client-side subscription expiry validation, so we'll be able to properly flag towers without probing them.

@sr-gi The additional backoff cycle need not to be handled, right? it's done by the backoff strategy already?

@sr-gi
Copy link
Member Author

sr-gi commented Dec 12, 2022

The necessity of an additional backoff cycle to address subscription issues when restarting can be addressed by implementing cln rpc logic so the plugin can be aware of the current chain tip. Once that data is available the client will be able to do client-side subscription expiry validation, so we'll be able to properly flag towers without probing them.

@sr-gi The additional backoff cycle need not to be handled, right? it's done by the backoff strategy already?

Currently it is handled by the backoff strategy, but it properly implemented it won't require a cycle that will always fail. We need client-side subscription validation for that, which requires querying bitcoind via lightningd's rpc.

This helps reducing some of the WTClient is queried only to get the tower status
@sr-gi
Copy link
Member Author

sr-gi commented Dec 12, 2022

@mariocynicys issues should have been fixed

Copy link
Collaborator

@mariocynicys mariocynicys left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@sr-gi sr-gi merged commit ae91cd5 into talaia-labs:master Dec 13, 2022
@sr-gi sr-gi removed the Seeking Code Review review me pls label Jan 10, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cln-plugin Stuff related to watchtower-plugin feature request New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants