Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: improve TopicClient connection resiliency #1400

Merged
merged 2 commits into from
Aug 12, 2024

Conversation

cprice404
Copy link
Contributor

At some point in time a refactor seems to have made the TopicClient less resilient to network interruptions. It may have happened when we made changes to disable keepalives by default (due to lambda issues), or during some refactor work to add topics support to the web SDK.

This commit does the following:

  • Add configuration options for grpc keepalives and enable them in the default topicclient config
  • Adds a good deal of trace-level logging to help debug what is happening when subscriptions are interrupted
  • Re-work the error-handling and "end of stream" logic a bit to improve connection resiliency.

I tested this on my laptop by disabling my wifi after the subscription was created. Prior to these changes, the network issues were not detected and the connection was left in a broken state. After these changes, the issues are detected and the connection is re-established successfully.

@cprice404 cprice404 requested review from a team, malandis and pgautier404 August 3, 2024 01:36
@cprice404 cprice404 force-pushed the experimenting-with-topic-reconnection branch from f89046f to 9bdbcb1 Compare August 7, 2024 20:03
Copy link
Contributor

@anitarua anitarua left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tested locally as well, reconnects after wifi off/on, but does not stay active after an idle period of 5min, which is what we aimed for with dart and swift sdks

@@ -32,6 +32,11 @@ export class Default extends TopicClientConfiguration {
transportStrategy: new StaticTopicTransportStrategy({
grpcConfiguration: new StaticTopicGrpcConfiguration({
numClients: 1,
// TODO: when we introduce lambda configurations, we do not want to enable keepalives, because they
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

would it be worth it to just add the lambda config now?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah good call.

Comment on lines 192 to 193
// restartedDueToError: false,
// firstMessage: true,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove?

Base automatically changed from experimenting-with-topic-reconnection to main August 12, 2024 18:44
At some point in time a refactor seems to have made the TopicClient
less resilient to network interruptions. It may have happened when
we made changes to disable keepalives by default (due to lambda issues),
or during some refactor work to add topics support to the web SDK.

This commit does the following:

* Add configuration options for grpc keepalives and enable them
  in the default topicclient config
* Adds a good deal of trace-level logging to help debug what is
  happening when subscriptions are interrupted
* Re-work the error-handling and "end of stream" logic a bit to
  improve connection resiliency.

I tested this on my laptop by disabling my wifi after the subscription
was created. Prior to these changes, the network issues were not
detected and the connection was left in a broken state. After these
changes, the issues are detected and the connection is re-established
successfully.
@cprice404 cprice404 force-pushed the experimenting-with-topic-reconnection2 branch from a013d8a to 6d2907a Compare August 12, 2024 19:08
Copy link
Contributor

@malandis malandis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good and recommend deleting dead code vs commenting out.

@cprice404
Copy link
Contributor Author

I tested locally as well, reconnects after wifi off/on, but does not stay active after an idle period of 5min, which is what we aimed for with dart and swift sdks

thanks for testing! what happened after 5 mins?

Copy link
Contributor

@anitarua anitarua left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm!

Tested again and it was fine after 5min of being idle. I somehow didn't see the unsubscribe line in the topic-subscribe.ts example 🤦🏻‍♀️

@cprice404 cprice404 merged commit 93eb4ef into main Aug 12, 2024
13 checks passed
@cprice404 cprice404 deleted the experimenting-with-topic-reconnection2 branch August 12, 2024 22:12
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants