Watch stops working after 4 days and receives duplicate events on delete pod #1239
-
Describe the bug
Because of the issue we faced earlier, where the connection was being closed due to inactivity, we put the watcher in an infinite loop so that the connection is re-established every time it is lost.
Is this expected behaviour? We keep a count of how many runs were triggered and how many completed, and receiving the MODIFIED 'Succeeded' event again throws off our count. Is there something wrong in the implementation that can be corrected so that this event is not received twice?
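For reference, here is a minimal sketch of the kind of loop described above, using the C# client's list-and-watch helpers. The namespace name, counting placeholder, and error handling are illustrative only, not the reporter's actual code, and the exact method grouping (e.g. the CoreV1 property) varies by client version.

```csharp
using System;
using k8s;
using k8s.Models;

var client = new Kubernetes(KubernetesClientConfiguration.InClusterConfig());
string resourceVersion = null;

while (true)
{
    try
    {
        // list + watch pods in one call; resourceVersion tells the server
        // where to resume from after a reconnect
        var listTask = client.CoreV1.ListNamespacedPodWithHttpMessagesAsync(
            "my-namespace", resourceVersion: resourceVersion, watch: true);

        await foreach (var (type, pod) in listTask.WatchAsync<V1Pod, V1PodList>())
        {
            resourceVersion = pod.Metadata.ResourceVersion;
            Console.WriteLine($"{type} {pod.Metadata.Name} phase={pod.Status?.Phase}");
            // run-triggered / run-completed counters would be updated here
        }
    }
    catch (Exception e)
    {
        // connection dropped (e.g. idle timeout): log and re-establish the watch
        Console.WriteLine($"watch dropped: {e.Message}");
    }
}
```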
-
1. Most likely your resourceVersion was garbage collected on the server (it became too old). That is why restarting fixed the issue: the version variable was reset.
2. Are those MODIFIED events just related to a pod state change, for example terminating?
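On the second point, one way to keep the count stable even if a MODIFIED event for an already-Succeeded pod is redelivered is to track the last phase seen per pod UID and count only the transition. This is an illustrative handler, assuming the count is keyed off the pod phase; it is not from the thread.

```csharp
using System.Collections.Concurrent;
using k8s;
using k8s.Models;

var lastPhase = new ConcurrentDictionary<string, string>();
var completed = 0;

// called from the watch loop for every event it receives
void OnPodEvent(WatchEventType type, V1Pod pod)
{
    var uid = pod.Metadata.Uid;

    if (type == WatchEventType.Deleted)
    {
        lastPhase.TryRemove(uid, out _);   // forget pods that are gone
        return;
    }

    var phase = pod.Status?.Phase ?? "Unknown";
    lastPhase.TryGetValue(uid, out var previous);

    // count only the transition into Succeeded, so a redelivered MODIFIED
    // event for an already-Succeeded pod does not double-count
    if (phase == "Succeeded" && previous != "Succeeded")
    {
        completed++;
    }

    lastPhase[uid] = phase;
}
```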
-
So will we have to catch the 410 explicitly, i.e. will it not be caught by the generic Exception block? If it is caught, it should enter the loop again, right?
-
I mean your
-
I see. Let me make that change and observe. Thanks for your reply!
-
@tg123, do you think it's better to always initialize the list with resourceVersion null, or should I set the resource version to null only if we receive a 410?
Or should I catch the exception and set the resource version to null, as below -
How would the behavior differ in these two scenarios? The one difference I understood is that if I initialize the list without a resource version and then watch on it, I will receive the ADDED event (the one normally sent when a pod is created) for all the existing pods in the namespace, regardless of their current status. Is this understanding correct? Will there be any other implication?
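For illustration, the "reset only on 410" variant might look roughly like the sketch below. The exception types are assumptions about how the expired resource version surfaces (it can arrive either as a watch ERROR event or as an error on the list call itself), and the loop body mirrors the earlier sketch rather than the actual code.

```csharp
try
{
    var listTask = client.CoreV1.ListNamespacedPodWithHttpMessagesAsync(
        "my-namespace", resourceVersion: resourceVersion, watch: true);

    await foreach (var (type, pod) in listTask.WatchAsync<V1Pod, V1PodList>())
    {
        resourceVersion = pod.Metadata.ResourceVersion;
        // handle the event...
    }
}
catch (KubernetesException e) when (e.Status?.Code == 410)
{
    // the stored resource version has expired: fall back to a fresh list+watch
    resourceVersion = null;
}
catch (Exception e)
{
    // any other failure: keep the last resource version and re-establish the watch
    Console.WriteLine($"watch dropped: {e.Message}");
}
```

Starting with resourceVersion null, by contrast, always begins from a fresh list, which is why ADDED events are replayed for every existing pod in the namespace.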
-
You can update
-
@tg123, I made the recommended update as below -
But the same thing happened again on Sunday, March 19 at 3:10 pm UTC. After the update received at 3:10 pm, the watcher stopped receiving any further events and no exception was logged. How can I figure out what went wrong? Is there a way to force this method to restart every 30 minutes?
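One server-side option, sketched below under the same assumptions as the earlier snippets, is the timeoutSeconds parameter of the list/watch call: the API server closes the watch after the given interval, so the outer loop re-establishes it on a fixed cadence instead of hanging silently. A client-side CancellationTokenSource with a 30-minute timeout would be an alternative.

```csharp
while (true)
{
    try
    {
        var listTask = client.CoreV1.ListNamespacedPodWithHttpMessagesAsync(
            "my-namespace",
            resourceVersion: resourceVersion,
            timeoutSeconds: 1800,   // ask the server to close the watch after 30 minutes
            watch: true);

        await foreach (var (type, pod) in listTask.WatchAsync<V1Pod, V1PodList>())
        {
            resourceVersion = pod.Metadata.ResourceVersion;
            // handle the event...
        }

        // normal end of the 30-minute window: loop around and watch again
    }
    catch (Exception e)
    {
        Console.WriteLine($"watch dropped: {e.Message}");
    }
}
```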
-
The issue was resolved by removing ConfigureAwait(false) in the startup during initialization.
The above code remains the same.
In startup, we earlier had
service.WatchPodsAsync().ConfigureAwait(false);
and this was changed to
service.WatchPodsAsync()
The watcher is no longer stopping.
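For context, a common way to own such a long-running loop in an ASP.NET Core app is a hosted BackgroundService, so the task is awaited by the host and an unexpected exit is at least logged. The sketch below is an assumed arrangement, not the reporter's actual setup; IPodWatchService and WatchPodsAsync stand in for the service referenced above.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;
using Microsoft.Extensions.Hosting;
using Microsoft.Extensions.Logging;

public interface IPodWatchService
{
    Task WatchPodsAsync(); // assumed shape of the method called from startup above
}

public class PodWatcherHostedService : BackgroundService
{
    private readonly IPodWatchService _service;
    private readonly ILogger<PodWatcherHostedService> _logger;

    public PodWatcherHostedService(IPodWatchService service,
                                   ILogger<PodWatcherHostedService> logger)
    {
        _service = service;
        _logger = logger;
    }

    protected override async Task ExecuteAsync(CancellationToken stoppingToken)
    {
        try
        {
            // the host owns and awaits this task, so a silent stop becomes visible
            await _service.WatchPodsAsync();
        }
        catch (Exception e)
        {
            _logger.LogError(e, "Pod watcher terminated unexpectedly");
            throw;
        }
    }
}
```

Registered with services.AddHostedService<PodWatcherHostedService>(); in startup, this replaces the fire-and-forget call shown above.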