Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

🐛 using local flag instead of setting client to nil #87

Merged
merged 1 commit into from
Oct 31, 2024

Conversation

skeeey
Copy link
Member

@skeeey skeeey commented Oct 28, 2024

Summary

Currently, we set the client to nil when a client start reconnecting, this may cause nil point

2024-10-17T18:53:15.218Z	ERROR	generic/baseclient.go:88	the cloudevents client is disconnected, ping handler error: PINGRESP timed out
open-cluster-management.io/sdk-go/pkg/cloudevents/generic.(*baseClient).connect.func1
	/go/pkg/mod/open-cluster-management.io/[email protected]/pkg/cloudevents/generic/baseclient.go:88
2024-10-17T18:53:15.219Z	ERROR	generic/baseclient.go:164	failed to receive cloudevents, error while opening the inbound connection: use of closed network connection
open-cluster-management.io/sdk-go/pkg/cloudevents/generic.(*baseClient).subscribe.func1.1
	/go/pkg/mod/open-cluster-management.io/[email protected]/pkg/cloudevents/generic/baseclient.go:164
2024-10-17T18:54:18.841Z	ERROR	generic/baseclient.go:88	the cloudevents client is disconnected, read tcp 10.132.64.71:39424->20.150.164.45:8883: use of closed network connection
open-cluster-management.io/sdk-go/pkg/cloudevents/generic.(*baseClient).connect.func1
	/go/pkg/mod/open-cluster-management.io/[email protected]/pkg/cloudevents/generic/baseclient.go:88
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x28 pc=0x2056149]

In this pr, we use local flag to mark the client not ready

@skeeey
Copy link
Member Author

skeeey commented Oct 28, 2024

/assign @qiujian16

for {
if c.currentClient() == nil {
if retrying {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe have a clientReady() bool func, and check here?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it may be not necessary, because the retrying flag always set after the client closed

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

using client ready instead of retrying flag


for {
if cloudEventsClient != nil {
if startReceiving {
Copy link
Member

@qiujian16 qiujian16 Oct 28, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why not use clientReady here also, the subscribe/send may change at the same time?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we always start the subscribe after the client connected, this is controlled by restartReceiverSignal channel in the base client internal, so I did not use the clientReady here

but for send part, it will be called by outside, so when we send an event, we need check the client if ready or not

if cloudEventsClient == nil {
c.RLock()
defer c.RUnlock()
if !c.clientReady {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

use c.isClientReady() here also

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

missed this part...

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@qiujian16
Copy link
Member

/approve
/lgtm

@openshift-ci openshift-ci bot added the lgtm label Oct 31, 2024
Copy link

openshift-ci bot commented Oct 31, 2024

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: qiujian16, skeeey

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-merge-bot openshift-merge-bot bot merged commit f50d6e8 into open-cluster-management-io:main Oct 31, 2024
10 checks passed
@skeeey skeeey deleted the flag branch November 1, 2024 02:12
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants