Fix continuous Nacos reconnection failures with message "com.alibaba.nacos.common.remote.exception.RemoteException: errCode: 500, errMsg: Unknown payload type:ServerCheckResponse" although the connection is already established. #1676
+18
−2
Fix continuous Nacos reconnection failures reported as "com.alibaba.nacos.common.remote.exception.RemoteException: errCode: 500, errMsg: Unknown payload type:ServerCheckResponse" even though the connection is already established. The cause is that PayloadRegistry is not initialized correctly, because an incorrect ClassLoader is used when the Nacos client is created.
What type of PR is this?
Bug
What this PR does / why we need it?
Nacos reconnection always fails because PayloadRegistry is not initialized correctly: an incorrect ClassLoader is in use when the Nacos client is created. This PR fixes that.
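For readers unfamiliar with this kind of ClassLoader problem, the following is a minimal sketch of one common remedy, not the actual change in this PR. It assumes the payload classes are discovered through the thread context ClassLoader (for example via `ServiceLoader`), which is not confirmed here. `ContextClassLoaderSwap` and `callWith` are hypothetical names used only for illustration.

```java
import java.util.function.Supplier;

/** Hypothetical helper: runs a task with a chosen thread context ClassLoader. */
final class ContextClassLoaderSwap {

    private ContextClassLoaderSwap() {
    }

    /**
     * Temporarily installs the given loader as the thread context ClassLoader,
     * runs the task, and restores the previous loader afterwards.
     */
    static <T> T callWith(ClassLoader loader, Supplier<T> task) {
        Thread current = Thread.currentThread();
        ClassLoader previous = current.getContextClassLoader();
        current.setContextClassLoader(loader);
        try {
            // Assumption: the client would be created inside this task, so any
            // lookup based on the context ClassLoader sees the intended loader.
            return task.get();
        } finally {
            // Always restore the previous loader, even if the task throws.
            current.setContextClassLoader(previous);
        }
    }

    public static void main(String[] args) {
        ClassLoader target = ContextClassLoaderSwap.class.getClassLoader();
        ClassLoader seen = callWith(target, () -> Thread.currentThread().getContextClassLoader());
        System.out.println("context ClassLoader during task matches target: " + (seen == target));
    }
}
```

In a setup like the one described here, the loader passed in would be the one that can actually see the Nacos client classes; which loader that is depends on the hosting framework.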
Which issue(s) this PR fixes?
Fixes #
The Nacos client reconnects to the Nacos server when the connection is lost. Reconnection depends on PayloadRegistry being initialized correctly, so that RpcClient can find the response class it needs to deserialize the response data.
Here is the error stack trace captured with Arthas:
com.alibaba.nacos.common.remote.exception.RemoteException: errCode: 500, errMsg: Unknown payload type:ServerCheckResponse
    at com.alibaba.nacos.common.remote.client.grpc.GrpcUtils.parse(GrpcUtils.java:133)
    at com.alibaba.nacos.common.remote.client.grpc.GrpcClient.serverCheck(GrpcClient.java:198)
    at com.alibaba.nacos.common.remote.client.grpc.GrpcClient.connectToServer(GrpcClient.java:307)
    at com.alibaba.nacos.common.remote.client.RpcClient.reconnect(RpcClient.java:498)
    at com.alibaba.nacos.common.remote.client.RpcClient.lambda$start$2(RpcClient.java:339)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
However, the reconnect response from the Nacos server is actually successful, with a 200 status code:
body_=@Any[
    serialVersionUID=@Long[0],
    cachedUnpackValue=null,
    TYPE_URL_FIELD_NUMBER=@Integer[1],
    typeUrl_=@String[],
    VALUE_FIELD_NUMBER=@Integer[2],
    value_=@LiteralByteString[<ByteString@478c0716 size=98 contents="{\"resultCode\":200,\"errorCode\":0,\"connectionId\":...">],
    memoizedIsInitialized=@Byte[-1],
    DEFAULT_INSTANCE=@Any[],
    PARSER=@[com.alibaba.nacos.shaded.com.google.protobuf.Any$1@4e975aba],
    serialVersionUID=@Long[1],
    alwaysUseFieldBuilders=@Boolean[false],
    unknownFields=@UnknownFieldSet[],
    memoizedSize=@Integer[-1],
    memoizedHashCode=@Integer[0],
],
Here is the code from GrpcUtils.parse method.
The REGISTRY_REQUEST map in PayloadRegistry has been initialized, but it is empty:
[arthas@1]$ ognl -classLoaderClass com.huaweicloud.sermant.core.classloader.FrameworkClassLoader "@com.alibaba.nacos.common.remote.PayloadRegistry@REGISTRY_REQUEST"
@HashMap[isEmpty=true;size=0]
[arthas@1]$
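To illustrate why an empty registry produces this error even when the server replies successfully, here is a simplified model of a lookup that maps a payload type name to a class. It is not the Nacos source; `PayloadLookupSketch`, `register`, and `resolve` are hypothetical names, and the exception type is a stand-in for `RemoteException`.

```java
import java.util.HashMap;
import java.util.Map;

/** Simplified, hypothetical model of a type-name-to-class payload registry. */
final class PayloadLookupSketch {

    /** Stand-in for the real response class; only its simple name matters here. */
    static final class ServerCheckResponse {
    }

    private final Map<String, Class<?>> registry = new HashMap<>();

    /** Registers a payload class under its simple name. */
    void register(Class<?> type) {
        registry.put(type.getSimpleName(), type);
    }

    /** Resolves a type name to a class, failing when nothing is registered for it. */
    Class<?> resolve(String typeName) {
        Class<?> type = registry.get(typeName);
        if (type == null) {
            // With an empty map, every lookup ends here, regardless of how
            // valid the response body is.
            throw new IllegalStateException("Unknown payload type:" + typeName);
        }
        return type;
    }

    public static void main(String[] args) {
        PayloadLookupSketch empty = new PayloadLookupSketch();
        try {
            empty.resolve("ServerCheckResponse");
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }

        PayloadLookupSketch populated = new PayloadLookupSketch();
        populated.register(ServerCheckResponse.class);
        System.out.println(populated.resolve("ServerCheckResponse").getSimpleName());
    }
}
```

In this model the lookup fails before the body is ever deserialized, which matches the symptom above: a response carrying resultCode 200 is still reported as an error.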
Does this PR introduce a user-facing change?
No
Checklist