Questions about ShdrClient implementation #62

Open
AndreiShenets opened this issue Apr 22, 2024 · 6 comments
@AndreiShenets

Hi @PatrickRitchie,

I am investigating a bug where ShdrClient constantly disconnects from and reconnects to our machine. I found a few suspicious places in the ShdrClient implementation, and I would like to discuss them with you and confirm whether they are bugs before proceeding further.

The first potential issue is in the ShdrClient.ProcessResponse method. It assumes that the incoming char buffer always contains complete data lines delimited by \n. But what if a read from the stream happens in the middle of a transmission and the buffer ends with a partial line? The rest of that line arrives only with the next read, in the next char array. I do not see this case handled: as far as I can tell, the partial chunk is not processed with the current buffer and the remainder is not processed with the next one. So one line is dropped and we lose data.
With really unlucky timing and small amounts of data, a lot of data items, or even all of them, could be corrupted.
So the question is: why are you sure that this situation cannot happen?
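
To illustrate the handling I have in mind, here is a minimal language-neutral sketch in Python. It is not the MTConnect.NET code: the `LineAssembler` class and its `feed` method are hypothetical names, and the sample SHDR-like strings are made up. The idea is to keep the unterminated tail of each chunk and prepend it to the next one, so a line split across two reads is still delivered whole.

```python
from typing import List


class LineAssembler:
    """Reassembles "\n"-delimited lines from arbitrarily split chunks (hypothetical helper)."""

    def __init__(self) -> None:
        # Tail of the previous chunk that has not been terminated by "\n" yet.
        self._pending = ""

    def feed(self, chunk: str) -> List[str]:
        """Return every line completed by this chunk; keep the partial tail for later."""
        data = self._pending + chunk
        # Everything before the last "\n" is complete; the last piece is the new tail.
        *complete, self._pending = data.split("\n")
        # Drop a trailing "\r" (for "\r\n" senders) and skip empty lines.
        return [line.rstrip("\r") for line in complete if line.strip()]
```

Feeding `"2024|Xpos|1"` and then `"0\n2024|Y"` returns nothing for the first call and the single complete line `2024|Xpos|10` for the second, while `2024|Y` stays buffered until its terminator arrives.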

The second potential issue is on line 276 of ShdrClient:

[Image: screenshot of the ShdrClient source around line 276]

When DataAvailable is true and data has been read from the stream, further processing is still delayed by 1 millisecond. I assume that if a machine sends too much data, or sends it too often, this could lead to a stream buffer overflow, because the code will not be able to process the data fast enough.

I would expect Thread.Sleep(1) to be under line 273, inside the else statement, so that the logic is "process data as fast as possible, otherwise wait one millisecond".

The probability of this issue is very low, as the thread blocks for only 1 millisecond, machines usually do not send that much data, and the read buffer is big enough.
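
The loop shape I am suggesting can be sketched roughly as follows, again in Python rather than the actual C# code. `FakeStream`, `read_loop`, and `max_idle_polls` are hypothetical names used only for this illustration; `max_idle_polls` stands in for the cancellation check so that the example terminates.

```python
import time
from collections import deque


class FakeStream:
    """Stand-in for a network stream; hands out pre-queued chunks (illustration only)."""

    def __init__(self, chunks):
        self._chunks = deque(chunks)

    def data_available(self):
        return bool(self._chunks)

    def read(self):
        return self._chunks.popleft()


def read_loop(stream, handle, max_idle_polls, sleep=time.sleep):
    """Process data back to back; sleep 1 ms only when nothing is available."""
    idle_polls = 0
    while idle_polls < max_idle_polls:
        if stream.data_available():
            handle(stream.read())  # no delay between consecutive reads
            idle_polls = 0
        else:
            sleep(0.001)  # wait only on the idle branch
            idle_polls += 1
```

With this shape, a burst of queued chunks is drained without any sleep in between, and the 1 ms wait is paid only on polls that find the stream empty.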

@PatrickRitchie
Contributor

Thanks for the information and sorry for the delay in response.

What is the average interval between ShdrClient disconnects? Is it closer to every 10 seconds or every 10 minutes?

I would first check to make sure the Timeout is set appropriately for the ShdrClient class. I did just realize that the Heartbeat is not configurable for the ShdrClient so that is something I will add to the next version. The Heartbeat could cause a disconnection if the Adapter has a maximum heartbeat or something that is less than the Agent heartbeat.

The first potential issue you mentioned could cause a disconnect but the buffer is set to 1MB so unless the data being sent is close to that size, I wouldn't think that would be the issue. I do agree that this could (and should) be improved and I will add this to a list of future improvements.

The second potential issue you mentioned shouldn't cause an issue with disconnects but as you said, could limit the rate at which data can be read. This code could be improved as you stated and I will add it to the list of future improvements.

@AndreiShenets
Author

I repeatedly see sequences like the following in the logs:

ShdrClinet has been disconnected at 04/24/2024 10:17:35 +00:00 with Disconnected from 192.168.0.11 on Port 7878
ShdrClinet has been connected at 04/24/2024 10:17:45 +00:00 with Connected to Adapter at 192.168.0.11 on Port 7878
ShdrClinet got connection error at 04/24/2024 10:18:24 +00:00 with Unable to write data to the transport connection: Broken pipe.
System.IO.IOException: Unable to write data to the transport connection: Broken pipe.
 ---> System.Net.Sockets.SocketException (32): Broken pipe
   at System.Net.Sockets.NetworkStream.Write(Byte[] buffer, Int32 offset, Int32 count)
   --- End of inner exception stack trace ---
   at System.Net.Sockets.NetworkStream.Write(Byte[] buffer, Int32 offset, Int32 count)
   at MTConnect.Shdr.ShdrClient.ListenForAdapter(CancellationToken cancel)

Based on that, I would say that:

  • it always takes 10 seconds to reconnect, which is suspicious because I would expect an immediate reconnection if no error is reported
  • after 39-40 seconds I always get a broken pipe

I do not see anything in the ShdrClient code that could cause such behavior, so I would say that something is wrong with the machine or the network, but it still looks strange.

@PatrickRitchie
Contributor

Thank you for the information.

  • The reconnect interval is whatever is configured in the "ReconnectInterval" parameter. It can be set to a smaller value if you want it to reconnect more quickly.
  • Yes, getting a broken pipe error is strange. I would be curious to know what the error is on the adapter side, if it has logs.

Please let me know if you find out what the issue is and I will add the previous suggestions to a To Do list for future updates.

@PatrickRitchie PatrickRitchie moved this from Todo to In Progress in MTConnect.NET-Issues May 16, 2024
@PatrickRitchie PatrickRitchie moved this from In Progress to Todo in MTConnect.NET-Issues May 16, 2024
@PatrickRitchie PatrickRitchie moved this from Todo to Waiting for Response in MTConnect.NET-Issues May 16, 2024
@AndreiShenets
Author

We are still investigating, so if we find something I will post the info here.

The logs I mentioned at the beginning are from the "Adapter". I actually use ShdrClient to connect directly to the machine, so my module is technically an adapter 😄

@PatrickRitchie
Contributor

If that is the case, you may want to look at using an Embedded Agent which would prevent you from having to use SHDR altogether. I've got a few examples:

The original intent of writing an Agent in .NET was to be able to embed it directly in the "Adapter" and remove the need to use SHDR.

@AndreiShenets
Author

Thanks, but I am not implementing an Agent or an Adapter in the intended sense. I only use the connectivity part in my module to apply the required logic and forward the data further.
