Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Thread Safety #3

Open
janxious opened this issue Aug 20, 2018 · 0 comments
Open

Thread Safety #3

janxious opened this issue Aug 20, 2018 · 0 comments

Comments

@janxious
Copy link

The agent is not thread safe. Quoting a customer who investigated with us:

Looking at the code that does not surprise me at all.

https://github.com/Instrumental/instrumental_agent-java/blob/master/src/main/java/com/instrumentalapp/Connection.java#L53-L57

Look here:

Every time something is send, it will make a new thread if the old one was either finished or was not yet made. For one this is not thread safe, since two thread could make the worker thread, resulting in one or more rogue workers. Secondly, the worker is only needed in async mode, so why start a thread every time you send a metric? I say “every time” because that is basically what will happen. Because you start the worker before you actually submit the metric to the messages queue, it will most likely have stopped before the queue gets filled. Only the next metric will probably trigger the actual transmission of the metric. Also, the sending of the queues messages is “loss-ey”. While the synchronous version keeps trying until it succeeds, the async version does not. When the connection drops of the remote side is not available you loose data.

and context from instrumental's thinking:

That said, the worker thread spinups could use a lock of some kind given the nature of the stream locking taking place. It's possible you see something I don't, but the worker shouldn't be stopped before messages are added to the queue unless the connection is receiving a shutdown command.

One of my coworkers and I were looking at the shutdown code for this agent and we don't like the way it works in terms of how it is flushing. It flushes the socket stream but not the message queue, which is not correct.

For the purposes of your app, have you tried introducing a delay after submitting a metric, rather than before? If that works, it might be a workable patch for now

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant