Raise publish serialization exception early #140

Closed
wants to merge 4 commits into from

mtmk
Collaborator

@mtmk mtmk commented Sep 30, 2023

When publishing, serialization errors need to be thrown at the point where the publish call is made.

This changes buffer handling quite a bit.

@mtmk
Collaborator Author

mtmk commented Sep 30, 2023

@jasper-d @caleblloyd this resolves the exception #138 issue but changes serialization buffer location which must be considered in serialization / API discussions #137

@mtmk mtmk marked this pull request as draft October 1, 2023 11:37
Contributor

@jasper-d jasper-d left a comment

> @jasper-d @caleblloyd this resolves the exception #138 issue but changes serialization buffer location which must be considered in serialization / API discussions #137

I think serializing early is beneficial, and not only because it fixes the exception behavior observed here. It also enables serialization work to happen concurrently (for multiple publish operations) instead of running all serialization work sequentially inside WriteLoopAsync. This should positively impact throughput.
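To make the idea concrete, here is a minimal sketch of serializing at the call site. It assumes System.Text.Json as the serializer and uses a hypothetical `EarlyPublisher` type with a simple in-memory queue standing in for the client's command queue; none of these names come from the PR.

```csharp
using System;
using System.Buffers;
using System.Collections.Concurrent;
using System.Text.Json;

// Hypothetical stand-in for the client's publish path; not the PR's code.
public sealed class EarlyPublisher
{
    // Assumed queue of already-serialized payloads waiting for the write loop.
    private readonly ConcurrentQueue<(string Subject, ReadOnlyMemory<byte> Payload)> _pending =
        new ConcurrentQueue<(string Subject, ReadOnlyMemory<byte> Payload)>();

    public int PendingCount => _pending.Count;

    // Serializes on the caller's thread, so a serialization failure
    // surfaces here instead of later inside the write loop.
    public void Publish<T>(string subject, T value)
    {
        var buffer = new ArrayBufferWriter<byte>();
        using (var json = new Utf8JsonWriter(buffer))
        {
            JsonSerializer.Serialize(json, value); // may throw, e.g. JsonException
        }

        // Only fully serialized payloads are queued for sending.
        _pending.Enqueue((subject, buffer.WrittenMemory));
    }
}
```

Because each call serializes into its own buffer, several callers can serialize at the same time. The cost is one buffer per in-flight message, which is the allocation concern raised later in this thread.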

      return result;
  }

  public override void Write(ProtocolWriter writer)
  {
-     writer.WritePublish(_subject!, _replyTo, _headers, _value, _serializer!);
+     writer.WritePublish(_subject!, _replyTo, _headers, new ReadOnlySequence<byte>(_buffer.WrittenMemory));
Contributor

With this change, _serializer and _value aren't needed anymore and PublishCommand doesn't need to be generic.

Contributor

Also, _buffer, or rather its underlying array, should be freed here. Otherwise, we would keep a potentially huge buffer allocated after sending a large message.

Collaborator Author

> It also enables serialization work to happen concurrently (for multiple publish operations) instead of running all serialization work sequentially inside WriteLoopAsync

Good point.

> Also, _buffer, or rather its underlying array, should be freed here.

The idea was to keep the _buffer around to avoid GC pressure, since it would be pooled with the command object.

On the other hand, we could leave buffer management to the end developer and just accept a ReadOnlySequence, a Memory, or even an IMemoryOwner?

Collaborator Author

Ah, @jasper-d, I think I misunderstood your comment about freeing the buffer (I wasn't looking at where in the code you commented). You have a point. I thought about this as well. We could resize the underlying array to be smaller, or enforce a size limit, as the server would enforce one anyway.
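One way the "resize the underlying array" option could look is sketched below. `TrimmingBufferWriter`, its `Reset` method, and the 4 KiB / 64 KiB thresholds are illustrative assumptions, not the PR's FixedArrayBufferWriter.

```csharp
using System;
using System.Buffers;

// Hypothetical reusable buffer that drops an oversized array on reset.
public sealed class TrimmingBufferWriter : IBufferWriter<byte>
{
    private readonly int _initialSize;
    private readonly int _maxRetainedSize;
    private byte[] _array;
    private int _written;

    public TrimmingBufferWriter(int initialSize = 4096, int maxRetainedSize = 64 * 1024)
    {
        _initialSize = initialSize;
        _maxRetainedSize = maxRetainedSize;
        _array = new byte[initialSize];
    }

    public int Capacity => _array.Length;

    public ReadOnlyMemory<byte> WrittenMemory => _array.AsMemory(0, _written);

    public void Advance(int count)
    {
        if (count < 0 || count > _array.Length - _written)
            throw new ArgumentOutOfRangeException(nameof(count));
        _written += count;
    }

    public Memory<byte> GetMemory(int sizeHint = 0)
    {
        EnsureCapacity(sizeHint);
        return _array.AsMemory(_written);
    }

    public Span<byte> GetSpan(int sizeHint = 0)
    {
        EnsureCapacity(sizeHint);
        return _array.AsSpan(_written);
    }

    // Intended to be called when the owning command goes back to its pool.
    public void Reset()
    {
        _written = 0;
        if (_array.Length > _maxRetainedSize)
        {
            // Let the large array be collected instead of pinning it to the pooled command.
            _array = new byte[_initialSize];
        }
    }

    private void EnsureCapacity(int sizeHint)
    {
        if (sizeHint < 1) sizeHint = 1;
        if (_array.Length - _written >= sizeHint) return;

        var newSize = Math.Max(_array.Length * 2, _written + sizeHint);
        Array.Resize(ref _array, newSize);
    }
}
```

Small messages keep reusing the same array, so the pooling benefit is preserved; only a message larger than the retention threshold costs a fresh allocation on the next reset.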

@caleblloyd
Collaborator

Would need to see benchmarks on what this does to allocations. But this puts us in an odd spot. As it stands without this PR we have:

  1. WaitUntilSent=true - you'll get Serialization Exceptions and Send Exceptions. You follow the message all of the way through.
  2. WaitUntilSent=false - Serialization Exceptions and Send Exceptions would need to be handled asynchronously (we may need to add a hook for that)
  3. You could also serialize on your own then send the bytes to catch Serialization Exceptions synchronously and Send Exceptions asynchronously

After this PR, 2 becomes exactly the same as 3 and we introduce an allocation per serialized model.

Would rather use some sort of pipeline with a serialization buffer size, and maybe an option for WaitUntilSerialized

@mtmk
Collaborator Author

mtmk commented Oct 3, 2023

I introduced a buffer pool to soften the blow on allocations when serialization is done early, but it didn't seem to help.

| Method                  | Iter | Mean        | Error        | StdDev      | Gen0     | Gen1    | Gen2    | Allocated |
|------------------------ |----- |------------:|-------------:|------------:|---------:|--------:|--------:|----------:|
| WaitUntilSentTrue       | 64   |  2,828.9 us |  1,637.15 us |    89.74 us |        - |       - |       - |    6996 B |
| WaitUntilSentFalse      | 64   |    161.5 us |     39.35 us |     2.16 us |        - |       - |       - |     602 B |
| WaitUntilSentFalseEarly | 64   |    216.9 us |     74.63 us |     4.09 us |  14.4043 |  7.0801 |       - |  200043 B |
| WaitUntilSentTrue       | 1000 | 43,930.1 us | 48,173.78 us | 2,640.57 us |        - |       - |       - |  105673 B |
| WaitUntilSentFalse      | 1000 |    723.5 us |    113.32 us |     6.21 us |   2.9297 |       - |       - |   48479 B |
| WaitUntilSentFalseEarly | 1000 |  1,136.7 us |    175.79 us |     9.64 us | 183.5938 | 91.7969 | 35.1563 | 2507530 B |

WaitUntilSentFalseEarly: Early serialization enabled when using Post Publish method.

@jasper-d
Contributor

jasper-d commented Oct 4, 2023

> I introduced a buffer pool to soften the blow on allocations when serialization is done early, but it didn't seem to help.

If I understand this correctly, the issue is that WaitUntilSent = false does not handle backpressure, so serialization of all messages will happen more or less all at once, even if the channel is full?

So for WaitUntilSentFalseEarly we end up serializing up to 1000 messages in parallel (apparently way less in reality, presumably because buffers are already returned to the pool while still serializing messages), for each allocating a FixedArrayBufferWriter backed by a 64k array by default?

So maybe it would be better to fix the send logic in a way that WaitUntilSent = true is just as fast (or faster) than WaitUntilSent = false and that option could be removed entirely?

On top of that, FixedArrayBufferWriter could be replaced with an implementation that is backed by MemoryPool/ArrayPool<byte>.Shared and creates a linked list of ReadOnlySequenceSegment<byte>, starting with a small-ish buffer (say 1k - 4k) and appending increasingly larger buffers as needed to avoid wasting too much memory. All the heavy lifting (growing/shrinking the buffer pool(s) etc.) would then be handled by the BCL, and here we would only need to pool segments if profitable.
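A rough sketch of such a writer follows. The type name `PooledSegmentWriter`, the 1k starting size, the doubling growth, and the 64k cap are assumptions chosen for illustration; the sketch rents from ArrayPool<byte>.Shared and does not pool the segment objects themselves.

```csharp
using System;
using System.Buffers;
using System.Collections.Generic;

// Hypothetical type for illustration; not part of the PR.
public sealed class PooledSegmentWriter : IBufferWriter<byte>, IDisposable
{
    private sealed class Segment : ReadOnlySequenceSegment<byte>
    {
        public Segment(ReadOnlyMemory<byte> memory, long runningIndex)
        {
            Memory = memory;
            RunningIndex = runningIndex;
        }

        public void SetNext(Segment next) => Next = next;
    }

    // Completed blocks, each with the number of bytes actually written to it.
    private readonly List<(byte[] Bytes, int Length)> _full = new List<(byte[] Bytes, int Length)>();
    private readonly int _maxBlockSize;
    private byte[] _current = Array.Empty<byte>();
    private int _currentWritten;
    private int _nextBlockSize;

    public PooledSegmentWriter(int initialBlockSize = 1024, int maxBlockSize = 64 * 1024)
    {
        _nextBlockSize = initialBlockSize;
        _maxBlockSize = maxBlockSize;
    }

    public void Advance(int count)
    {
        if (count < 0 || count > _current.Length - _currentWritten)
            throw new ArgumentOutOfRangeException(nameof(count));
        _currentWritten += count;
    }

    public Memory<byte> GetMemory(int sizeHint = 0)
    {
        EnsureBlock(sizeHint);
        return _current.AsMemory(_currentWritten);
    }

    public Span<byte> GetSpan(int sizeHint = 0)
    {
        EnsureBlock(sizeHint);
        return _current.AsSpan(_currentWritten);
    }

    // Links the written parts of all blocks into one ReadOnlySequence<byte>.
    public ReadOnlySequence<byte> AsSequence()
    {
        var blocks = new List<(byte[] Bytes, int Length)>(_full);
        if (_currentWritten > 0) blocks.Add((_current, _currentWritten));
        if (blocks.Count == 0) return ReadOnlySequence<byte>.Empty;

        var first = new Segment(blocks[0].Bytes.AsMemory(0, blocks[0].Length), 0);
        var last = first;
        for (var i = 1; i < blocks.Count; i++)
        {
            var memory = blocks[i].Bytes.AsMemory(0, blocks[i].Length);
            var next = new Segment(memory, last.RunningIndex + last.Memory.Length);
            last.SetNext(next);
            last = next;
        }

        return new ReadOnlySequence<byte>(first, 0, last, last.Memory.Length);
    }

    // Returns every rented array; sequences from AsSequence() are invalid afterwards.
    public void Dispose()
    {
        foreach (var block in _full) ArrayPool<byte>.Shared.Return(block.Bytes);
        if (_current.Length > 0) ArrayPool<byte>.Shared.Return(_current);
        _full.Clear();
        _current = Array.Empty<byte>();
        _currentWritten = 0;
    }

    private void EnsureBlock(int sizeHint)
    {
        if (sizeHint < 1) sizeHint = 1;
        if (_current.Length - _currentWritten >= sizeHint) return;

        if (_currentWritten > 0) _full.Add((_current, _currentWritten));
        else if (_current.Length > 0) ArrayPool<byte>.Shared.Return(_current);

        // Rent at least the requested size, then grow the next block up to the cap.
        _current = ArrayPool<byte>.Shared.Rent(Math.Max(_nextBlockSize, sizeHint));
        _currentWritten = 0;
        _nextBlockSize = Math.Min(_nextBlockSize * 2, _maxBlockSize);
    }
}
```

A small message touches a single small rented array, while a large one grows by appending blocks rather than copying into a bigger array. Whether this actually beats a fixed 64k buffer would need the same kind of benchmark as above.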

@mtmk
Collaborator Author

mtmk commented Oct 4, 2023

> So maybe it would be better to fix the send logic in a way that WaitUntilSent = true is just as fast (or faster) than WaitUntilSent = false and that option could be removed entirely?
>
> On top of that, FixedArrayBufferWriter could be replaced with an implementation that is backed by MemoryPool/ArrayPool.Shared and creates a linked list of ReadOnlySequenceSegment, starting with a small-ish buffer (say 1k - 4k) and appending increasingly larger buffers as needed to avoid wasting too much memory. All the heavy lifting (growing/shrinking the buffer pool(s) etc.) would then be handled by the BCL, and here we would only need to pool segments if profitable.

I think that is a good idea. I would also like to wrap up this PR before it grows too big, and place these ideas in a new issue or under the existing serialization improvements issue.

My proposal for making progress is to solve the issue at hand, which is handling serialization exceptions when publishing:

  • Publish with WaitUntilSent=true will throw the serialization exceptions
  • Publish with WaitUntilSent=false will pass the serialization exceptions to a handler
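The two behaviors proposed above can be sketched as follows. This is a simplified, synchronous model: `SketchConnection`, `OnSerializationError`, and `Flush` are hypothetical names, System.Text.Json is assumed as the serializer, and the real client's asynchronous write loop is reduced to an explicit `Flush` call.

```csharp
using System;
using System.Buffers;
using System.Collections.Generic;
using System.Text.Json;

// Hypothetical model of the proposal; not the client's actual API.
public sealed class SketchConnection
{
    private readonly Queue<Action<IBufferWriter<byte>>> _commands = new Queue<Action<IBufferWriter<byte>>>();
    private readonly ArrayBufferWriter<byte> _wire = new ArrayBufferWriter<byte>();

    // Assumed hook for errors that cannot be thrown to the caller.
    public Action<Exception> OnSerializationError { get; set; } = _ => { };

    public int BytesWritten => _wire.WrittenCount;

    public void Publish<T>(T value, bool waitUntilSent)
    {
        Action<IBufferWriter<byte>> serialize = buffer =>
        {
            using var json = new Utf8JsonWriter(buffer);
            JsonSerializer.Serialize(json, value); // may throw
        };

        if (waitUntilSent)
        {
            // The caller follows the message through, so the exception propagates.
            WriteOne(serialize);
        }
        else
        {
            // Fire and forget: serialization happens later in the write loop.
            _commands.Enqueue(serialize);
        }
    }

    // Stand-in for the write loop: failures go to the handler, not the caller.
    public void Flush()
    {
        while (_commands.Count > 0)
        {
            var serialize = _commands.Dequeue();
            try { WriteOne(serialize); }
            catch (Exception ex) { OnSerializationError(ex); }
        }
    }

    private void WriteOne(Action<IBufferWriter<byte>> serialize)
    {
        // Serialize into a scratch buffer first so a failure leaves no partial bytes on the wire.
        var scratch = new ArrayBufferWriter<byte>();
        serialize(scratch);
        _wire.Write(scratch.WrittenSpan);
    }
}
```

With this shape, a caller that uses WaitUntilSent=false only learns about a bad payload through the handler, so the handler needs enough context to identify the message. That is left out of the sketch.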

@mtmk
Collaborator Author

mtmk commented Oct 5, 2023

@jasper-d @caleblloyd closing this PR as an experiment. I've referenced this PR from related issues, but please feel free to create other issues etc. Thank you for the input and ideas.

Changes are copied to #144 and #145; please review them instead.

@mtmk mtmk closed this Oct 5, 2023
@mtmk mtmk deleted the publish-serialization-exception-handling branch October 5, 2023 11:11