-
Notifications
You must be signed in to change notification settings - Fork 15.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Proposal] Async version of CSharp protobuf #3166
Comments
I would strongly suggest not using More worryingly, it may require additions to interfaces, which basically can't happen due backwards compatibility. You may well want to look at an |
@jskeet Thanks for your answer. I have created a prototype and it actually uses a separate IAsyncMessage :) interface to prevent breaking compatibility with existing code. Regular async methods are where the problem lies. One would want to minimize the amount of code in the regular async methods and limit the regular async methods only to the slow-path. Hot path should not do any GC-allocations and my intuition says it is possible. However not without ValueTask. How would the optimized pattern look
It is only possible to efficiently implement Writing without ValueTask because write methods have no return values and we can use Task.CompletedTask in the fast path to avoid heap allocations. BTW. It will require bumping up net45 to net46 AFAIR. It is impossible to efficiently implement Reading without ValueTask because sooner or later you need to use Task.FromResult() which will do the heap allocation. And there will be at least one heap allocation for EVERY value read. What does not sound reasonable. ValueTasks should be working without any problems on top of .NET Standard and I see no point in maintaining library version compatible with something older than .NET Standard 1.0. BTW. This project is already depending on .NET Standard 1.0, so I really don't get the point about ValueTasks. Or am I missing something? Regarding the generated code (the protobuf, not the async ;) ). I think it would be a good idea to add an option like Async that will cause generate IAsyncMessage instead of IMessage to retain the reverse compatibility with existing user and maybe in future make it default if it will make any sense. |
"ValueTasks should be working without any problems on top of .NET Standard" - not without adding a new dependency. Adding any dependency to a widely-used library is fraught with problems. My aim wouldn't be nearly as aggressive as yours of removing all heap allocations. If you really want to do that, just do all the reading asynchronously into a I can see a benefit in having optional async code generated and async code using just Of course, the Protobuf team could go ahead with that anyway - I'm just a contributor - but I'd strongly recommend against it. |
I see you point. However I am not sure whether Unity isn't going towards .NET Standard as well (but I don't use Unity). OK. So we basically expect different things from this library. I expect ultra-high performance serialization library where it's worth to use latest developments in .NET CoreFX. And you expect widely supported and widely compatible. But it is impossible to have both of these goals in a single library. I had many perf ideas including using for example buffers rental (I have seen new byte[1024] in code way too many times :) - I understand now - buffer rental requires System.Buffers... ), adding streaming serialization support, better struct support. But this is exactly the opposite direction you want this library to go. I am aware that you are the author of the original code, and I definitely don't want to push against you. So I will probably end up creating an unsupported fork. BTW. How does MemoryStream avoid heap allocations as it is itself allocated on heap (even on LOH)? And with it's reallocation algorithm it is the most useless of doom class in entire .NET Framework responsible for most OutOfMemoryException people have due to LOH fragmentation. But I get your point about pre-buffering (using something better than MemoryStream) the data and using the current synchronous code. I have even mentioned this approach in the Rationale. What I wanted, is to use as many stack allocations as possible. C# is not Java and you can do a lot of perfwork here. To sum up: doing async without ValueTask is basically non-sense in this specific case. If you can't accept ValueTasks, I understand. And I will not pursue this PR here. |
In terms of MemoryStream and the heap allocations: you'd load the data into a single MemoryStream (which could potentially be pooled and reused) - but then you'd know that everything would be synchronous, so you'd just use the sync code instead of async, so you wouldn't end up with lots of tasks being allocated. I still don't accept that it's pointless to do async without |
If you really think it's not pointless to do async without ValueTask in this specific case, then I can create a PR with the 1st step of my Proposal (except with Task instead of ValueTask), and then I can do performance optimizations in my fork. We can then do then some benchmarks to see if they are worth and maybe it will be easier to make an informed decision about ValueTasks based on some hard number. Does this sound reasonable? |
Yes, that sounds reasonable. I would say that it's possible that even without |
@jskeet |
For the support libraries? Yes, that might make sense. For generated code, it'll be much simpler to keep it all in one file. |
I have created a PR to help tracking changes. The current code is focused on correctness - not performance. Feel free to drop any comments. |
@jskeet What is you opinion should it be a CLI option or File option or both? I have started with a FileOption. But now I am not sure... BTW. Even if the async option is enabled I am generating code like this to ensure backwards compatibility.
It is really important for the pre-generated classes for well-known types. |
Hmm. It should definitely be a CLI option: two users may want to use the same .proto file, with one of them generating async code and one not. Using the NET35 definition seems fairly reasonable - certainly required for the Google.Protobuf library - but I'd be tempted to do this using partial classes: public sealed partial class MyMessageName : pb::IMessage<MyMessageName> {
// Non-async stuff
}
#if !NET35
public sealed partial class MyMessageName : pb::IAsyncMessage<MyMessageName> {
// Async methods
}
#endif Aside for anything else, that will mean that the diff between "async-enabled" and "async-disabled" is a single block of code. I'd be tempted to use another preprocessor symbol as well, so users can define something for other builds where they want non-async-only, e.g. #if !NET35 && !PROTOBUF_NO_ASYNC
...
#endif Just thinking aloud here though... |
@jskeet Hmmm really nice idea with this partial (in single file)... I will definitely do this. But I would prefer to defer this in time, because this is a non-breaking change, if you agree. I was thinking about this preprocessor symbol. However to be honest not sure whether PROTOBUF_NO_ASYNC alone is not a better option than NET35 as there are NET35 implementations of async (AsyncBridge). Please let me know and I will correct everything to match. BTW. I am currently not adding conditional compilation sections to *.Async.cs because I believe you will skip entire files in NET35/PROTOBUF_NO_ASYNC versions. |
I would advise against making it a file option as well - I really don't think there's any need for it. Deferring the organizational change: I agree it's non-breaking, but I think it would be worth doing before we get as far as merging. But it can definitely be done after everything else is working... Compilation symbols: yes, we could easily just go with I'm not sure how you were suggesting we skip the entire files for compiling for .NET 3.5 though - while we could do that within the project file, I think it would be cleaner to do it in the files themselves. |
Thanks for your quick answers, they help me progress faster.
I have started some work with testing - I basically try to reuse the synchronous tests. |
@jskeet Let's focus this PR on correctness and do all the perf-work in separate PRs. |
I'll see whether I can get time to review the PR later this week. I've been thinking about the parser part, and I wonder whether it may be feasible to do it all with extension methods on the parser instead. But let's get to that later... |
@jskeet I don't think it's feasible at all, because it requires access to the internal state. It's only feasible as much as some of the synchronous could be moved to an extension class and even if we do it's a terrible moment because I will want to do some perf-work focusing on replacing as many async functions as possible with their synchronous versions (still returning Tasks - don't worry ;) ) (to avoid compiler-generated state-machines and unnecessary allocations). I have added all the missing documentation. When my PR passes sanity checks I will now remove the [Do not merge]. I think that the first part is quite complete and ready for review. |
We could provide internal access to whatever state we needed. I'll bear it in mind when I review the PR. I still don't know when I'm going to have time to do it, but I'll have a look at it later this week with any luck. |
@jskeet |
Not yet. I started, but had to move onto other things. I'll come back to it when I can. |
I've started looking now, mostly at the class/interface hierarchy around messages and parsers. Are there reasons why we need to have a single parser for both, rather than having an async parser entirely separate from a sync parser? I wouldn't normally be suggesting that, but basically Separating the two would also leave more room for the option of not generating the synchronous code at all - so you could have:
We'd need to think about JSON parsing for the async case... I wonder whether that could wait for the moment though. As you've looked at this more than me, you may have some reason for doing it that way though - in which case, please share :) (There's the slight memory cost of twice as many parsers of course...) |
I also feel that the class hierarchy is a bit sloppy... but had no better idea. To be honest I feel like it would take a considerable amount of time to get before we get async version to be as fast as synchronous version in a synchronous scenario, but this might happen eventually (I really hope so). But that's a good future-proof idea. You very often have synchronous parsing even in full async async scenarios for example sub-message parsing (Any-type). OK. It was 2 week ago so I don't remember all the details, but I will try to recall most important reasons:
On the other hand I don't like original class hierarchy as well. For example I don't see a good reason to have So my recommendation would be:
And here we save yet-another-class-factory - with much cleaner hierarchy. Async JSON parsing should be quite trivial to implement as it is reflection based... UPDATE: UPDATE2: I have also recalled yet another reason for a single parser field. I was thinking about scenarios where some messages are generated with async and some are not and idea behind creating separate parser class, but placing it on the same Parser Property was to allow nice cooperation between those 2 versions. UPDATE3: |
We can't just start removing classes from the hierarchy without that being a breaking change requiring a new major version - that's simply not going to happen. We could build the async parts based on interfaces though, so As for the existence of the non-generic I think it's worth trying a few different ideas here... if we want to keep the single parser per message, I think creating an |
I definitely agree. Maybe it should be considered during next major version? However that's definitely good point we are not bound by this limitation for AsyncMessageParser - so that's a good idea to start with IAsyncMessageParser. and go through the MessageParser hierarchy today and maybe in the new major version end-up replacing MessageParser with IMessageParser. OK. I will update my PR. Let's keep fingers crossed I will not need to regenerate descriptor-protos to pass the tests. |
There's a lot we might consider for the next major version, but I wouldn't personally expect that to be before 2020. |
;) fair enough |
I agree. The only reason to tackle them together is that both require significant API redesign and both redesigns might be affecting each other. However IMO it's way to early to tackle So let's focus on async only. Just to be sure (those names are very similar ;) ) this specific issue is for |
I am strongly considering killing my own proposal at least in the current version. Implementing it efficiently will cost a lot of effort while I believe that from the practical standpoint it would be much better idea to implement flow with a following pseudo-code:
I think it might be beneficial to implement core parsing using spans without any async or stream reading inside. This should fit very well recommended scenarios like gRPC where messages should be considerably small and huge/async communication should be done using streaming. It also should be more efficient in terms of concurrent system and DDoS protection - this approach strongly decreases GC-pressure and working set size of requests being still in-transfer, which cannot be handled, |
I think it's definitely worth trying to work out an endpoint in terms of all the possible options, then consider how we want to get there - it would be a real shame to make a change in one direction that then limited us for another option. I'll be learning a lot more about |
I just discovered some recently added code we have on our project that blocked the common pool (a major faux pas) on a call to While I'm sure your concerns about doing this properly and elegantly are correct, understand that it will never be perfect, and the longer you delay, the more chances there are for us litle-protobuf-people to screw up our codebases with your API. Perhaps some code in In other words, what if you drop the implementation changes and instead simply add a little static class
or, alternatively, implement the semantics you were suggesting around a |
I had some spare time to implement and run tens of experiments in area of parsing using
Conclusions:
I will try to start implementing a proper PR soon, unless you see some downsides to this approach. I think we can also improve in future |
I forgot describe what a |
I think its fine to add support for Async as long as it doesn't replace the existing synchronous APIs. The code I write normally has custom threading implementations and I don't want to rely on the thread pool used by Async. |
Now that Dotnet 5 and 6 are out and there is consensus about where the language is moving, many of the concerns raised early on in this issue may be resolved. Definitely would love to see this implemented. |
We triage inactive PRs and issues in order to make it easier to find active work. If this issue should remain active or becomes active again, please add a comment. This issue is labeled |
Async reading from streams is still missing and important. |
I've removed the "inactive" label, but I won't have any time to do work here beyond reviewing. I'm quite happy for someone else to put some time in, but:
I have no problem with the basic premise of adding async support - I'd just caution that many features that sound simple to start with prove extremely difficult within the confines of not accepting breaking changes. |
We triage inactive PRs and issues in order to make it easier to find active work. If this issue should remain active or becomes active again, please add a comment. This issue is labeled |
We triage inactive PRs and issues in order to make it easier to find active work. If this issue should remain active or becomes active again, please reopen it. This issue was closed and archived because there has been no new activity in the 14 days since the |
I would like to implement support for async/await versions of all methods dealing with I/O.
Rationale
Mixing async and sync code in C# is very inefective thread-wise. It is a good practise to have entire code top-down async, but the worst case-scenario is when synchronous code calls async code like in case of Stream.Read and Stream.Write.
Today the only way to efficiently use protobuf in async code is to copy buffers to the memory and then perform deserialization synchronously. What makes it virtually impossible to use protobuf in streaming scenarios with near-to-constant memory.
Background
Lately there have been many improvements to the TPL to allow more efficient (allocation-wise) handling of many scenarious especially parser-like where most of methods return synchronously. ValueTask and Task.CompletedTask are the most important of those improvements.
High-level plan
ReturnType MethodName(Arguments)
to
async ValueTask MethodNameAsync(Arguments, CancellationToken cancellationToken)
void MethodName(Arguments)
to
async Task MethodNameAsync(Arguments, CancellationToken cancellationToken)
And all references in async methods to those methods with await CalledMethodAsync(..., cancellationToken)
This way we have perfectly correct implementation (but not optimal, especially allocation-wise).
Replace all calls to Stream.Read/Write with proper ReadAsync/WriteAsync
After testing this version start optimizations:
a. Remove constructs like async await = where possible
b. Try to use as much of ValueTask and Task.CompletedTask as possible.
c. Try to inline some async calls to avoid generation of separate state-machines
Modify code-generator to generate async variants.
I would be happy to create PR for this proposal.
However I believe optimization will be a bit longer process, going far beyond this single PR.
The text was updated successfully, but these errors were encountered: