Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[CSharp] Consider implementing arena allocation in C# #3530

Closed
mkosieradzki opened this issue Aug 18, 2017 · 5 comments
Closed

[CSharp] Consider implementing arena allocation in C# #3530

mkosieradzki opened this issue Aug 18, 2017 · 5 comments
Assignees

Comments

@mkosieradzki
Copy link
Contributor

Background

Objects deserialized by protocol buffers have very often very similiar life time and can strongly benefit from Arena allocation as in C++ version. (https://developers.google.com/protocol-buffers/docs/reference/arenas)

C# 7.0 introduces new powerful feature called ref returns and ref locals (https://docs.microsoft.com/en-us/dotnet/csharp/programming-guide/classes-and-structs/ref-returns) practically making structs a first class citizen.

Benefits

Using arena allocation can strongly decrease pressure on the GC during deserialization (and together with some other perf-works can remove completely GC from the deserialization path).

Drawbacks

This feature requires dramatic changes to the generated code - it requires to generate an additional struct for every message contract. Structs are required for custom memory management and we do not want to break compatibility and APIs for pre-existing code. Also there should be some conversion possible from structs to classes if user decides to keep his/her struct a bit longer (at least longer than arena lifetime).

POC

I have created some early POC structs allowing to implement arena allocation for protocol buffers deserialization and they seem to do the job, however as I told I see no even remote possibility to do this on classes.

Open questions

Are you interested in accepting such a PR?

There is also a decision to be made whether we would like to prefer performance: i.e. refs everywhere, limited safety checks over preventing developer from hurting himself by freeing arena and still trying to use some remaining refs.

@mkosieradzki
Copy link
Contributor Author

Hi I had some time today and I have created a prototype implementation.

I have implemented VERY basic prototype using Address book contract.

From my very trivial "benchmarks"
~100KB size message
~250KB arena size
10000 repetitions (deserialization) including arena clearing
.NET Core 2.0

I can observe:
10-15% performance gain deserialization takes less time (time-wise) - even in a single threaded test (I can expect much more serious impact in multithreaded environment)
0 (!) Garbage collections

I was using a lot of managed memory. There is still a huge field for optimization. But current results look quite promising.

Data:
Original PB:
Elapsed: 25824ms GC0=1021 GC1=470 GC2=0 PeakMem=21815296
My PB:
Elapsed: 23422ms GC0=0 GC1=0 GC2=0 PeakMem=15925248

You can preview my prototype implementation here: https://github.com/mkosieradzki/protobuf/tree/arena-allocator

This is just a PoC!!!

@jskeet You might be interested in this one ;).

@jskeet
Copy link
Contributor

jskeet commented Sep 8, 2017

I'm interested, but I have very little time - protobuf is only a very small part of my work these days, I'm afraid. I don't know when I'll have time to take a close look at this.

@mkosieradzki
Copy link
Contributor Author

Quick update: I have run the same test on a different computer in "Release" mode:
Elapsed: 6307ms GC0=989 GC1=459 GC2=0 PeakMem=21917696
Elapsed: 5383ms GC0=0 GC1=0 GC2=0 PeakMem=16003072

With compiler/JIT optimizations the difference is even bigger.

@mkosieradzki
Copy link
Contributor Author

Update: I am simultaneously experimenting with the new language features (and System.Memory package):
https://github.com/mkosieradzki/protobuf/tree/arena-allocator-spans

However I am getting significant performance regressions there (hopefully due to the invalid compiler version). However I have succeeded in implementing version that can possibly enforce safety (and without using unsafe code outside of arena allocator).

@jskeet
Copy link
Contributor

jskeet commented Mar 17, 2023

@mkosieradzki: It feels to me like this is probably a dead-end - there's so much code that assumes that we're using classes, I find it hard to see how this could be anything other than a complete rewrite - in terms of design, generator and library. It's something we might consider if we were ever going go to rewrite the whole .NET support from scratch (and almost certainly base it on records) but it seems very unlikely to happen any time soon.

I'll close this issue now, but please feel free to add a comment with a reason for re-opening it if you feel strongly.

@jskeet jskeet closed this as completed Mar 17, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants