C# ByteString, CodedInputStream performance improvements desired for messages with large byte array field #4206

d79ima · 2018-01-21T21:05:43Z

Currently if you have a message with an array of bytes field that translates into a ByteString immutable object.
ByteString can't be created without incurring a GC memory allocation and an array copy.
When you deserialize a ByteString using CodedInputStream, the same penalty is paid.
Furthermore, when you then go to consume the bytes you received you pay this same penalty yet again because of ByteString.ToByteArray().

In my application these penalties are adding up to excessive GC pressure and memory copies which is reducing my message throughput substantially. This leaves me some unatractive options for my .NET C# application, either use C++ protobuffs communication layer or send my large byte array messages natively without using protobuffs.

It would be really nice if you allowed an option to have more fine grained control over the "bytes" field such that the code generator would spit out just a native byte[] type field or a new class (i.e. UnsafeByteString) which bypasses the above mentioned penalties and user takes on full responsibility for the fact that it is a mutable unsafe byte array.
CodedInputStream also needs to provide an option for user to inject his own byte array allocator so that user application can have fine grained control over byte array memory allocations to minimize GC penalties.

I think this is really important to allow for truly high performance C# protobuf apps.

jtattermusch · 2018-02-26T10:47:27Z

Somewhat related gRPC issue: grpc/grpc#14378

wanton7 · 2018-05-23T04:57:39Z

I would want to have support for unsafe ByteString that would support setting it's length as well. When sending chunked data, last chunk is usually smaller.

@d79ima are you are aware that if you allocate 85000 bytes or more in .NET Framework allocation goes directly to LOH (Large Object Heap) and skips GC generations? It will add lot of GC pressure and LOH can only be compacted manually. So if you want to send binary data it should done in less than 85000 byte chunks.

dpsenner · 2019-02-16T19:29:12Z

I'm seconding this wish for a garbage free api.

smflorentino · 2019-05-17T15:00:05Z

+1, we are observing the same problem. How do we move forward on this?

jtattermusch · 2019-05-18T08:45:49Z

#5888

drew-512 · 2019-06-26T20:12:10Z

A simple enhancement that would address most of this issue is if the C# gpc library had a way to pool no longer needed ByteString (like Go's sync.Pool) and so then could be internally reused. This would prevent the crazy firehose of ByteStrings constantly being created that the GC has to chase.

If I were implementing it, I'd make 4-8 size ByteStrings pools/buckets internally so the internal faux-allocator chooses from a pool such that waste isn't excessive. And there could be a manual polling call to drain them or a periodic task that slowly drains 1-5% of each bucket every 10-60 secs. That way no timestamps/tracking is needed, it covers the lion's share of cases, and there are no freak/worst-case scenarios. That extension could be added and tested in a couple days by one person. I'd volunteer to do if someone here told me there was a good chance of the enhancement being picked up.

Please help us Protobuf heroes! We absolutely love protobufs and we're astonished they're not used more. I do think that will change as interoperability is recognized as more important.

Drew

jtattermusch · 2019-06-28T07:46:33Z

I believe the way forward is implementing #5888 (which allows parsing messages directory from a ReadOnlySequence).

For avoiding excessive allocations in ByteString, that's a separate problem and we'd need to come up with a separate design. The main problem of using array pools with ByteString is that ByteString wasn't designed to be disposable, so there's no way of expressing that the rented array can be returned to the pool.

drew-512 · 2019-06-28T17:08:54Z

Hi!

Maybe I'm not following correctly, but my vision was that a gpb consumer would explicitly drop off ByteStrings that are known to be no longer used (Return() in C# parlance). For example, we are processing a firehose of messages all the time loaded w/ ByteStrings, but after we unmarshal them, we would return those (so grpc/gpb could reuse them). Sure, that reuse will waste some here and there (as an alloc request needs 800 and the bucket it draws from the 1024 bucket), but w/ that bucket approach, most waste is addressed and the GC workload would vanish. It's a totally worthy trade imo to drop the GC workload in exchange for ByteStrings wasting part of their allocation. Basically, if we all agree a ByteString is basically a temp/scrap object, then lets treat them that way and assume ppl will extract their data and Return them to the pool if they care about performance.

Sure, I love the idea of 5888, but it seems to be a larger bolder ask of development, whereas adding a ByteString Return() call that funnels out into 6-12 ArrayPool buckets seems doable, no-risk, and poses no burden on people who don't use the enhancement.

Hopefully I'm making sense or maybe I'm missing what you're saying.

dpsenner · 2019-06-29T11:36:02Z

Would something similar to Stream.Read(byte[], int, int) work?

https://docs.microsoft.com/en-us/dotnet/api/system.io.stream.read

jtattermusch · 2019-07-08T13:30:27Z

I'm having trouble imagining some API additions for ByteString that would make it possible to manage them more efficiently:

the most efficient future way of parsing protobufs is going to be from "ReadOnlySequence" (as mentioned above), but ReadOnlySequence is an ephemeral data source and in many implementations, it will become invalid once parsing the message is done (so we cannot remember the "reference" for the input data and make ByteString read from there).
currently ByteString is not disposable, so renting an array from a pool and returning it back once not needed is problematic, because it's not clear when to return the rented array from the pool and also messages are recursive and they don't have a concept of disposing (and traversing messages to dispose all the byteStrings seems like a very clumsy approach).

Because ReadOnlySequence is ephemeral, the only approach I could see is somehow making ByteString parsing customizable (an advanced API) and optionally make ByteString only remember a Slice of the input ReadOnlySequence (which will become invalid as soon as the input ROS becomes invalid). After that, the user would need to make sure to extract the data from byteString themselves before the input ROS is invalidated. This could be useful in some high-performance scenarios, but it's sound like an api that's very specialized and hard to use, so I'm not a fan.

JamesNK · 2019-12-19T06:11:11Z

I think there are two scenarios for creating ByteString:

Deserialization via CodedInputStream/CodedInputReader. The ByteString is being created from a byte[]/ReadOnlySequence<byte> buffer. Because the buffer is owned by someone else, e.g. web server, and it is likely to be reused, we have to create a copy of the data inside the ByteString. The best way to avoid allocations here is using an array pool, but there is no way to track message lifetime. A possible way to solve this problem is to provide an array lifetime management API to the CodedInputXXX. It will handle providing arrays used by ByteStrings it creates out of a pool. Those arrays are added to a list. At the end of its lifetime scope, e.g. a web request, the lifetime management API will release all the arrays in the list.
Manually creating the ByteString for serialization via CodedOutputStream/CodedOutputWriter. In this situation the developer owns the byte array, and can choose how to create the ByteString. To avoid allocations/copies I think there should be a new way to create ByteString: ByteString.FromMemory(ReadOnlyMemory<byte> bytes). This will simply set an internal ROM<byte> field. There are existing internal methods like this on ByteString, although they are internal:

protobuf/csharp/src/Google.Protobuf/ByteString.cs

Lines 57 to 78 in 6263268

    
           /// <summary> 
        
           /// Unsafe operations that can cause IO Failure and/or other catastrophic side-effects. 
        
           /// </summary> 
        
           internal static class Unsafe 
        
           { 
        
               /// <summary> 
        
               /// Constructs a new ByteString from the given byte array. The array is 
        
               /// *not* copied, and must not be modified after this constructor is called. 
        
               /// </summary> 
        
               internal static ByteString FromBytes(byte[] bytes) 
        
               { 
        
                   return new ByteString(bytes); 
        
               } 
        
           } 
        
           /// <summary> 
        
           /// Internal use only.  Ensure that the provided array is not mutated and belongs to this instance. 
        
           /// </summary> 
        
           internal static ByteString AttachBytes(byte[] bytes) 
        
           { 
        
               return new ByteString(bytes); 
        
           }

To emphasis its unsafe nature, the name could alternatively be ByteString.Unsafe.FromMemory(ReadOnlyMemory<byte> bytes).

Adding this new way of creation would sacrifice the current safety ByteString has of it always being immutable. I don't know if this is important or not. If it is, perhaps ByteString could have a flag on it to identify whether it is an "unsafe" or not?

#1 complex to do.
#2 is pretty simple.

@davidfowl @JunTaoLuo @shirhatti

davidfowl · 2019-12-19T07:17:10Z

It's it possible to codegen a different type besides ByteString for the unsafe "optimized" variant?

kbegiedza · 2020-10-06T09:19:46Z

Is there any progress / workaround for this issue?

I belive that, as @JamesNK mentioned, new public Unsafe method should solve this problem.

cc @jtattermusch

JamesNK · 2020-10-06T09:27:33Z

I wrote a PR that allows a ByteString to be created without copying the data. It matches what is available in Java - #7645

No progress on merging it right now.

Alois-xx · 2020-11-02T09:25:49Z

When will the change arrive? For gRPC messages with large payloads (read images) a GC friendly design is a must.

kbegiedza · 2020-11-02T09:28:03Z

@Alois-xx still no ETA, you can check related PR: #7645

tactical-drone · 2020-11-17T06:02:04Z

We need this please.

…improvement for fixing memiry spikes, looks this is a known issue for bytestring. bytesring is not designed for disposeble, refer to: protocolbuffers/protobuf#4206

jtattermusch added the c# label Feb 26, 2018

JamesNK mentioned this issue Jan 6, 2020

DO NOT MERGE: add span_based_serialization design #6874

Closed

JamesNK mentioned this issue Oct 6, 2020

Change ByteString to use memory and support unsafe create without copy #7645

Merged

jtattermusch closed this as completed in #7645 Nov 18, 2020

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

C# ByteString, CodedInputStream performance improvements desired for messages with large byte array field #4206

C# ByteString, CodedInputStream performance improvements desired for messages with large byte array field #4206

d79ima commented Jan 21, 2018 •

edited by jtattermusch

Loading

jtattermusch commented Feb 26, 2018

wanton7 commented May 23, 2018

dpsenner commented Feb 16, 2019

smflorentino commented May 17, 2019

jtattermusch commented May 18, 2019

drew-512 commented Jun 26, 2019 •

edited

Loading

jtattermusch commented Jun 28, 2019

drew-512 commented Jun 28, 2019 •

edited

Loading

dpsenner commented Jun 29, 2019

jtattermusch commented Jul 8, 2019

JamesNK commented Dec 19, 2019 •

edited

Loading

davidfowl commented Dec 19, 2019

kbegiedza commented Oct 6, 2020

JamesNK commented Oct 6, 2020

Alois-xx commented Nov 2, 2020

kbegiedza commented Nov 2, 2020

tactical-drone commented Nov 17, 2020

C# ByteString, CodedInputStream performance improvements desired for messages with large byte array field #4206

C# ByteString, CodedInputStream performance improvements desired for messages with large byte array field #4206

Comments

d79ima commented Jan 21, 2018 • edited by jtattermusch Loading

jtattermusch commented Feb 26, 2018

wanton7 commented May 23, 2018

dpsenner commented Feb 16, 2019

smflorentino commented May 17, 2019

jtattermusch commented May 18, 2019

drew-512 commented Jun 26, 2019 • edited Loading

jtattermusch commented Jun 28, 2019

drew-512 commented Jun 28, 2019 • edited Loading

dpsenner commented Jun 29, 2019

jtattermusch commented Jul 8, 2019

JamesNK commented Dec 19, 2019 • edited Loading

davidfowl commented Dec 19, 2019

kbegiedza commented Oct 6, 2020

JamesNK commented Oct 6, 2020

Alois-xx commented Nov 2, 2020

kbegiedza commented Nov 2, 2020

tactical-drone commented Nov 17, 2020

d79ima commented Jan 21, 2018 •

edited by jtattermusch

Loading

drew-512 commented Jun 26, 2019 •

edited

Loading

drew-512 commented Jun 28, 2019 •

edited

Loading

JamesNK commented Dec 19, 2019 •

edited

Loading