[BigQuery] High Number of Memory Allocations when migrating from legacy insertAll to Storage Write API. #10034
That looks fine to me - I can't see anywhere that we'd end up holding on to memory there, modulo my thought below. I'd be tempted to create a new stream somewhat more often than once every 12 hours, but that's a different matter. One aspect I hadn't considered before is that your code doesn't appear to be reading from the response stream of the gRPC call (see here in the original sample code). I wonder whether that means that data is getting buffered... I don't know enough about the BigQuery Storage Write API, but that might explain things. Please could you try including some code to read from the response stream in the background, and see if that makes any difference? (If this is the problem, we'll definitely want to update the sample to emphasize it more.)
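(A minimal sketch of what "reading the responses in the background" could look like, assuming the _bigQueryWriteClientBuilder field and single-writer loop described later in this thread; names and placement are illustrative, not the library's prescribed pattern:)

var appendRowsStream = _bigQueryWriteClientBuilder.AppendRows();

// Start draining the response stream as soon as the call exists,
// so unread responses can't pile up inside gRPC.
var responseDrain = Task.Run(async () =>
{
    await foreach (var response in appendRowsStream.GetResponseStream())
    {
        // The AppendRowsResponse contents aren't needed here; reading them is the point.
    }
});

// ... WriteAsync calls happen here, as in the existing saver loop ...

await appendRowsStream.WriteCompleteAsync();
await responseDrain; // the response stream ends once the server acknowledges completion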
Hi @jskeet , Happy to do that, but I've just tried a slightly different tack: I deployed a minor change in which we fetch the default stream every time we want to do a write. The saver pulses every 20 seconds or so, so every 20 seconds it fetches the default stream, writes, and then marks it as complete. So inside the while loop we basically do:-

var appendRowsStream = _bigQueryWriteClientBuilder.AppendRows();
await appendRowsStream.WriteAsync(new AppendRowsRequest
{
    ProtoRows = protoData,
    WriteStream = _writeStreamName,
});
await appendRowsStream.WriteCompleteAsync();

I'm wondering, like you, whether the missing call to read the response stream is the culprit.

Cheers,
We try to write the data as you go - we definitely don't wait for a call to WriteCompleteAsync before sending anything. It's a shame that your test code doesn't reproduce the problem at the moment... but I'll give it a try anyway and see if I can see anything odd going on in the BufferedClientStreamWriter in terms of it retaining objects when it shouldn't, for example.
I think there is an element of time at play here; the migration to the Storage Write API clearly leaks (need to find a better word) from our monitoring tooling's POV. But it's not immediate - for example, I left the code overnight on DEMO and one of the boxes jumped at 21:00 for no apparent reason. I'm the only one in this app company-wide, so it has to be the recent changes. I'll let you know what my different tack reveals, if anything. Cheers,
One interesting idea: in your real code, do you have multiple threads calling WriteAsync on the same stream? We wouldn't see that in the sample code, as you're waiting for one write to complete before creating another one... but if your real code is more event-based, so you could have parallel threads writing and waiting, that could explain it. (When running your test code in a debugger, you could look at what the BufferedClientStreamWriter is holding on to.)
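(Purely illustrative - this is the kind of event-driven pattern being asked about, where writes are issued without awaiting the previous one, so several can be pending at once in the writer's internal buffer. Nothing here is from the real code:)

// Hypothetical event handler; appendRowsStream is a long-lived stream shared across events.
void OnStatsBatchReady(AppendRowsRequest request)
{
    // Fire-and-forget write: if these arrive faster than they complete,
    // each pending write sits queued in BufferedClientStreamWriter.
    _ = appendRowsStream.WriteAsync(request);
}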
We don't have multiple threads calling WriteAsync.

I can try and reflect over that (the BufferedClientStreamWriter) and log what it's holding.

Cheers,
Right. In your real code, do you await the task immediately, or queue it somewhere? (Even if you await it, could another job use the same thread to write again?) In the commented-out code, it does look like you're awaiting it directly, and that only that loop will write to it...
That would be really useful, thanks.
It is awaited directly, and only that loop invokes the write to BQ. Cheers,
Right - thanks for all the extra detail. I don't see how we could end up buffering anything unless there's a bug in BufferedClientStreamWriter - but that's always a possibility. It might be worth logging something each time you call WriteAsync.
So I managed to get a look at what WriteCompleteAsync actually does. So I guess that is just a signal to that instance of the stream that "I'm done with you, please clean yourself up". Cheers,
Yes, the "it's complete" part is added as an extra entry in the queue here, which is spotted in the processing here. The fact that it's never removed from the queue (because there's no continuation) may or may not be intentional - it's been 6 years since I wrote the first version of this code... but it shouldn't do any harm, and definitely wouldn't explain the high memory usage you're seeing. It could be that this queuing is a red herring, of course... but it would be nice to rule it out. Just to check my own thought process here:
For the second part of that: was that profiler running against a "real" system (sending messages) or just the early benchmark which did nothing but create the API messages? If it's the latter, it's understandable - but if you're getting a lot of ByteStrings building up in the real system (whether live or demo) then that would suggest it has to be something building up later on, rather than in your own internal queue.
Correct. See here for legacy insert and here for modern insert. I did note that this:-

new ProtoSchema
{
    ProtoDescriptor = WatchtowerRecord.Descriptor.ToProto(),
};

also caused allocations on every invocation, hence why I hoisted it up into the constructor.
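(For illustration, the hoisting described here amounts to something like the following - the constructor/class name is assumed from the WatchTowerBigQuerySaver.cs file linked in the issue description at the bottom of this thread, and _writerSchema matches the field used in the logging code later on:)

private readonly ProtoSchema _writerSchema;

public WatchTowerBigQuerySaver()
{
    // Built once here instead of per write, so the descriptor conversion
    // no longer allocates on every invocation.
    _writerSchema = new ProtoSchema
    {
        ProtoDescriptor = WatchtowerRecord.Descriptor.ToProto(),
    };
}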
Correct, and it was just an early benchmark which did nothing but create the API messages. The reason behind this is that legacy insert is well battle-tested: it has been pushing ~50k total_requests per day / ~150 million total_rows per day / ~100GB total_input_bytes per day since it was introduced. It was a simple diff of the legacy insert vs modern insert to see what actually had changed, after which I wrote a console app to reproduce and put through dotMemory. That being said, it's clear now that the benchmark doesn't capture the real behaviour. My guess is that having a long-lived default stream and writing to it over a period of time (let's say hours) without ever calling WriteCompleteAsync is what causes the build-up.
I still want to do this, but I don't want to change too much too quickly when faced with a new problem. Perhaps one more night. Cheers,
Yes, that was something I was going to suggest - although it wouldn't be doing a lot.
It might cause issues in the BigQuery Storage API, but it shouldn't cause any local memory issues - because we don't hold on to the messages locally. I don't know much about the BigQuery Storage API, but it sounds like it should be fine to fetch and write to the default stream frequently. If you wanted to move to use application streams, I suspect your "create stream, write, complete request" every 20 seconds might cause issues, and instead you'd want to keep it open for some time - but I'd probably guess at an hour being a reasonable time rather than "the whole lifetime of the process". I still really don't understand why you were getting local memory issues though... next time you're making a change, if you could add the logging as per this earlier comment, that would be good to know:
(If you're not already logging the length of your own queue, that would be good to know too.)
Hi @jskeet, Apologies, I've been AFK. I now have the additional logging (see attached zip). The logs are from our DEMO environment, which is low traffic. This is the code that I logged out:-

var sw = Stopwatch.StartNew();
try
{
sw.Restart();
var protoData = new AppendRowsRequest.Types.ProtoData
{
WriterSchema = _writerSchema,
Rows = new ProtoRows
{
SerializedRows = { list.Select(x => x.ToByteString()) },
},
};
var timeToCreateRows = sw.Elapsed.TotalMilliseconds;
sw.Restart();
var appendRowsStream = _bigQueryWriteClientBuilder.AppendRows();
var timeToFetchDefaultStream = sw.Elapsed.TotalMilliseconds;
sw.Restart();
await appendRowsStream.WriteAsync(new AppendRowsRequest
{
ProtoRows = protoData,
WriteStream = _writeStreamName,
});
var timeToWriteAsync = sw.Elapsed.TotalMilliseconds;
sw.Restart();
await appendRowsStream.WriteCompleteAsync();
var timeToWriteCompleteAsync = sw.Elapsed.TotalMilliseconds;
if (Settings.Config.IsEnabled("OtherBigQueryStatsSaverLogsOn"))
{
Log.Warn(new LogDetails($"Aggregated {length} individual saves to be {localConcurrentDictionary.Count} saves")
{
ObjectDump = new
{
count = list.Count,
duration = sw.Elapsed.TotalMilliseconds,
remaining = _concurrentQueue.Count,
beforeAggr = length,
afterAggr = localConcurrentDictionary.Count,
sizeOfRequest = list.Sum(x => x.CalculateSize()),
timeToCreateRows,
timeToFetchDefaultStream,
timeToWriteAsync,
timeToWriteCompleteAsync
}
});
}
}

Cheers,
Thanks, I'll have a look at the logs tomorrow.
There aren't many; I'll need to rinse and repeat on production either later on tonight or first thing in the morning. Unfortunately, because it logs a lot on production, it's not something I can leave on unattended.
Thanks for those logs. The log entry at 2023-03-15 17:19:25.312785, which seems to take 20 seconds, is slightly worrying - but I wonder whether that was while the server was busy with other initialization tasks. I'm glad you mentioned the load testing for the period where "remaining" didn't stay at 0. But otherwise, this doesn't seem to show anything particularly concerning. Given the short timeframe, I assume you didn't spot any memory issues during this run either? Will look forward to further logs - sorry that it's taking a while to get to the bottom of this.
Hi @jskeet , I'm not sure I understand where you are getting 20 seconds from? All those times are in milliseconds (e.g. that entry is ~20ms, not 20 seconds). I've switched on the logs for prod. The good news is I don't see any memory issues right now, so we might have fixed it by fetching the default stream and completing it around a write. Cheers,
Ah it's no big deal fella, it's being investigated and it's not uncommon for me to spend a few days on platform issues like this. It's actually pretty fun! 👍🏽 Cheers,
Ah, whoops - for some reason I was assuming seconds. 20ms really shouldn't be a problem :) Glad to hear completing the write looks like it may have fixed things. As I mentioned before, I don't know whether it's really appropriate to stream just a single request each time - but it does sound like you're well within the limits. You might want to consider streaming for somewhat longer (e.g. 10 minutes or an hour) but this isn't an area of expertise for me. (I still don't understand why it would change the memory usage though....)
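(A sketch of the "stream for somewhat longer" idea: keep one stream open for a bounded period and complete/recreate it when that period expires. The ten-minute lifetime and member names are arbitrary, the AppendRowsStream type name is the generated one, and response consumption is omitted for brevity even though - as the rest of this thread establishes - it is still needed:)

private BigQueryWriteClient.AppendRowsStream _stream;
private DateTime _streamCreatedUtc;
private static readonly TimeSpan StreamLifetime = TimeSpan.FromMinutes(10);

private async Task<BigQueryWriteClient.AppendRowsStream> GetStreamAsync()
{
    if (_stream != null && DateTime.UtcNow - _streamCreatedUtc > StreamLifetime)
    {
        // Finish the old stream before replacing it.
        await _stream.WriteCompleteAsync();
        _stream = null;
    }
    if (_stream == null)
    {
        _stream = _bigQueryWriteClientBuilder.AppendRows();
        _streamCreatedUtc = DateTime.UtcNow;
    }
    return _stream;
}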
I still think there is a very gentle incline in memory consumption. It's hard to see over a day, but if we graph it over two weeks you can see the upward slopes. You can ignore the spikes entirely; those are post-deploy spikes which I think are to do with .NET 6 itself, and they recover after a period of time. The real test will be the weekend on our DEMO environment. You can see that 2023-03-10 through to 2023-03-13 is flat. And after 2023-03-13 is when I started the migration to the Storage Write API. Cheers,
I've commandeered a DEMO box and I'm going to let dotnet-counters run for an hour or so, to see if that will tell us anything useful. https://learn.microsoft.com/en-us/dotnet/core/diagnostics/dotnet-counters#examples
So that googleapis/google-cloud-dotnet#10034 is easier to diagnose
Not sure how useful this is going to be, but at least it gives us a baseline.
I rolled out the reflection code to log out what the BufferedClientStreamWriter is holding. Cheers,
This seems to "leak" - I use that word very loosely, as it is more than likely going onto Gen0. For googleapis/google-cloud-dotnet#10034
So I can get this to happen:-

0893/5000 35,939 kb
0894/5000 35,969 kb
0895/5000 35,998 kb
0896/5000 36,014 kb
0897/5000 36,058 kb
0898/5000 36,092 kb
0899/5000 36,098 kb
0900/5000 36,149 kb
0901/5000 36,169 kb
0902/5000 36,211 kb
0903/5000 36,237 kb
0904/5000 36,271 kb
0905/5000 36,287 kb
0906/5000 36,339 kb
0907/5000 36,373 kb
0908/5000 36,399 kb
0909/5000 36,425 kb
0910/5000 36,458 kb
0911/5000 36,466 kb
0912/5000 36,518 kb
0913/5000 36,555 kb
0914/5000 36,586 kb
0915/5000 36,620 kb
0916/5000 36,646 kb
0917/5000 36,671 kb
0918/5000 36,696 kb
0919/5000 36,768 kb
0920/5000 36,786 kb
0921/5000 36,837 kb
0922/5000 36,871 kb
0923/5000 36,905 kb
0924/5000 36,931 kb
0925/5000 36,963 kb
0926/5000 36,970 kb
0927/5000 37,012 kb
0928/5000 37,056 kb
0929/5000 37,064 kb

With this commit. But I think that is sort of expected, as there is a lot of byte[] allocation churn when adding to SerializedRows via ToByteString(). Cheers,
Absolutely! I'm very much a fan of "one change at a time" :) Btw, GAX doesn't use MemoryStream at all. Google.Protobuf does, but only in ByteString.FromStream/FromStreamAsync, which I don't think will be in use here. Admittedly my MemoryStream theory is purely speculative at the moment anyway :)
So while we wait for hours to pass by, I'm poking about the Google.Protobuf src and looking for factors of 131072. And I can only really find two places, neither of which is multiplied at all from what I can see.
Poking around the leaky box memory dump using dotnet-dump, I have no idea if any of that is relevant, or if it is just confirmation bias. But do we expect BatchContextSafeHandle objects to still be hanging around?
I'm on the edge of my knowledge here, so take all of this with a truckload of salt. But I find it sus that it finds 802 handles, and then can't find them. Weird.
Me either! Weirdly enough, I can't even find that (or CallSafeHandle) in the new code. They're both here though for 2.46: https://github.com/grpc/grpc/tree/v1.46.x/src/csharp/Grpc.Core/Internal Just to check - was this memory dump taken when using Grpc.Core, or Grpc.Net.Client (the default)? (Don't be fooled by the name of Grpc.Core.Internal - that can be seen with both implementations.)
The 400 -> 200 batch size change was deployed to our DEMO environment from 16:15. The slow leak is still there; it doesn't look like it has changed at all. I've no idea why the shapes - I'm calling them sawtooths - have elongated after 18:00.
Yes, I've made this change, and only enabled it on one box. So I'm leaving 104 as completely disabled, 103 is consuming the response, 102 & 101 are left as is.
Taken with Grpc.Core. Cheers,
Just to check, when you added the Grpc.Core reference, you did also set the GrpcAdapter when building the client, right?
Yes.
Totally missed this - this sounds like a great thing to try tomorrow. Cheers,
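(For reference, selecting the Grpc.Core transport is done on the client builder. A sketch, assuming the GAX GrpcCoreAdapter and the Google.Cloud.BigQuery.Storage.V1 builder; exact package/namespace details aren't shown here:)

// The adapter has to be chosen when the client is built;
// without it the builder defaults to Grpc.Net.Client.
var client = await new BigQueryWriteClientBuilder
{
    GrpcAdapter = GrpcCoreAdapter.Instance,
}.BuildAsync();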
I think we have a winner; 101 and 102 (left as normal) are still slowly climbing, while 103 (consuming the response stream) has stayed flat. Next steps are to bump back up to chunks of 400 and roll out response consumption to all boxes. It's rather black box at the moment; I'd love to know where this slow leak is actually occurring. Cheers,
Okay, so presumably our working theory is that if we don't consume the responses, then in the real implementation they gradually build up and are never garbage collected. In your fake implementation presumably the response never actually gets populated, or is eligible for garbage collection. I suspect it should be possible to repro this with just a local gRPC client and server (with a simple streaming API definition) - but that's definitely not something I'd expect you to look into. If just consuming the gRPC response works, I'd stick with that :) I can ask my colleagues in the gRPC team to take a look.
No, as far as I can tell the response is there and can be read. But the main factor here is time; none of the benchmarks we ran have been over six hours. I would like to fire and forget - I don't necessarily care about the response in this circumstance. It's been interesting! Cheers,
Oh, I just noticed that what GetResponseStream() returns is IAsyncDisposable. So currently I'm doing this to consume the response stream:-

await foreach (var appendRowsResponse in appendRowsStream.GetResponseStream())
{
}

I'm wondering if I should actually be doing this?

await using (var asyncResponseStream = appendRowsStream.GetResponseStream())
{
    await foreach (var appendRowsResponse in asyncResponseStream)
    {
    }
}

I'm also wondering if the simple act of foreach'ing over it is implicitly disposing of it?
Hmm... I don't think await foreach is actually disposing of it, but I suspect that iterating over everything has the same effect. The latter is probably the better approach until I can confirm for sure that the former is okay... although I'd really hope it is, just from a usability perspective.
That's very good to see :) I'm planning on trying to reproduce the problem with a local server and local client next week - will you have graphs from over the weekend just to confirm that it's stayed this way?
Yeah mate, we have graphs going back four years 😄 I'm confident that this is the fix as it normally only required 4-6 hours for the memory baseline to start drifting. Cheers for your help 👍🏽,
Morning - this is the weekend's graph. You can see the slow ramp at the very beginning, but once the response consumption fix was deployed the slow ramp was eliminated. There is a lot of churn in Gen0 with all the byte arrays, so it is still more spiky than insertAll. But the leak is fixed as far as I can tell! 👌 Cheers,
Excellent! I'm glad that the immediate issue is fixed - now we're a little clearer about it, I'll work on reproducing it for the gRPC team.
I've managed to reproduce this remarkably easily. I'm talking with the gRPC team about what the next steps are - but fundamentally, reading the responses is worth doing. Disposing of the underlying gRPC call prevents the leak, but appears to be "unfriendly" to the connection in some way. I'd recommend code like this:

var stream = client.AppendRows();
// With this in place, the call will be disposed even if something fails.
using var grpcCall = stream.GrpcCall;
// Make your writes here
await stream.WriteAsync(request);
await stream.WriteCompleteAsync();
// And read the responses at the end
await using var responses = stream.GetResponseStream();
await foreach (var response in responses) {}

That's making the assumption that writes can succeed without previous responses having been read. I expect that's the case, but I don't have concrete evidence for that.
I think perhaps this is dependent on the use case; in this case (and actually all cases where we use BQ and gRPC) we don't care about the response at all. It is very much fire and forget.

await using var responses = stream.GetResponseStream();
await foreach (var response in responses) {}

I'm fine to do that, but maybe the docs need to be updated to explicitly warn about what happens if response consumption is absent? Alternatively, if I could configure the BQ client to be "fire-and-forget", that would be ideal too.
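(One way to approximate fire-and-forget today - a sketch rather than an official API - is to hand the draining and disposal off to a background task so the write path doesn't wait on it:)

// Sketch only: drain and dispose in the background since the responses aren't needed.
// NB: exceptions thrown here are unobserved; real code would want to log them.
_ = Task.Run(async () =>
{
    using var grpcCall = appendRowsStream.GrpcCall;
    await foreach (var response in appendRowsStream.GetResponseStream())
    {
        // Intentionally ignored.
    }
});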
We'll absolutely be doing some combination of documentation and code changes. Ideally gRPC would be able to clean up in the background aside from anything else, but at the very least we can make the type returned by AppendRows easier to clean up correctly.
Yeah, I gotta admit I was of the opinion that gRPC would clean itself up in the background after the method invocation had finished. Ah well. Do you mind tagging me when the documentation is updated? No rush, just keen to understand more about how we should be using BQ gRPC calls.
Sure. It may well be in multiple stages though, as each individual component or piece of documentation is updated.
Hmmm, the mystery continues! It's been rock solid for a few days, and I'm just checking the graphs post-easter-eggs and the slow growth has re-appeared. To be clear, the code around consuming the response and disposal is still in and enabled. I'll dig into it more post-lamb :) Hope you are having a nice Easter. Cheers,
Ahhh, it seems like BQ was "down" and the exception path in that scenario somehow leaks? We have a few logs on our DEMO environment that seem to line up at first glance. I'm used to the legacy insertAll method logging a lot and timing out (5 seconds). The gRPC stuff has been relatively rock solid and highly performant thus far. I've attached the raw logs. Looks like all the exceptions fall into one of the following seven:-

"Grpc.Core.RpcException: Status(StatusCode=""Aborted"", Detail=""Closing the stream because server is restarted. This is expected and client is advised to reconnect. Entity: projects/indy-test-watchtower/datasets/DEMO_watchtower/tables/watchtower_v2/streams/_default"")
"Grpc.Core.RpcException: Status(StatusCode=""Cancelled"", Detail=""Call canceled by the client."")
"Grpc.Core.RpcException: Status(StatusCode=""Cancelled"", Detail=""CANCELLED"", DebugException=""Grpc.Core.Internal.CoreErrorDetailException: ""CANCELLED"""")
"Grpc.Core.RpcException: Status(StatusCode=""Internal"", Detail=""Internal error encountered. Entity: projects/indy-test-watchtower/datasets/DEMO_watchtower/tables/watchtower_v2/streams/_default"")
"Grpc.Core.RpcException: Status(StatusCode=""Unavailable"", Detail=""502:Bad Gateway"")
"Grpc.Core.RpcException: Status(StatusCode=""Unavailable"", Detail=""Error starting gRPC call. HttpRequestException: An error occurred while sending the request. IOException: The request was aborted. IOException: Unable to read data from the transport connection: An existing connection was forcibly closed by the remote host.. SocketException: An existing connection was forcibly closed by the remote host."", DebugException=""System.Net.Http.HttpRequestException: An error occurred while sending the request.
"Grpc.Core.RpcException: Status(StatusCode=""Unavailable"", Detail=""Request 'AppendRows' from role 'cloud-dataengine-globalrouting' throttled: Task is overloaded (memory-protection) go/tr-t."") I guess in the case of any Cheers, |
I wouldn't expect that to help, although I guess it could. Do you definitely have the code that disposes the underlying gRPC call, even if there's an error while reading the responses? This line of code should do it:

using var grpcCall = stream.GrpcCall;

If you haven't already got that, I'd add it. (We'll look at making this simpler in future.) If you do already have that, I don't know where the leak has come from...
No, we don't. I've added it now 👍🏽 This is what we have post-add:-

using var cancellationTokenSource = new CancellationTokenSource(TimeSpan.FromSeconds(5));
var appendRowsStream = _bigQueryWriteClientBuilder.AppendRows(CallSettings.FromCancellationToken(cancellationTokenSource.Token));
// as per https://github.com/googleapis/google-cloud-dotnet/issues/10034#issuecomment-1502784359
using var grpcCall = appendRowsStream.GrpcCall;
var sizes = new List<int>();
// ToByteString creates a lot of memory churn on Gen0
// The average watchtower record, when serialized to protobuf, is around 162.4876265161838294 bytes (sample size of 115215),
// so we now chunk over the list in chunks of 400, which puts us at around 64,800 bytes for the list in its entirety.
// previously the average list size is 453047.083047893235 bytes (sample size of 115319)
// which was going onto the LOH
foreach (var records in list.Chunk(400))
{
sizes.Add(records.Sum(x => x.CalculateSize()));
var protoData = new AppendRowsRequest.Types.ProtoData
{
WriterSchema = _writerSchema,
Rows = new ProtoRows
{
SerializedRows = { records.Select(x => x.ToByteString()) },
},
};
await appendRowsStream.WriteAsync(new AppendRowsRequest
{
ProtoRows = protoData,
WriteStream = _writeStreamName,
}).ConfigureAwait(false);
}
await appendRowsStream.WriteCompleteAsync().ConfigureAwait(false);
// this is disposable and recommended here: https://github.com/googleapis/google-cloud-dotnet/issues/10034#issuecomment-1491902289
await using (var asyncResponseStream = appendRowsStream.GetResponseStream())
{
// need to consume the response even if we don't do anything with it
// other slow ass memory leak: https://github.com/googleapis/google-cloud-dotnet/issues/10034#issuecomment-1491449182
await foreach (var appendRowsResponse in asyncResponseStream)
{
}
}

Cheers,
Environment details
Microsoft Windows Server 2019 Standard 10.0.17763 N/A Build 17763
<TargetFramework>net6.0</TargetFramework>
Steps to reproduce
Additional context: protocolbuffers/protobuf#12217
We have recently migrated from the legacy insertAll to the Storage Write API and we've found a marked increase in memory (see screenshot in link above). The migration is as close to 1:1 as it can be.
I've diff'ed the methods ExpensiveStreamingInsert and StorageWriteApi from this file https://github.com/indy-singh/bigquery_storage_write_api_high_memory_allocation/blob/c3b50881b61ccaa8fddfea01945d7f67563cab15/bigquery_storage_write_api_high_memory_allocation/WatchTowerBigQuerySaver.cs (this is the actual production code) and nothing looks super suspicious. Online diff: https://www.diffchecker.com/gAJLZUKz/

Which leads me to believe we are using the Storage Write APIs incorrectly (at construction).
We basically keep one AppendRowsStream open/alive for the duration of the application lifetime (which is typically 12+ hours). Is this incorrect? https://cloud.google.com/bigquery/docs/write-api-best-practices seems to imply what we are doing is correct.
Thanks for any help.
Cheers,
Indy