"engine: cache-max-memory-size exceeded" when using WriteApiAsync #164

Closed
digital-spinner opened this issue Feb 18, 2021 · 17 comments
Labels
wontfix This will not be worked on

Comments

@digital-spinner

digital-spinner commented Feb 18, 2021

Steps to reproduce:

  1. Download and run the provided example project (Visual Studio 2019 solution) on Windows 10 x64 1909 or newer, in debug or release mode.
    InfluxSimpleLoadTest.zip
  2. While it is running, observe the Windows system RAM usage.

Expected behavior:
RAM usage should not grow indefinitely, and the memory should be released after the app shuts down.

Actual behavior:
Windows RAM usage grows without bound until the app crashes.
When a new writeApiAsync is obtained before each point write (see the commented-out piece of code), RAM usage still grows without bound, but the app does not crash and everything keeps running until Windows hangs because it runs out of RAM.

Unhandled exception. InfluxDB.Client.Core.Exceptions.HttpException: unexpected error writing points to database: engine: cache-max-memory-size exceeded: (1075474000/1073741824)
   at InfluxDB.Client.Api.Service.WriteService.PostWriteAsyncWithIRestResponse(String org, String bucket, Byte[] body, String zapTraceSpan, String contentEncoding, String contentType, Nullable`1 contentLength, String accept, String orgID, Nullable`1 precision)
   at InfluxDB.Client.Api.Service.WriteService.PostWriteAsyncWithHttpInfo(String org, String bucket, Byte[] body, String zapTraceSpan, String contentEncoding, String contentType, Nullable`1 contentLength, String accept, String orgID, Nullable`1 precision)
   at InfluxDB.Client.Api.Service.WriteService.PostWriteAsync(String org, String bucket, Byte[] body, String zapTraceSpan, String contentEncoding, String contentType, Nullable`1 contentLength, String accept, String orgID, Nullable`1 precision)
   at InfluxDB.Client.WriteApiAsync.WriteData(String org, String bucket, WritePrecision precision, IEnumerable`1 data)
   at InfluxDB.Client.WriteApiAsync.WritePointsAsync(String bucket, String org, List`1 points)
   at Examples.Program.WriteBatchAsync(Int32 taskNo, Int32 pointsToWrite) in C:\Users\Test\Desktop\InfluxSimpleLoadTest\InfluxSimpleLoadTest\Program.cs:line 48
   at Examples.Program.Main(String[] args) in C:\Users\Test\Desktop\InfluxSimpleLoadTest\InfluxSimpleLoadTest\Program.cs:line 33
   at Examples.Program.<Main>(String[] args)

Specifications:

  • Client Version: 1.15.0
  • InfluxDB Version: 2.0.3
  • Platform: Windows 10 x64 for client, docker @ Ubuntu 20.04.2 - InfluxDB server
@bednar
Contributor

bednar commented Feb 18, 2021

Hi @digital-spinner,

thanks for using our client.

Did you try to reuse the writeApiAsync across Tasks? A better approach would be to create one instance of the InfluxDB client and one WriteApiAsync for all Tasks.

Regards

@digital-spinner
Author

Thank you for your concern and answer!

I have tried using one client and one writeApiAsync per task, as in the example I attached, and that results in the exception I posted.

When I instead get a new WriteApiAsync before each WritePointsAsync(...) call, there is no exception, but RAM usage grows to the point where Windows can no longer run. The RAM usage of the example console app itself stays at a stable level the whole time it is running.

In my example you can set the parallelTask variable to just 1 and the result is the same as when using multiple tasks.

@bednar
Contributor

bednar commented Feb 19, 2021

The problem could be caused by inefficient use of the PointData structure. PointData is immutable, so every new field or tag causes the whole structure to be copied.

I've used a version that builds the LineProtocol directly:

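// Builds one large line-protocol record (100 integer, 100 float, and 100 string fields) and writes it.
// `bucket`, `org`, and GenerateRandomString are expected to be defined in the surrounding test program.
private static Task WriteSomethingRecord(WriteApiAsync writeApiAsync, Random randomGenerator)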
{
    var lineProtocol = new StringBuilder();
    lineProtocol.Append("measurement,tagName=tagValue ");

    int i = 1;

    for (; i <= 100; i++)
    {
        lineProtocol.Append($"something_{i:D4}={randomGenerator.Next()}i,");
    }

    for (; i <= 200; i++)
    {
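        // Format with the invariant culture (using System.Globalization) so the decimal
        // separator is always '.', regardless of the machine's locale.
        lineProtocol.Append($"something_{i:D4}={randomGenerator.NextDouble().ToString(CultureInfo.InvariantCulture)},");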
    }

    for (; i <= 300; i++)
    {
        lineProtocol.Append($"something_{i:D4}=");
        lineProtocol.Append('"');
        foreach (var c in GenerateRandomString(randomGenerator, 64 * 1024))
        {
            switch (c)
            {
                case '\\':
                case '\"':
                    lineProtocol.Append("\\");
                    break;
            }

            lineProtocol.Append(c);
        }
        lineProtocol.Append('"');
        lineProtocol.Append(',');
    }
    
    // remove last ','
    lineProtocol.Remove(lineProtocol.Length - 1, 1);
    lineProtocol.Append(' ');
    
    var epoch = new DateTime(1970, 1, 1, 0, 0, 0, DateTimeKind.Utc);
    lineProtocol.Append((BigInteger) (DateTime.UtcNow - epoch).TotalMilliseconds);
    
    return writeApiAsync.WriteRecordAsync(bucket, org, WritePrecision.Ms, lineProtocol.ToString());
}

and with it the RAM usage is stable. I don't have access to a Windows machine today, so I ran the test on macOS. Could you test it with LineProtocol?
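To make the copying cost of an immutable structure concrete, here is a minimal sketch that counts how many entries get copied while 300 fields are added one at a time. `ImmutablePointSketch` and `CopyCostDemo` are hypothetical stand-ins written for this illustration; they are not the client's `PointData` implementation, whose internals may differ.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for an immutable point type (NOT the real PointData).
public sealed class ImmutablePointSketch
{
    // Running total of field entries copied by all Field() calls (illustration only).
    public static long CopiedEntries;

    private readonly Dictionary<string, object> _fields;

    public ImmutablePointSketch() : this(new Dictionary<string, object>()) { }

    private ImmutablePointSketch(Dictionary<string, object> fields) => _fields = fields;

    public int FieldCount => _fields.Count;

    // Returns a new instance; every existing entry is copied before the new one is added.
    public ImmutablePointSketch Field(string name, object value)
    {
        CopiedEntries += _fields.Count;
        var copy = new Dictionary<string, object>(_fields) { [name] = value };
        return new ImmutablePointSketch(copy);
    }
}

public static class CopyCostDemo
{
    // Adds fieldCount fields one by one and returns how many entries were copied in total.
    public static long CountCopies(int fieldCount)
    {
        ImmutablePointSketch.CopiedEntries = 0;
        var point = new ImmutablePointSketch();
        for (var i = 1; i <= fieldCount; i++)
        {
            point = point.Field($"something_{i:D4}", i);
        }
        return ImmutablePointSketch.CopiedEntries;
    }

    public static void Main()
    {
        // 0 + 1 + ... + 299 = 44850 copied entries for 300 fields
        Console.WriteLine(CountCopies(300));
    }
}
```

Under this assumed copy-on-write model the work grows quadratically with the number of fields, whereas appending to a single StringBuilder grows roughly linearly.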

@digital-spinner
Author

digital-spinner commented Feb 19, 2021

Thank you. I have tested the line protocol implementation above, and the issue persists. Please take a look at the attached screenshots.
I can't find what is actually eating up the RAM in the system, even using Process Explorer - hmmm, maybe I don't know how to do it properly. The RAM usage of the app itself stays at a stable level.

image

image

After shutting down the test application and Visual Studio, the system needs to be restarted because RAM usage does not return to its previous level.

image

image

@digital-spinner
Author

I have also been testing the same examples with Linux (Ubuntu 20.04) as the client OS inside a VMware VM. RAM usage is not a problem there, but the same exception still occurs at random. I had to wait for it, though: it appeared after about 30 GB of data had been transferred through the network interface.

@bednar
Contributor

bednar commented Feb 22, 2021

The exception is thrown by InfluxDB. Does your hardware meet the sizing guidelines? https://docs.influxdata.com/influxdb/v1.8/guides/hardware_sizing/
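For reference, the two numbers in the reported message, `(1075474000/1073741824)`, read as the current cache size against the configured limit. That they are byte counts is an interpretation based on the `cache-max-memory-size` name and is not confirmed in this thread. A small sketch that extracts them; `CacheErrorParser` is a hypothetical helper, not part of InfluxDB.Client:

```csharp
using System;
using System.Globalization;
using System.Text.RegularExpressions;

// Hypothetical helper for illustration; not part of InfluxDB.Client.
public static class CacheErrorParser
{
    private static readonly Regex Pattern =
        new Regex(@"cache-max-memory-size exceeded: \((\d+)/(\d+)\)");

    // Returns true and fills `used` and `limit` when the message contains the pattern.
    public static bool TryParse(string message, out long used, out long limit)
    {
        used = 0;
        limit = 0;
        var match = Pattern.Match(message ?? string.Empty);
        if (!match.Success)
        {
            return false;
        }
        return long.TryParse(match.Groups[1].Value, NumberStyles.None,
                   CultureInfo.InvariantCulture, out used)
               && long.TryParse(match.Groups[2].Value, NumberStyles.None,
                   CultureInfo.InvariantCulture, out limit);
    }

    public static void Main()
    {
        const string message =
            "unexpected error writing points to database: " +
            "engine: cache-max-memory-size exceeded: (1075474000/1073741824)";

        if (TryParse(message, out var used, out var limit))
        {
            // 1073741824 = 1024 * 1024 * 1024
            Console.WriteLine($"limit = {limit}, over by {used - limit}");
            // prints: limit = 1073741824, over by 1732176
        }
    }
}
```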

@digital-spinner
Author

Thank you for your answer. I now realize that my bug report actually contains two separate issues.

About the exception: I now understand what is going on and perhaps how to handle it properly, though I will need to do more tests and research.

About the RAM usage (a leak?) on Windows: can you confirm that the issue is real, as I described?
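One possible way to handle the server-side exception is to back off and retry, which may give InfluxDB time to flush its cache. This is a sketch under assumptions, not advice from the maintainers: `WriteRetry.RetryOnCacheFullAsync` is a hypothetical helper, the substring match mirrors the exception text from the report, and the delays are arbitrary. The `write` delegate would wrap the real call, for example `() => writeApiAsync.WritePointsAsync(bucket, org, points)`; here a simulated writer stands in for it.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical helper for illustration; not part of InfluxDB.Client.
public static class WriteRetry
{
    // Retries `write` while it fails with the cache-full message, doubling the delay
    // after each failure. Returns the number of attempts that were needed.
    // Any other exception, or a failure on the last attempt, is rethrown to the caller.
    public static async Task<int> RetryOnCacheFullAsync(
        Func<Task> write, int maxAttempts, TimeSpan initialDelay)
    {
        var delay = initialDelay;
        for (var attempt = 1; ; attempt++)
        {
            try
            {
                await write();
                return attempt;
            }
            catch (Exception e) when (
                attempt < maxAttempts &&
                e.Message.Contains("cache-max-memory-size exceeded"))
            {
                await Task.Delay(delay);
                delay = TimeSpan.FromTicks(delay.Ticks * 2);
            }
        }
    }

    public static async Task Main()
    {
        var calls = 0;

        // Simulated writer: fails twice with the reported message, then succeeds.
        Func<Task> fakeWrite = () =>
        {
            calls++;
            if (calls <= 2)
            {
                throw new InvalidOperationException(
                    "engine: cache-max-memory-size exceeded: (1075474000/1073741824)");
            }
            return Task.CompletedTask;
        };

        var attempts = await RetryOnCacheFullAsync(
            fakeWrite, maxAttempts: 5, initialDelay: TimeSpan.FromMilliseconds(10));
        Console.WriteLine(attempts); // prints 3
    }
}
```

Retrying only treats the symptom; if the error keeps recurring, the write rate or the server sizing is the more likely thing to revisit.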

@bednar
Contributor

bednar commented Feb 24, 2021

Hi @digital-spinner,

Today I had enough time to investigate your issue on the Windows platform, and I got the same results: RAM usage is too high.

The RAM usage correlates with the size of the LineProtocol: if we reduce the number of string fields, the usage is lower.

Results

visualstudio2

visualstudio3

Testing code

Program.cs.txt

Regards

@digital-spinner
Author

digital-spinner commented Feb 24, 2021

Thank you! But did you notice that the system never recovers to its base state (as it was before running the testing code)? I mean that it does not release the used RAM. In my case only a reboot of the Windows machine solves the issue.

@bednar
Contributor

bednar commented Feb 25, 2021

It looks like the problem is caused by high usage of the Non-Paged Pool:

RamMap

but the InfluxSimpleLoadTest.exe process itself doesn't consume it:

ProcessExplorer

TaskManager

Could you check it?

@digital-spinner
Author

digital-spinner commented Mar 1, 2021

Yes, this is the behavior I'm experiencing, and the RAM can be released only by rebooting Windows.

Moreover, there is one strange thing. When I run this code from the Rider IDE as the first thing after a system reboot, I can't observe the RAM usage problem, but when I run it even from the console with 'dotnet run --configuration Release', the issue reappears. Of course, the same happens when running from Visual Studio.

Rider:
01 - RUN FROM RIDER

Console:
02 - RUN FROM CONSOLE

So now I'm scratching my head even more.

@digital-spinner
Author

It looks like wdnf (Windows Defender Network Filter) is causing the issue while InfluxDB.Client is running.

WDNF-STOP

@bednar
Contributor

bednar commented Mar 2, 2021

@digital-spinner interesting news... Have you tried disabling the Windows Defender Network Inspection Service?

@digital-spinner
Author

digital-spinner commented Mar 2, 2021

@bednar yes, disabling Windows Defender helps. The options I disabled are shown on the left side of the screenshot. The firewall is still enabled, as it is by default.

DEFENDER-OFF

@bednar
Contributor

bednar commented Mar 2, 2021

Currently, I don't know how we could bypass this protection and scanning. Maybe you could create a custom rule for your Windows Defender.

@digital-spinner
Author

OK, thank you for your answer. I may report the bug to the Windows dev team then. IMO Windows Defender should not behave like this. Thank you for your support!

@bednar
Contributor

bednar commented Mar 3, 2021

Thanks for your cooperation 👍
