Memory is not deallocating after using gdal translate in python #7908

Open
mtobby opened this issue Jun 6, 2023 · 5 comments

Comments

@mtobby

mtobby commented Jun 6, 2023

I was testing the gdal.Translate functionality in Python using memory_profiler.
When I look at the profiler output, I see that the memory allocated during the call isn't released at all.
This is an example of the code I used:

from osgeo import gdal
from memory_profiler import profile

gdal.AllRegister()
gdal.UseExceptions()

@profile
def translate():
    ds = gdal.Open("input.tif")
    ds2 = gdal.Translate(
        "output" + str(0) + ".tif", ds, options="-co NUM_THREADS=ALL_CPUS"
    )
    ds2 = None
    ds = None

And I get the following result from the profiler:

Line #    Mem usage    Increment  Occurrences   Line Contents
=============================================================
    20     88.2 MiB     88.2 MiB           1   @profile
    21                                         def translate():
    22     88.2 MiB      0.0 MiB           1       ds = gdal.Open("input.tif")
    23    230.4 MiB    142.2 MiB           2       ds2 = gdal.Translate(
    24     88.2 MiB      0.0 MiB           1           "output" + str(0) + ".tif", ds, options="-co NUM_THREADS=ALL_CPUS"
    25                                             )
    26    230.4 MiB      0.0 MiB           1       ds2 = None
    27    230.4 MiB      0.0 MiB           1       ds = None

It looks like the function started with a memory usage of about 88 MiB and ended with a usage of 230.4 MiB.
Is this expected?
I am using GDAL 3.4.1.

Thanks in advance,
Tobby

@rouault
Member

rouault commented Jun 6, 2023

GDAL doesn't generally leak memory. If you put your code in a loop, you should hopefully see that RAM usage remains stable at the max value you've noted. It is probably caches kept around. Typically the worker threads are kept in a global pool, and each thread will take at least 2 MB.
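
One way to run the loop check described above is to record the process's resident memory after each call and see whether it levels off. The following is a minimal sketch, not a definitive test: the helper names (`current_rss_mib`, `rss_per_iteration`, `translate_once`) are hypothetical, and reading `/proc/self/statm` assumes a Linux system. The file names and creation option are taken from the original report.

```python
import os


def current_rss_mib():
    """Resident set size of this process in MiB (assumes Linux /proc)."""
    with open("/proc/self/statm") as f:
        resident_pages = int(f.read().split()[1])
    return resident_pages * os.sysconf("SC_PAGE_SIZE") / (1024 * 1024)


def rss_per_iteration(work, runs=5, read_rss=current_rss_mib):
    """Call work(i) `runs` times and record the RSS after each call."""
    samples = []
    for i in range(runs):
        work(i)
        samples.append(read_rss())
    return samples


def translate_once(i):
    """One conversion, mirroring the code from the report."""
    # Imported lazily so the helpers above work without GDAL installed.
    from osgeo import gdal

    gdal.UseExceptions()
    src = gdal.Open("input.tif")
    out = gdal.Translate(
        f"output{i}.tif", src, options="-co NUM_THREADS=ALL_CPUS"
    )
    # Dropping the references closes the datasets.
    out = None
    src = None
```

Calling `rss_per_iteration(translate_once)` returns one reading per run. If the readings rise on the first run and then stay roughly flat, that matches retained caches and thread pools rather than a leak.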

@IdanAviv89
Contributor

IdanAviv89 commented Jun 6, 2023

Is there a way to free this memory?
Something like a "gdal close" that shuts down all of GDAL without needing to kill the entire program.
The problem I noticed is that after converting a big raster, the memory usage increases significantly.

@rouault
Member

rouault commented Jun 6, 2023

Is there a way to free this memory ?

Not really. Some of it might be due to RAM fragmentation, cf. https://gdal.org/user/multithreading.html#ram-fragmentation-and-multi-threading (although I don't think this is the case here).
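
A general operating-system-level workaround, not something proposed in this thread, is to run the conversion in a short-lived child process: when that process exits, all of its memory goes back to the OS while the parent program keeps running. The sketch below only illustrates that idea. `run_in_child` and `TRANSLATE_SNIPPET` are hypothetical names, and the snippet assumes the `osgeo` bindings are importable by the same interpreter.

```python
import subprocess
import sys

# Code executed by the child interpreter. With `python -c`, sys.argv[0] is
# "-c", so the input and output paths arrive as sys.argv[1] and sys.argv[2].
TRANSLATE_SNIPPET = """
import sys
from osgeo import gdal

gdal.UseExceptions()
src = gdal.Open(sys.argv[1])
out = gdal.Translate(sys.argv[2], src, options="-co NUM_THREADS=ALL_CPUS")
out = None
src = None
"""


def run_in_child(snippet, *args):
    """Run a Python snippet in a fresh interpreter and return its exit code."""
    completed = subprocess.run([sys.executable, "-c", snippet, *args])
    return completed.returncode
```

For example, `run_in_child(TRANSLATE_SNIPPET, "input.tif", "output0.tif")` returns 0 on success. The trade-off is the start-up cost of a new interpreter for each conversion.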

@mtobby
Author

mtobby commented Jun 6, 2023

GDAL doesn't generally leak memory. If you put your code in a loop, you should hopefully see that RAM usage remains stable at the max value you've noted. It is probably caches kept around. Typically the worker threads are kept in a global pool, and each thread will take at least 2 MB.

I tested the translate call in a loop and you are right: the memory usage is stable.
As @IdanAviv89 said, this might be a problem with handling large rasters.

@rouault
Member

rouault commented Jun 6, 2023

Like @IdanAviv89 said this might be a problem with handling large raster.

The peak memory usage should be around the size defined by GDAL_CACHEMAX (by default 5% of usable RAM), cf. https://gdal.org/user/configoptions.html. So this is user-controllable. You can generally manipulate rasters much larger than memory with GDAL.
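
As a minimal sketch of adjusting that limit from Python, assuming the `osgeo` bindings are installed: `gdal.SetCacheMax` takes the block cache size in bytes and `gdal.GetCacheMax` reads it back. The helper names `cache_limit_bytes` and `limit_block_cache` are hypothetical, and any specific size is illustrative rather than a recommendation from this thread.

```python
def cache_limit_bytes(mib):
    """Convert a cache size in MiB to the byte count SetCacheMax expects."""
    if mib < 0:
        raise ValueError("cache size must be non-negative")
    return int(mib * 1024 * 1024)


def limit_block_cache(mib):
    """Cap GDAL's raster block cache and return the value GDAL reports."""
    # Imported lazily so cache_limit_bytes can be used without GDAL installed.
    from osgeo import gdal

    gdal.SetCacheMax(cache_limit_bytes(mib))
    return gdal.GetCacheMax()
```

Calling `limit_block_cache(64)` before `gdal.Translate` would cap the block cache at 64 MiB. A smaller cache can slow down processing, and memory held outside the block cache (for example by worker threads) is not affected by this setting.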

AlexandreBrown added a commit to AlexandreBrown/WildfirePrediction that referenced this issue Aug 11, 2024
- Changed memory allocator from default to tcmalloc
  - Known fix for GDAL multithreading high RAM usage, see https://gdal.org/user/multithreading.html#ram-fragmentation-and-multi-threading
- Set max cache size to 0, see OSGeo/gdal#7908 (comment)
- Added manual release of gdal Dataset to free up memory faster
- Added debug logging of RAM usage to better understand RAM usage