Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Unlimited memory usage when calling several Image and Region methods from Python #2705

Closed
tao558 opened this issue Aug 25, 2021 · 9 comments · Fixed by #2707
Closed

Unlimited memory usage when calling several Image and Region methods from Python #2705

tao558 opened this issue Aug 25, 2021 · 9 comments · Fixed by #2707
Assignees
Labels
type:Bug Inconsistencies or issues which will cause an incorrect result under some or all circumstances

Comments

@tao558
Copy link

tao558 commented Aug 25, 2021

Description

The below Python programs seem to require unlimited memory.

Steps to Reproduce

This is a family of seemingly related issues, which have roughly the same setup. Run the below code to set up a few examples

import itk

Dimension = 2
PixelType = itk.F
ImageType = itk.Image[PixelType, Dimension]

image = ImageType.New()

size = itk.Size[Dimension]()
size.Fill(500)

index = itk.Index[Dimension]()
index.Fill(50)

RegionType = itk.ImageRegion[Dimension]
region = RegionType()
region.SetIndex(index)
region.SetSize(size)

image.SetRegions(region)
image.Allocate()

With the above setup, run one of the following code cells and monitor memory usage. They all consume an unlimited amount of memory:

1.) Region::IsInside with tuple argument with x or y >= 256 or <= -6

x, y = 256, 255
for i in range(10000000000000):
    region.IsInside((x + (i % 2), y + (i % 2)))

2.) Image::TransformPhysicalPointToIndex with tuple argument with x or y >= 256 or <= -6

x, y = 256, 255
for i in range(10000000000):
    image.TransformPhysicalPointToIndex((x + (i % 2), y + (i % 2)))

3.) Image::TransformPhysicalPointToIndex with itk.Point argument with ANY value of x or y (as far as I can tell)

x, y = 0, 0
for i in range(10000000000):
    p = itk.Point[PixelType, Dimension]()
    p[0] = x + (i % 2)
    p[1] = y + (i % 2)
    image.TransformPhysicalPointToIndex(p)

4.) Image::TransformIndexToPhysicalPoint with tuple argument with x or y >= 256 or <= -6

x, y = 256, 255
for i in range(10000000000):
    image.TransformIndexToPhysicalPoint((x + (i % 2), y + (i % 2)))

Swapping tuples for itk.Index arguments for cases 1 and 4 resolves the issue. From this source:

"Python preallocates small integers in a range of -5 to 256. This allocation happens during initialization and since we cannot update integers (immutability) these preallocated integers are singletons and are directly referenced instead of reallocating. This means every time we use/creates a small integer, python instead of reallocating just returns the reference of preallocated one.". Thanks @brad-t-moore

This gives a hint to the above behavior.

Expected behavior

Actual behavior

Reproducibility

Versions

ITK 5.2.0

Environment

Python 3.6.9, Ubuntu 18.04

Additional Information

Valgrind output for case 1:

==27314== LEAK SUMMARY:
==27314==    definitely lost: 64 bytes in 1 blocks
==27314==    indirectly lost: 0 bytes in 0 blocks
==27314==      possibly lost: 306,216 bytes in 200 blocks
==27314==    still reachable: 31,665,705 bytes in 36,532 blocks
==27314==                       of which reachable via heuristic:
==27314==                         stdstring          : 445,212 bytes in 5,969 blocks
==27314==         suppressed: 0 bytes in 0 blocks

Valgrind gives the same summary when swapping a tuple out for an itk.Index argument in case 1

@tao558 tao558 added the type:Bug Inconsistencies or issues which will cause an incorrect result under some or all circumstances label Aug 25, 2021
@dzenanz
Copy link
Member

dzenanz commented Aug 25, 2021

This sounds like some unsigned char overflow. I don't know why would uchar be used instead of int by the wrapping or anything else involved here.

@tao558
Copy link
Author

tao558 commented Aug 25, 2021

Something else noteworthy... Replacing the content of the for loop with:

for i in range(10000000000000):
    p = itk.Index[Dimension]()
    p[0] = x + (i % 2)
    p[1] = y
    region.IsInside(p)

Fixes the memory issues...

@hjmjohnson
Copy link
Member

I am curious if the automatic conversion of a set "(a,b)" to an itk.Index is leaking memory.

@tao558
Copy link
Author

tao558 commented Aug 25, 2021

The odd thing is, both approaches give the same leak summary from valgrind...

@tao558
Copy link
Author

tao558 commented Aug 25, 2021

Updating the title and changing the description quite a bit, as this seems to be a family of issues.

@tao558 tao558 changed the title ImageRegion::IsInside uncapped memory usage Unlimited memory usage when calling several Image and Region methods from Python Aug 25, 2021
@blowekamp
Copy link
Member

How big is the memory leak? This might be related?

Here is some information on how Python handles immutable objects like integers maybe some tuples?
https://freecontent.manning.com/mutable-and-immutable-objects/

Try changing it from a tuple to a list? from an int to a float?

@hjmjohnson
Copy link
Member

hjmjohnson commented Aug 26, 2021 via email

@tao558
Copy link
Author

tao558 commented Aug 26, 2021

Hi All,

Its tough to tell how big the memory leak is, as using a tuple (which causes the issue) and using itk.Index (which does not cause the issue) both give the same Valgrind summaries.

Changing from a tuple to a list still causes the same issue.

Explicitly calling gc.collect() doesn't seem to do any good either, unfortunately, as it constantly returns 0.

Let me know if there are other questions that I can answer, thanks.

@brad-t-moore
Copy link
Contributor

FYI - we think it's due to the use of PySequence_GetItem in Wrapping/Generators/Python/PyBase/pyBase.i. According to Python's documentation it returns a new reference and therefore should have an associated Py_DECREF after its use. Working on compiling and verifying a PR now.

Note, this memory issue should impact any method that is using the PyBase.i sequence/vec wrapper code. Our hypothesis is that the Python reuse of small integers and tuples was suppressing the memory leak. For me, after calling gc.collect() the memory would eventually (like minutes) be released.

@brad-t-moore brad-t-moore self-assigned this Aug 30, 2021
brad-t-moore added a commit to brad-t-moore/ITK that referenced this issue Aug 30, 2021
Fixes InsightSoftwareConsortium#2705.  Calls to PySequence_GetItem have been updated in pyBase.i to decrement the reference count correctly.
brad-t-moore added a commit to brad-t-moore/ITK that referenced this issue Aug 31, 2021
Fixes InsightSoftwareConsortium#2705.  Calls to PySequence_GetItem have been updated in pyBase.i to decrement the reference count correctly.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
type:Bug Inconsistencies or issues which will cause an incorrect result under some or all circumstances
Projects
None yet
Development

Successfully merging a pull request may close this issue.

6 participants