Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

NULL pointer error #7

Closed
aluo-x opened this issue Mar 10, 2021 · 17 comments
Closed

NULL pointer error #7

aluo-x opened this issue Mar 10, 2021 · 17 comments

Comments

@aluo-x
Copy link

aluo-x commented Mar 10, 2021

  embree.BufferType.Vertex, # buf_type
  File "embree.pyx", line 528, in embree.Geometry.set_new_buffer
  File "embree.pyx", line 453, in embree.array_from_ptr
  File "embree.pyx", line 449, in embree.typed_mv_from_ptr
ValueError: Cannot create cython.array from NULL pointer

It seems like after processing many files, embree encounters a null ptr error.

@aluo-x
Copy link
Author

aluo-x commented Mar 10, 2021

Tagging @sampotter
This seems to occur exactly after 497 calls to device = embree.Device()

@sampotter
Copy link
Owner

Haha, weird! Thanks for pointing this out.

Could you please provide a minimal example I can use to reproduce this? Also, some basic information about what system you're running this on would be helpful (OS, version of Cython, version of Embree, etc.).

Probably the issue has to do with the way I'm using the Embree C API (the docs are good but can be a little vague).

@sampotter
Copy link
Owner

@aluo-x, in fact, I can guess what it is---Embree probably doesn't expect the user to create an infinite number of devices. A more common pattern is to create just a couple of devices and use them for everything. I would restructure your code a bit so that isn't creating tons and tons of devices.

Are your devices getting garbage collected? If they aren't, it may make sense to manually delete them, although this is a bit ugly.

At any rate, I've implemented a pretty minimal amount of error handling in the Cython wrapper. I imagine if I check the error codes being returned by Embree, the issue will become clear. But I will do that once you send a simple example I can test with.

@aluo-x
Copy link
Author

aluo-x commented Mar 10, 2021

Basic system info:

Linux 5.9.4-995.native x86_64
AMD 3960X processor
Python 3.8.5
gcc 10.2.1
Cython 0.29.21
Working on providing a minimal example, will take a couple of minutes.

@aluo-x
Copy link
Author

aluo-x commented Mar 10, 2021

I actually did try manually deleting them. It was the first thought that popped into my head.

I've currently tried the following:

  1. Deleting scene, geometry, device at the end of a call
  2. Reimporting the entire object (via .reload)

For reusing the same device, do I need to modify this line in any way?
https://github.com/sampotter/python-flux/blob/master/flux/shape.py#L67

@aluo-x
Copy link
Author

aluo-x commented Mar 10, 2021

Specifically in regards to commit, attach_geometry etc.

@sampotter
Copy link
Owner

Hmm, got it.

When I use that code it's pretty simple:

  1. Create an instance of TrimeshShapeModel (in my case, I will typically only have one (big) trimesh I'm interested in)
  2. Call e.g. check_vis_1_to_N or get_direct_irradiance many times

So, only one device is ever created

If you have a big scene full of many different objects, I would not recommend creating a different TrimeshShapeModel for each of them. I believe Embree has ways of handling a complicated scene while still only using one Embree device, but I haven't learned how to do this since it's outside of my use case. But Embree has many tutorials---it might be a good idea to check their tutorials and see if you can find one that roughly matches what you're trying to do and see how the API is used. If you then want to try to do it using python-embree and find that some of the functions aren't wrapped, I'll be happy to help you out getting what you need.

But in the mean time, please do send the minimal example for this issue so I can at least fix this bug. :-)

@aluo-x
Copy link
Author

aluo-x commented Mar 10, 2021

So my current approach is indeed to create a new TrimeshShapeModel for each object, however it is failing after the creation of 497 such objects. I currently delete each object after using it.

This behavior was surprising, since I initially believed embree to be stateless.

I'm currently working on a minimal example. Will upload in a couple of min :)

@aluo-x
Copy link
Author

aluo-x commented Mar 10, 2021

import numpy as np
import embree
verts = np.array([[-0.08710144, -0.07980939,  0.12013079],[-0.08710144, -0.07885062,  0.12013079],[-0.05561856, -0.07980939,  0.12751411]])
faces = np.array([[0,1,2]])
def intersect(V, F):
    device = embree.Device()
    geometry = device.make_geometry(embree.GeometryType.Triangle)
    scene = device.make_scene()
    scene.set_flags(4)
    vertex_buffer = geometry.set_new_buffer(
        embree.BufferType.Vertex,  # buf_type
        0,  # slot
        embree.Format.Float3,  # fmt
        3 * np.dtype('float32').itemsize,  # byte_stride
        V.shape[0],  # item_count
    )
    vertex_buffer[:] = V[:]
    index_buffer = geometry.set_new_buffer(
        embree.BufferType.Index,  # buf_type
        0,  # slot
        embree.Format.Uint3,  # fmt
        3 * np.dtype('uint32').itemsize,  # byte_stride,
        F.shape[0]
    )
    index_buffer[:] = F[:]
    geometry.commit()
    scene.attach_geometry(geometry)
    geometry.release()
    scene.commit()
    return True

for k in range(1000):
    intersect(verts, faces)

I should correct myself, it actually fails after 994 calls. I noticed my code was actually calling twice.

@sampotter
Copy link
Owner

Great, thanks.

I have to take off for the day but will fix this when I get back later.

@aluo-x
Copy link
Author

aluo-x commented Mar 10, 2021

Much appreciated!

@aluo-x
Copy link
Author

aluo-x commented Mar 10, 2021

So quick and dirty fix:

make device a global variable, reuse in function.
After performing ray casting, release the scene.

I've processed a few thoursand objects, and the results look correct. Only downside is that I'm apprehensive about using this in a multiprocessing case, I suspect weird stuff could happen when multiple processes try to modify the same device object.

@aluo-x
Copy link
Author

aluo-x commented Mar 10, 2021

A more robust fix:

  • Expose rtcReleaseDevice via the device class.
  • Call device.release() after finishing the ray cast

Tested to 4000~ objects with no errors. I don't currently have time to push a fix, but it is a two line addition to the pyx file:

def release(self):
        rtcReleaseDevice(self._device)

which I call after finishing with a TrimeshShapeModel instance.

@sampotter
Copy link
Owner

Oops, yep, that would be a good thing to wrap. :-) Thanks for catching that. I went ahead and made that change and pushed it.

@aluo-x
Copy link
Author

aluo-x commented Mar 11, 2021

Thanks! Closing this bug now.

@aluo-x aluo-x closed this as completed Mar 11, 2021
@aluo-x
Copy link
Author

aluo-x commented Mar 12, 2021

Leaving a note for other people who may encounter this issue, when processing multiple meshes, you will need to keep track of both the scene & device object in python. After completing any necessary computation, you will need to call

self.scene.release()
self.device.release()

This way allows you to reuse a scene & device when performing raycasting w/ multiple different ray instances, and avoids having to rebuild the acceleration structures each time (if you choose to create & destroy the device for every new ray instance, even if the geometry is static).

Edit: typo, missing words

@sampotter
Copy link
Owner

Thanks for adding your helpful note, @aluo-x. If you're interested, check out #8...

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants