-
Notifications
You must be signed in to change notification settings - Fork 201
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
bind_map/bind_vector: allow customizing the getitem return_value_policy #452
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It makes me uncomfortable to ship something that dangerous, but I do understand the need, and you did document the caveats well. But I would appreciate if you could add concrete examples of very innocent-looking programs that cold crash the interpreter.
About iterators in bind_map/bind_vector: yes, these are also bad interfaces and will need to be done. I was thinking of having a custom iterator that reindexes into the arrays/maps instead of using original STL iterators. It would need to store a strong reference to the Python object along with the current index or key.
The nb::make_*_iterator()
API will also need a big fat warning about the dangers of using it and concurrently modifying the underlying data structure.
In the case that this API is used to create an unsafe container, should it be a sort of "structural static read-write view"? In the sense that we would simply not bind functions that are almost certainly going to segfault your process: insertion, deletion, etc. |
I added some examples of innocuous-looking UB.
That could guard against some of the dangers of concurrent modification during iteration, but it basically just reduces the iterator-safety problem to the struct Inner { int v; };
struct Outer { Inner inner; };
nb::class_<Inner>(m, "Inner").def_rw("v", &Inner::v);
nb::class_<Outer>(m, "Outer").def_rw("inner", &Outer::inner);
nb::bind_vector<std::vector<Outer>>(m, "OuterVec"); After some Python code like: vec = ext.OuterVec()
for i in range(10):
elem = ext.Outer()
elem.inner.v = i
vec.append(elem)
values = [elem.inner for elem in vec if elem.inner % 2 == 0] you now have an array of Currently
If the user has read our extensive warnings and decided to do the unsafe thing anyway, I don't really think it's our place to try to prevent them from doing what they want to do. It's perfectly possible to use a reference-semantic vector in ways that don't crash your program, even without forbidding insertions and deletions; C++ developers do it all the time! I also think it's telling that |
56d7e93
to
e80edb1
Compare
c30294a
to
af57451
Compare
voicing my support for this feature #564 |
d7117a4
to
983d6c0
Compare
Sorry about taking a while to get to this. This change looks good, and I can't come up with a better way of doing things. Thank you for the nice documentation and helpful changes as always @oremanj . |
FYI: 0e1fe84 makes a similar change for the iterators. The STL vector/map bindings pass their |
I'm as much a fan as anyone of safe-by-default, but I also maintain some code that relied on the old reference semantics for bind_map/bind_vector, including taking the appropriate level of care not to store the item-reference Python objects past the lifetime of their C++ referent. In order to avoid needing to maintain a local copy of these binding functions just to change one word, I'm hoping you'll consider this PR, which keeps the safe default but allows it to be changed using a 2nd template argument to bind_map/bind_vector. It also includes some expanded discussion in the docs, which reflects that the reference semantics are much more unsafe for std::vector (where inserting or erasing any element can invalidate references to all elements) than they are for std::map/unordered_map (where the only thing that can invalidate references to an element is removing that specific element).
This changes the default getitem rv_policy from
std::is_pointer_v<Value> ? rv_policy::take_ownership : rv_policy::copy
torv_policy::automatic_reference
. The main difference is that a non-intrusive-refcounting pointer will usereference
instead oftake_ownership
, which seems like an obvious win to me. (When intrusive refcounting is used, the rv_policy is always overridden to betake_ownership
innb_type_put_common()
, so it doesn't matter what we pass in that case.) A secondary difference is that if the map iterator returns a temporary value (rather than a reference) we can now move out of it. (Such iterators don't currently work for other reasons, but I'm about to put up a PR that fixes those.)While working on this, I noticed that the container iterators still use the
reference_internal
policy. I'm not sure if you intended this or not; I can see the argument that someone is less likely to hold onto an object they got during iteration, but it's certainly not impossible. I didn't make any changes here, but if you'd like to, I would suggest using the samegetitem_policy
for the iterator result types as well.