Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Documentaion about when we free memory #311

Closed
kngwyu opened this issue Dec 16, 2018 · 11 comments · Fixed by #893
Closed

Documentaion about when we free memory #311

kngwyu opened this issue Dec 16, 2018 · 11 comments · Fixed by #893

Comments

@kngwyu
Copy link
Member

kngwyu commented Dec 16, 2018

I got an issue in rust-numpy repo today.
See the below code:

fn main() {
    let v = vec![0; 100_000];
    let gil = Python::acquire_gil();
    let py = gil.python();
    loop {
        let py_arr = PyArray2::from_vec2(py, &vec![v.clone()]).unwrap();
        ();
        // &PyArray2 drops and nothing happens
    }
    // GILGruard drops and Python objects are freed
}

In such a case, I think most users assume that py_arr is freed when an iteration of the loop ends, but truly, it's freed after GILGuard drops.

So my suggestions are:

  • We should have a document that clearly demonstrates objects are freed after GILGuard drops
  • How about adding a 'freed-after-drop' object wrapper? But I'm not sure we can do it.
@inv2004
Copy link

inv2004 commented Dec 16, 2018

Probably 'freed-after-drop' can cause slowness. I suppose it would be fine to free mem by calling something like gil.clear() manually sometimes.

@konstin
Copy link
Member

konstin commented Dec 16, 2018

I agree that we should improve the documentation, we especially need need a proper explanation of our GC interacting types in the guide.

How about adding a 'freed-after-drop' object wrapper? But I'm not sure we can do it.

It would be possible with a PyOwned<T, 'py> type, which would be very useful to have in generaö. Implementing that in sound way would a huge effort however and requires major changes to both pyo3 and rust-numpy.

GILGuard::clear should relatively easy to implement and would be quite useful.

@kngwyu
Copy link
Member Author

kngwyu commented Dec 26, 2018

I found

loop {
    let _pool = pyo3::GILPool::new();
    ...
}

also works to decrease ref counts.

@andersk
Copy link
Contributor

andersk commented Aug 30, 2019

Wait a minute. Since anyone can create and drop a new GILPool to clear the shared ReleasePool at any time, from any thread, with or without the GIL held, doesn’t that defeat the entire supposed safety purpose of the ReleasePool?

Proposed fix: #585.

@kngwyu
Copy link
Member Author

kngwyu commented Aug 30, 2019

Since anyone can create and drop a new GILPool to clear the shared ReleasePool at any time, from any thread, with or without the GIL held, doesn’t that defeat the entire supposed safety purpose of the ReleasePool?

I don't think so.
GILPool::drop does nothing when it is used without Python objects.

@andersk
Copy link
Contributor

andersk commented Aug 30, 2019

GILPool::drop does nothing when it is used without Python objects.

And what if there are Python objects in the pool? We could be calling it from another thread, exposing us to all kinds of race conditions. Or, we could have leaked a previous GILPool (leaking is safe in Rust). Or we could just still be holding on to a previous GILPool and only dropping it after releasing the GIL.

@kngwyu
Copy link
Member Author

kngwyu commented Aug 31, 2019

We could be calling it from another thread, exposing us to all kinds of race conditions.

Requiring Python prevent this? Really?

@andersk
Copy link
Contributor

andersk commented Aug 31, 2019

I’m not sure which part is unclear, so I’ll try to spell it out (sorry if I’m repeating the obvious):

  • GILPool::drop calls ReleasePool::drain which calls Py_DECREF.
  • Any caller of Py_DECREF is required to hold the global interpreter lock.
  • The purpose of the global interpreter lock is to prevent race conditions between multiple threads operating on the Python interpreter’s memory, which would lead to undefined behavior.
  • The purpose of the Python marker type is to assert that we’re holding the global interpreter lock.
  • Although there is an assume_gil_acquired method to obtain a Python marker without acquiring the global interpreter lock, it is correctly marked as unsafe.

Part of what makes Rust Rust is that it should not be possible, under any circumstances, to trigger undefined behavior in code that does not explicitly use an unsafe {} block. If a standard library function allowed that, it would be dealt with as a security vulnerability (like CVE-2019-12083 was). Rust libraries should strive to maintain this guarantee.

Thus, when this code leads to undefined behavior, it’s PyO3’s fault:

thread::spawn(|| {
    drop(pyo3::GILPool::new());
})

But when this code leads to undefined behavior, it’s clearly the user’s fault.

thread::spawn(|| {
    let py = unsafe { pyo3::Python::assume_gil_acquired() };
    drop(pyo3::GILPool::new(py));
})

Makes sense?

@kngwyu
Copy link
Member Author

kngwyu commented Aug 31, 2019

Thank you for your explanation, but I'm still not sure how GILPool behaves in multithread situation 🤔 (actually the indices it has can cause problems, even if GIL is correctly acquired).
Sorry for my confusing question, and please don't mind this.
Maybe I resolve my question myself by reading the code carefully 🙄

@benkay86
Copy link
Contributor

This was closed by #893, which closed a soundness hole in Python::acquire_gil(), but with respect to this issue, the documentation surrounding when we free memory is still sparse, as evidenced by continued discussion/confusion in #1056.

Would there be interest in a PR to include some examples like these in the guide and possibly adding more breadcrumbs to the API documentation for Python and Py<T>?

@davidhewitt
Copy link
Member

New documentation is always welcome and appreciated. I never got around to writing more because I really do want to explore better ways to do this soon. (see #1308)

I think at this point it's not realistically on the cards as something I'll tackle in 0.15, so additional documentation will be valuable for users for at least one release cycle. And most likely when writing changes I'll iterate on existing docs rather than throw them away.

benkay86 added a commit to benkay86/pyo3 that referenced this issue Aug 17, 2021
benkay86 added a commit to benkay86/pyo3 that referenced this issue Aug 17, 2021
benkay86 added a commit to benkay86/pyo3 that referenced this issue Aug 18, 2021
benkay86 added a commit to benkay86/pyo3 that referenced this issue Aug 18, 2021
benkay86 added a commit to benkay86/pyo3 that referenced this issue Aug 18, 2021
benkay86 added a commit to benkay86/pyo3 that referenced this issue Aug 18, 2021
davidhewitt added a commit that referenced this issue Aug 21, 2021
* Improve API docs regarding when we free memory, resolves #311

* Add chapter to guide about when we free memory, resolves #311

* Fix typos in documentation

Co-authored-by: David Hewitt <[email protected]>

* Add links from guide to docs.rs

* Update guide/src/memory.md

Co-authored-by: David Hewitt <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

6 participants