Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix early destruction of objects returned from Rust (#138) #165

Closed
wants to merge 1 commit into from

Conversation

thebigG
Copy link
Contributor

@thebigG thebigG commented Mar 10, 2023

Hi there,

I probably need to test this more thoroughly. Just opening a pull request for collaboration and such.

Basically it came down to for some reason(I'm still learning about this rust port of godot) ac02be1 triggers this behavior. And this PR reverts that commit(except for the tests).

This patch fixes my use case, but I'll try to write unit tests and test in general to make sure we don't miss any other edge cases.

Hope this PR is clear enough.

Thanks
Lorenzo

@Bromeon
Copy link
Member

Bromeon commented Mar 10, 2023

Thanks!

The commit you referred to occured as part of #112, which was motivated in fixing ref-count related problems. I wrote tests that failed before and passed after the PR. You understand I cannot just revert the changes that evidently brought improvements, while the CI of this pull request is red.

It's well possible that there is a mistake in #112 though, but we need to approach it more scientifically:

  • show a new test case that only works when reverting (some of) the changes
  • find out why the assumptions made back then are wrong

So before that, can you exclude that there is a configuration problem on your side? Are you using Godot 4.0-stable and latest gdextension from master?

@Bromeon
Copy link
Member

Bromeon commented Mar 10, 2023

We still have the comment // FIXME is inc_ref needed here?, which could be related to the problem. If we solve it, then properly this time, meaning no std::mem::forget hack. That would mean we'd need to increment the reference counter in that case.

Could you try to distill the problem that occurs in #138 into something testable, namely an #[itest] test case? That would help a lot with having a base from which we can debug.

@@ -163,7 +163,7 @@ macro_rules! impl_signature_for_tuple {
.unwrap_or_else(|e| return_error::<$R>(method_name, &e));

// FIXME is inc_ref needed here?
// std::mem::forget(ret_val);
std::mem::forget(ret_val);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe this is the only thing that's needed to stop #138 from happening. It worked on my machine at least. Even the sanitizer build was happy, whereas before it would complain about a heap-use-after-free (it seems the array would get dropped here.)

I just ran a full-ci after making this change too, and everything is green.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The FIXME points to the open question: is mem::forget is the right approach, or should we increment the ref-count?

This just disables the destructor, but I think the proper way would be to create a (ref) copy and then run the destructor. I'm not sure if there's a difference though.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This just disables the destructor, but I think the proper way would be to create a (ref) copy and then run the destructor. I'm not sure if there's a difference though.

So from what i can tell, the destructor of Array is just a call to unref internally. So there would be no difference between calling inc_ref, then dropping it, and just mem::forget. At least for Array (and i believe other ref-counted types too).

But either way, the purpose of mem::forget is to ensure the destructor isn't run. And that's what we want here right? We're returning the value, so the destructor should not be run.

Copy link
Member

@Bromeon Bromeon Mar 20, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, let's leave it like this then.

@thebigG nitpick, could you align the // vertically and add a space between // and Note:?

@Bromeon
Copy link
Member

Bromeon commented Mar 12, 2023

@thebigG Could you implement the suggestion from @SayakS (only mem::forget)? Please also add a comment that we want to prevent the destructor from being run, since we pass ownership to the caller.

Would still be good to include the test case that triggers the original problem -- i.e. red before this change, green after. Thanks!

@thebigG
Copy link
Contributor Author

thebigG commented Mar 12, 2023

@Bromeon Will do.

@thebigG thebigG changed the title -Attempts to fix godot-rust/gdextension#138. -Attempts to fix gdext/gdextension#138. Mar 12, 2023
@thebigG thebigG changed the title -Attempts to fix gdext/gdextension#138. -Attempts to fix godo-rust/gdext#138. Mar 12, 2023
@thebigG thebigG changed the title -Attempts to fix godo-rust/gdext#138. -Attempts to fix godot-rust/gdext#138. Mar 12, 2023
@thebigG
Copy link
Contributor Author

thebigG commented Mar 12, 2023

I'm not sure if the tests approach I took is the best. I have code duplication. So perhaps you guys have some feedback on how to make that better :)

@thebigG
Copy link
Contributor Author

thebigG commented Mar 12, 2023

Quick note not directly related to this PR:

Do we currently have a way in the tests to catch the following kinds of errors from Godot(?):

   -- object_instance_id_when_freed ... ok
ERROR: Condition "slot >= slot_max" is true. Returning: nullptr
   at: get_instance (./core/object/object.h:972)

It so happens in my array test, the error with ref_count does get "caught "(though it is caught only because there is a segfault, it looks like anyway, if one tries to access the array in the assertion). But I suspect there might be certain "errors" from Godot in the future that just get "printed" and might get difficult to validate those errors in unit tests....

Copy link
Member

@Bromeon Bromeon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't really see why you add a whole new file InheritTests.gd. Could you not simply add the test case test_vector_array_return_from_user_func to the existing ManualFfiTests.gd?

Do we currently have a way in the tests to catch the following kinds of errors from Godot(?):

Unfortunately not, Godot doesn't expose error handlers to GDExtension. But in Godot, the philosophy is that many errors can be avoided by previously checking some condition, although many times at extra performance cost.

Please also squash the commits to 1 🙂

Comment on lines 3 to 27
var _assertion_failed: bool = false

## Asserts that `what` is `true`, but does not abort the test. Returns `what` so you can return
## early from the test function if the assertion failed.
func assert_that(what: bool, message: String = "") -> bool:
if what:
return true

_assertion_failed = true
if message:
print("assertion failed: %s" % message)
else:
print("assertion failed")
return false

func assert_eq(left, right, message: String = "") -> bool:
if left == right:
return true

_assertion_failed = true
if message:
print("assertion failed: %s\n left: %s\n right: %s" % [message, left, right])
else:
print("assertion failed: `(left == right)`\n left: %s\n right: %s" % [left, right])
return false
Copy link
Member

@Bromeon Bromeon Mar 13, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You should just be able to reuse TestSuite, as it's done here:

extends TestSuite

Or even better, as mentioned above just move it to ManualFfiTests.gd.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or even better, as mentioned above just move it to ManualFfiTests.gd.

This behavior is only triggered when the function is called from a script, which inherits the object/node the function belongs to. This was verified here as well:#138 (comment)

Which is why I can't move this code to ManualFfiTests.gd. Unless multiple inheritance is allowed in GDScript(?)

I don't really see why you add a whole new file InheritTests.gd. Could you not simply add the test case test_vector_array_return_from_user_func to the existing ManualFfiTests.gd?

No

@thebigG thebigG force-pushed the 138_ref_count_fixes branch from ccfae72 to 418f6d3 Compare March 19, 2023 04:27
@thebigG
Copy link
Contributor Author

thebigG commented Mar 19, 2023

@Bromeon merged all commits into one. Let me know of any other feedback you might have :). I know the unit test adds an extra file, but can't think of another way of doing this....

Copy link
Member

@Bromeon Bromeon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! Some comments:

  • The new file could have a variable of type TestSuite, meaning you could reuse that code via composition. Then just declare two one-liner functions assert_that and assert_eq which forward to self.test_suite.<method>.
  • Since the test must extend ArrayTest in order to reproduce the behavior, please comment this rationale at the beginning of the file (e.g. why this choice, why we can't reuse the asserts, etc).
  • Each file must start with the license header on the very top, see other .gd files 🙂

@Bromeon Bromeon added bug c: core Core components labels Mar 19, 2023
@thebigG thebigG force-pushed the 138_ref_count_fixes branch from 418f6d3 to 489888b Compare March 19, 2023 22:58
Copy link
Member

@Bromeon Bromeon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Generally good, but some typos need fixing.


extends ArrayTest

var test_siute: TestSuite = TestSuite.new()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

typo

Comment on lines 9 to 11
# Since in order to reproduce the bahevaior discovered in https://github.com/godot-rust/gdext/issues/138
# We must inherit a Godot Node. Because of this, we can't just inherit TesSuite like the rest of the tests.
# Because of that, we get the TestSuite code via composition rather than iheritance.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

3 orthographic erros, please re-read comments quickly 🙂

@@ -163,7 +163,7 @@ macro_rules! impl_signature_for_tuple {
.unwrap_or_else(|e| return_error::<$R>(method_name, &e));

// FIXME is inc_ref needed here?
// std::mem::forget(ret_val);
std::mem::forget(ret_val);
Copy link
Member

@Bromeon Bromeon Mar 20, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, let's leave it like this then.

@thebigG nitpick, could you align the // vertically and add a space between // and Note:?

@Bromeon Bromeon changed the title -Attempts to fix godot-rust/gdext#138. Fix early destruction of objects returned from Rust (#138) Mar 20, 2023
@thebigG thebigG force-pushed the 138_ref_count_fixes branch from 489888b to 30384e5 Compare March 22, 2023 04:47
@thebigG
Copy link
Contributor Author

thebigG commented Mar 22, 2023

Addressed all concerns. Thanks a lot for the great feedback @Bromeon 👍 .

@Bromeon
Copy link
Member

Bromeon commented Mar 22, 2023

You still write "bahevaior" and CI is also failing, due to another typo. Please double-check these things and make sure CI is green, this would save a lot of time for both of us.

thebigG added a commit to thebigG/gdextension that referenced this pull request Mar 22, 2023
@thebigG thebigG force-pushed the 138_ref_count_fixes branch from 30384e5 to 8b7b786 Compare March 22, 2023 12:36
@thebigG thebigG force-pushed the 138_ref_count_fixes branch from 8b7b786 to 4906bea Compare March 22, 2023 13:26
@Bromeon
Copy link
Member

Bromeon commented Mar 22, 2023

Thanks a lot!
bors r+

bors bot added a commit that referenced this pull request Mar 22, 2023
165: Fix early destruction of objects returned from Rust (#138) r=Bromeon a=thebigG

Hi there,

I probably need to test this more thoroughly. Just opening a pull request for collaboration and such.

Basically it came down to for some reason(I'm still learning about this rust port of godot) ac02be1 triggers this behavior.  And this PR reverts that commit(except for the tests).


This patch fixes my use case, but I'll try to write unit tests and test in general to make sure we don't miss any other edge cases.


Hope this PR is clear enough.

Thanks
Lorenzo

Co-authored-by: thebigG <[email protected]>
@bors
Copy link
Contributor

bors bot commented Mar 22, 2023

Build failed:

@Bromeon
Copy link
Member

Bromeon commented Mar 22, 2023

CI failed. I cannot merge this change as it is now -- while it fixes UB in some cases, it introduces memory leaks in others 😢

Clearly this needs a bit more thought...

@lilizoey
Copy link
Member

CI failed. I cannot merge this change as it is now -- while it fixes UB in some cases, it introduces memory leaks in others cry

Clearly this needs a bit more thought...

Just to note here too, i did some digging in a discord thread and it does seem like mem::forget is the right solution for this issue still, it's just that there is also a separate issue causing a memory leak which is only visible when we don't crash.

Basically godot is giving us an initialized Array pointer for the return value, and we currently just overwrite the pointer with our own array, which leaks the initialized array godot gave us a pointer to.

Bromeon pushed a commit that referenced this pull request Mar 26, 2023
(cherry picked from commit 4906bea)
Bromeon pushed a commit that referenced this pull request Mar 26, 2023
(cherry picked from commit 4906bea)
lilizoey pushed a commit to lilizoey/gdextension that referenced this pull request Apr 11, 2023
(cherry picked from commit 4906bea)
bors bot added a commit that referenced this pull request Apr 12, 2023
204: Fix various issues with pointer calls r=Bromeon a=sayaks

Combines #165 and #200 with more complete fixes.

# Overview

Properly clones strings and shares arrays/dictionaries in pointer calls to ensure they're not prematurely freed.
Frees the value currently in the `ret` pointer in pointer calls, to avoid memory leak.
Moves `ret_val` into `ret`, to avoid premature destruction.
Changes integer conversions in pointer calls to use `as`, since that seems like the right way of converting the values we get from godot. `try_from` lead to wrong conversions sometimes.
Objects inheriting from `RefCounted` now use `ref_get_object` and `ref_set_object` in virtual method calls. 

# Details

Add `from_arg_ptr` and `move_return_ptr` to `GodotFfi` that will be used when converting argument pointers in pointer calls, and moving things into a pointer.
Add `CallType` so `from_arg_ptr` and `move_return_ptr` can have different behavior depending on whether it's a virtual method call or not.
Remove `write_sys` in `GodotFfi`.
Override `from_arg_ptr` in `GodotString`, `NodePath`, `StringName`, `Array`, `Dictionary`, `Gd`.
Override `move_return_ptr` in `Gd`.
Rename `try_write_sys` to `try_return`, and have it use `move_return_ptr` instead of `write_sys`.
Rename `try_from_sys` to `try_from_arg`, and have it use `from_arg_ptr` instead of `from_sys`.
Add `u8`, `u16`, `u32` to the ptrcall tests.
Add a couple of test of virtual methods.
Fix a test for virtual method with return values to actually call the virtual method as a virtual method.

closes #195 
closes #189 



Co-authored-by: Jan Haller <[email protected]>
Co-authored-by: thebigG <[email protected]>
Co-authored-by: Lili Zoey <[email protected]>
@bors bors bot closed this in 3d0e3d2 Apr 12, 2023
@Bromeon
Copy link
Member

Bromeon commented Apr 12, 2023

Merged as part of #204.

@Bromeon Bromeon added the ub Undefined behavior label Jul 18, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug c: core Core components ub Undefined behavior
Projects
None yet
Development

Successfully merging this pull request may close these issues.

ERROR: Condition "!success" is true when returning an Array from gdextension
3 participants