Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reference to freed object can return a random different one #32383

Closed
jaykyburz opened this issue Sep 27, 2019 · 39 comments · Fixed by #36189
Closed

Reference to freed object can return a random different one #32383

jaykyburz opened this issue Sep 27, 2019 · 39 comments · Fixed by #36189

Comments

@jaykyburz
Copy link

jaykyburz commented Sep 27, 2019

Godot version & OS/device including version:
3.1.1 Windows 10.

Issue description:
I think this is a simple explanation of an issue I have on my project. I have raised this issue without an example project because it will take some time to reproduce in a simple form and I would like some godot experts to look at what I am describing, and let me know if the issue is already logged and known (and perhaps fixed) before I spend more time working on a reproduction cases


Say I have a RobotFactory class and it can instantiate Robots, but it is only allowed to have one active in the world at once. It saves the instantiated robot in a var.

var my_robot = RobotScene.instance()

The robot walks around the world for a while an then is destroyed by something so in its own script it calls queue_free() on itself.

Sometime later in the RobotFactory we call is_instance_valid(my_robot) to see if we should create a new robot.

In my project I have found that sometimes is_instance_valid(my_robot) returns true after my_robot has been freed, but that it is pointing to some other random object in the scene that has been instantiated. Perhaps sometime after the queue_free, but before is_instance_valid call.

The only thing I think could be happening is that there is some kind of underlying pooling system and my_robot is pointing to somewhere that is being reused after the robot is freed.

Please let me know if this is a known issue, is already logged, or if I should do some more work to explain the issue better.


Bugsquad edit: Keyword: previously freed instance

@KoBeWi
Copy link
Member

KoBeWi commented Sep 27, 2019

Oooh, it had totally the same thing. Good to know I'm not alone ;_;

In my case, I had a Control node with script that calls queue_free() inside _ready() on some condition. Then another node referenced it in an onready variable. Later in code I had something like:

if is_instance_valid(my_control):
    my_control.do_stuff()

And it raised an error ~ "No method do_stuff in class ParticleMaterial". Like, what XD
It just sometimes takes random object or resource from the scene. But now always, it depends on when I use it. I tried to make reproduction project, but I couldn't >_> Although I'm instancing lots of objects in my scene using ResourceInteractiveLoader and I didn't test it with minimal project yet.

Ultimately I ended with this ;_;

if is_instance_valid(my_control) and my_control is Control:
    my_control.do_stuff()

@Zylann
Copy link
Contributor

Zylann commented Sep 27, 2019

Could it be that the object ID got re-used? Also, are there threads involved? (I mean, anywhere, not just scripts).

@bojidar-bg
Copy link
Contributor

Can confirm on a game I made some time ago without threads.
In it, I had enemies with the following snipped of code:

func ....():
        if target == null or not is_instance_valid(target):
                var players = get_tree().get_nodes_in_group("player")
                if players.size() == 0: return
                target = players[randi() % players.size()]

It worked -- enemies targeted random players, and switched when their target died.
But, when all players died, and a new enemy spawned, all other enemies targeted that enemy, due to the reference being reused.

@jaykyburz
Copy link
Author

jaykyburz commented Sep 27, 2019

Attached is a sample project the demonstrates the issue. It's basically what I described above.

One factory produces red cubes and keeps a reference to the cube it last created. It points to its reference using the pink squashed sphere in the screen shot below. The red factory produces cubes every 4 seconds but they are freed after 1 second so there is 3 seconds where the is_instance_valid(my_instance) should return false.

There is a second factory producing blue cubes every 1 second.

image
We find that every so often, is_instance_valid(my_instance) on the red factory returns true when it should not, but is pointing to one of the blue cubes.
image

IsInstanceValid.zip

Update: on reading my comment above I realise that saying is_instance_valid should not return true is wrong, because it is pointing to a valid instance, its just not the instance I put in the variable. So strictly speaking is_instance_valid is working. But, I discovered is_instance_valid command when reading here on git hub about ways to detect if an instance we have has been freed. I don't know how to do that if is_instance_valid is not the correct way.

@reduz
Copy link
Member

reduz commented Sep 27, 2019

This is a known problem, but it requires some major changes to fix. 4.0 will change the way objects are referenced so this (and other similar issues) can be properly fixed.

@jaykyburz
Copy link
Author

jaykyburz commented Sep 27, 2019

Thanks for the update.

I think if this is going to be a 4.0 issue I had better remove is_instance_valid from my project. I've only used it 11 times so far. Normally its AI's keeping track of enemy its following or shooting at. It could lead to very subtle wired bugs with AI's walking off in random directions or shooting at the wrong thing. I might try and implement a "about_to_free" signal and emit it before calling queue_free. Then anybody with a reference can listen for the signal and set the var holding the instance to null.

We might consider removing is_instance_valid in 3.2, or having a warning when you use it, or at the very least update the help. All it says at the moment is "Returns whether instance is a valid object (e.g. has not been deleted from memory)." which is not true.

@vnen
Copy link
Member

vnen commented Sep 27, 2019

Pretty sure there's already a report about this.

Edit: it's this one: #22395

@KoBeWi
Copy link
Member

KoBeWi commented Oct 12, 2019

I found a workaround for some cases. Let's say you have an object stored in a variable called my_instance. You can just do

my_instance.connect("tree_exited", self, "set", ["my_instance", null])

Then instead of using is_instance_valid() you can just do if my_instance, which will now reliably point to null if object is freed. This of course breaks if you use remove_child.

@jaykyburz
Copy link
Author

jaykyburz commented Oct 12, 2019

I wrote a little helper class and now try never to save a reference to a node directly.

extends Node
class_name NodeRefrence

var __ref:Node

func get_ref()->Node:
	return __ref

func set_ref(ref:Node):
	clear_ref()
	__ref = ref
	__ref.connect("tree_exiting", self, "ref_tree_exiting")

func clear_ref():
	if __ref:
		__ref.disconnect("tree_exiting", self, "ref_tree_exiting")
	__ref = null

func ref_tree_exiting():
	__ref = null

Then to use this you have some code that looks something like this.

var target_ref = NodeRefrence.new()
func _ready():
    target_ref.set_ref(some_node)
    ....
    var target = target_ref.get_ref()

@akien-mga
Copy link
Member

See #22395 for other example projects that reproduce it.

It sucks that we can't have a proper fix for this before 4.0 :/

@akien-mga akien-mga changed the title refrence to freed spatial returns random object Reference to freed object can return a random different one Jan 13, 2020
@lawnjelly
Copy link
Member

When I tried jaykyburz's test project I found that using WeakRef instead of a raw object pointer also fixed it for me.

i.e.

var my_robot_ref : WeakRef

my_robot_ref = weakref(bot)

	if my_robot_ref:
		var ref = my_robot_ref.get_ref()
		if ref:
			$RedPointer.global_transform = ref.global_transform

WeakRef internally uses an ObjectID rather than pointer, which is looked up when you call get_ref().

If you store both the original pointer and a weakref in the test project, if you remote debug it, you can clearly see the point at which the pointer starts pointing to the wrong object.

@ttencate
Copy link
Contributor

ttencate commented Jan 4, 2021

@akien-mga

It sucks that we can't have a proper fix for this before 4.0 :/

Agreed. Maybe it would be good to add some warnings to the 3.x docs? I would expect to find such warnings in these places:

@Yekez
Copy link

Yekez commented Mar 21, 2021

I had an ultimate case in my game. I was not aware of such a thing before reading about this dangling var issue. My script was "queue_free()"ing multiple objects and reasigning new objects to those same variables. Took me pretty long time to find out what was wrong. Documenting this somewhere, if not fixing it, might help people a lot.

@remram44
Copy link
Contributor

remram44 commented May 10, 2021

I'm running into this again and wondering if the docs shouldn't be improved. @lawnjelly mentions that weakref works in this case (does it?) (#32383 (comment)) but the doc for WeakRef only says:

If this object is not a reference, weakref still works, however, it does not have any effect on the object.

It is not clear what "still works" means. Does get_ref() correctly returns null if the object is gone, even if it has been re-allocated? Or does it just do the same as a pointer?

@lawnjelly
Copy link
Member

lawnjelly commented May 10, 2021

EDIT: outdated, it seems the issue is fixed, and remram is likely using an outdated version of the engine.

@jaykyburz
Copy link
Author

This issue is the reason I moved away from Godot and the fact that it's still not fixed is why I won't come back. I really love what Godot could have been, but such a serious issue lingering for years is really unprofessional. And then to double down and make debug and release have different behaviors is unbelievable.

@Zylann
Copy link
Contributor

Zylann commented May 10, 2021

If this object is not a reference, weakref still works, however, it does not have any effect on the object.

maybe a lot of tutorials using straight pointers should be using WeakRefs.

In GDScript at least, weakref is not needed (and even degenerate) when the referenced object is an Object (and not a Reference). Either way, is_instance_valid() is the function to use to check if the pointer is valid or not.
The intented use of WeakRef is to not contribute to Reference's refcount, which it achieves by using ObjectID. The fact it allows checking for validity of the pointer is a side-effect that isnt much different from is_instance_valid(), and in fact it should no even work when given an Object because Objects don't extend Reference. Some people overuse it "just because it works" but semantically it isn't the best approach.

Also I believe the title of this issue was fixed in 4.0? (3.x couldn't do it because it's a significant core change?)

@lawnjelly
Copy link
Member

lawnjelly commented May 10, 2021

Either way, is_instance_valid() is the function to use to check if the pointer is valid or not.

EDIT: outdated, it seems the issue is fixed, and remram is likely using an outdated version of the engine.

@Zylann
Copy link
Contributor

Zylann commented May 10, 2021

From what I understood of the underlying issue, even WeakRef was subject to this because the problem lies into how ObjectID is looked up? (i.e a given ObjectID can be later re-used) or maybe that was applying to pointer lookups only?

@lawnjelly
Copy link
Member

lawnjelly commented May 10, 2021

EDIT: outdated, it seems the issue is fixed, and remram is likely using an outdated version of the engine.

@akien-mga
Copy link
Member

Either way, is_instance_valid() is the function to use to check if the pointer is valid or not.

Well it should be, in theory, but as the issue kind of points out, it doesn't actually work in 3.x.

AFAIK this issue has been fixed in 3.3 (actually in 3.2.2, then with further changes in 3.2.3 which made debug and release behavior inconsistent, fixed in 3.3). Using is_instance_valid() should work. If it doesn't, this should be reported in a new issue.

@lawnjelly
Copy link
Member

Ah maybe @remram44 is using an old version of the engine, sorry for confusion. 👍

@remram44
Copy link
Contributor

The "old version" I use is the latest stable, 3.3. The problem is fixed in 4.0, but in 3.3 it is possible for a non-weakref pointer to suddenly point to a different object, which is what this issue is about.

@Sebjugate
Copy link

@aikin

AFAIK this issue has been fixed in 3.3 (actually in 3.2.2, then with further changes in 3.2.3 which made debug and release behavior inconsistent, fixed in 3.3). Using is_instance_valid() should work. If it doesn't, this should be reported in a new issue.

Unfortunately, this particular behavior (freed references pointing to random other objects) is NOT fixed in 3.3. If you're referring to the dangling pointer fix (#38119 by @RandomShaper ), I'm not entirely sure I understand what it was fixing, but it did not fix this issue.

This seems like a fairly critical issue, and does not break or provide any error or warning in debug, but exhibits this unexpected behavior and problematic behavior in release (references pointing to random other objects).

I can provide a sample project if needed.

From what I've read of @reduz comments elsewhere, this may be impossible to fix in 3.x. If so, is there a document that explains what the best, safest way to store a reference to a node is then?

Thanks everyone!

@RandomShaper
Copy link
Member

It's fixed in 3.x for run in the editor. About extending it to release builds (and debug without debugger attached), I opened this: godotengine/godot-proposals#1589

If you have a case where it fails running from the editor, please report.

@remram44
Copy link
Contributor

I thought the fix was to use WeakRef?

@jaykyburz
Copy link
Author

jaykyburz commented Aug 15, 2021

This issue killed Godot for me, as well as reduz attitude that its no big deal. And then to double down and allow a release to go out where debug is different to release just shows the folks running the show here really have no idea what its like to ship and maintain a real game. You just can't trust the foundations of this engine, no matter how nice the graphics are. You can't trust the people that guide its development to make good decisions.

You would have to be crazy to try and build a business on Gotdot.

Can you imagine shipping something on Steam and then getting reports of random crashes in literately any place in your game because object references are reused. Then you spend days and weeks try to reproduce the issue in a debug build so you can try and work out what is going on, but it doesn't happen in debug.

@Yekez
Copy link

Yekez commented Aug 15, 2021

This issue killed Godot for me, as well as reduz attitude that its no big deal. And then to double down and allow a release to go out where debug is different to release just shows the folks running the show here really have no idea what its like to ship and maintain a real game. You just can't trust the foundations of this engine, no matter how nice the graphics are. You can't trust the people that guide its development to make good decisions.

You would have to be crazy to try and build a business on Gotdot.

Can you imagine shipping something on Steam and then getting reports of random crashes in literately any place in your game because object references are reused. Then you spend days and weeks try to reproduce the issue in a debug build so you can try and work out what is going on, but it doesn't happen in debug.

This issue killed Godot for me, as well as reduz attitude that its no big deal. And then to double down and allow a release to go out where debug is different to release just shows the folks running the show here really have no idea what its like to ship and maintain a real game. You just can't trust the foundations of this engine, no matter how nice the graphics are. You can't trust the people that guide its development to make good decisions.

You would have to be crazy to try and build a business on Gotdot.

Can you imagine shipping something on Steam and then getting reports of random crashes in literately any place in your game because object references are reused. Then you spend days and weeks try to reproduce the issue in a debug build so you can try and work out what is going on, but it doesn't happen in debug.

Totally. This is me getting owned in playstore. Dev attitude is incredible on this one like it is some non important issue. But you never know when it would crash your game without throwing anything, then days and nights to find it. I loved godot man.. I really did, but my game dev career ended before it began.

@Sebjugate
Copy link

@RandomShaper I think the big issue is that it works silently in Debug, but causes game-breaking bugs in Release. This difference of behavior combined with the fact that Debug does not give any warning or error makes for a really frustrating situation. I guess that means I wholeheartedly support your proposal to make release behavior the same as debug! Thanks for that proposal.

@remram44 I sounds like that might be a work-around. I haven't found any real official documentation that states this is how you should be storing references to nodes. Instead, everywhere I've seen (official and unofficial) promotes storing a direct reference like var reference = get_node('MyNode').

@jaykyburz @Yekez I still love Godot, but I am only a hobbyist. I can imagine how frustrating this issue would be when trying to release a production game. Especially if you didn't know of this footgun from the beginning.

Sorry to open this can of worms again everyone! I'll go over and support @RandomShaper's proposal.
IMO, since this particular issue is only half solved (for debug only) it should not be closed.

@Calinou
Copy link
Member

Calinou commented Aug 15, 2021

IMO, since this particular issue is only half solved (for debug only) it should not be closed.

There is #50968 now.

@Sebjugate
Copy link

@Calinou I don't know the technical details, but that sounds like a different issue. Is it due to the same functionality in the backend?

@RandomShaper
Copy link
Member

I think the big issue is that it works silently in Debug, but causes game-breaking bugs in Release. This difference of behavior combined with the fact that Debug does not give any warning or error makes for a really frustrating situation.
@Sebjugate, again, in debug, as long as you run from the editor, so the debugger is attached, you should be getting an error message whenever you try to do anything on a variable that has a deleted object assigned. If you find any situation under which that's not true, that's a bug. Please report and it'll be fixed.

Regarding how to store references to nodes, the standard practice is just that. WeakRef won't help there.

@Sebjugate
Copy link

@RandomShaper maybe I am misunderstanding but I do have a situation where I am not getting any error or warning in debug, but a game-breaking bug in release:

In debug, is_instance_valid(deleted_object) is always and forever false (which is expected, and very good!). However, in release is_instance_valid(deleted_object) can quickly and unexpectedly become true and deleted_object is now referencing an entirely different node. This causes strange and difficult to find bug because debug works as expected, and does not give any warning when merely calling is_instance_valid().

I apologize if I'm not being clear. I can upload my sample project later this afternoon if that would help.

@RandomShaper
Copy link
Member

is_instance_valid() won't give you an error in debug because calling it on a freed object is a valid use case. However, in debug it won't have false positives, whereas in release that's possible. That would be solved indeed if release had the same kind of protections as debug, or at least a relevant subset of them.

@Sebjugate
Copy link

@RandomShaper I understand. It doesn't make sense for debug to throw a warning every time you use is_instance_valid() on a freed object. It's just particularly frustrating because (based on my apparently flawed understanding), I used is_instance_valid() to guard against this very problem, but instead it has hidden the problem and made it almost impossible to debug.

Regardless, I think we both agree that your proposal to make release work the same as debug should be accepted ASAP. Is there anything we can do as users to help that happen?

@RandomShaper
Copy link
Member

I believe the best thing is to add your real use case issues to the proposal, but I think you've already done it. If anything, this last specific part about is_instance_valid() being a red herring seems like something very relevant to add there.

@S0yKaf
Copy link
Contributor

S0yKaf commented Sep 15, 2021

So what is the current best way to handle checking if an object is deleted if I can't reliably trust is_instance_valid() in release? it caused a bunch of really nasty and sometimes quite humorous bugs in my release build and I'm now trying to phase out all references of it in my code, but I'd like to know if anyone has found an elegant solution to remediate the problem? I even considered building my own version of Godot which would implement whatever behavior debug has for this in the release builds.

@Sebjugate
Copy link

Sebjugate commented Sep 15, 2021

I have stored the node's instance ID (retrieved from node.get_instance_id()) and fetched it with instance_from_id(). From my understanding this should be safe, and is more or less what weakref() does, except more explicitly (in my opinion). Interestingly, I've found this to be fairly performant. Still about 50% of the performance of storing a direct reference, but weakref() was a mere 25%.

I would be interested to what a developer or someone more intimate with the inner workings of Godot has to say though.

Note there is a PR (#51796) to unify this behavior this so fingers crossed, it may become a non-issue.

@RandomShaper
Copy link
Member

Your approach is correct and safe. Regarding performance, that's interesting, and a bit surprising.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment