This may not be the right place to ask, but I would like to start a discussion about something that puzzled me while reading the paper.
My question is: the paper does not explicitly force each slot to represent exactly one object, so why does the network not learn to use a single slot for more than one object in the image reconstruction task? Is there an inductive bias imposed by the architecture itself?