Having duplicated columns can lead to confusing downstream behavior that might be difficult to detect, e.g. we recently had this occur in Altair for a couple of users vega/altair#2718.
Feature Description
It was suggested in the PR that introduced the flag to disallow duplicates that this might be suitable as a default option in the future #28394 (comment), but I couldn't find a follow-up discussion. I'm opening this issue to suggest that disallowing duplicates becomes the default behavior, to protect users from doing things they might not intend, like selecting the same column twice.
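For context, here is a minimal sketch of the current behavior, assuming pandas 1.2 or later (where `set_flags` and `pandas.errors.DuplicateLabelError` are available). The frame and its column names are made up for illustration.

```python
import pandas as pd

# pandas accepts a duplicated column label by default.
df = pd.DataFrame([[1, 2, 3]], columns=["a", "a", "b"])

# Selecting "a" silently returns a two-column DataFrame instead of a Series,
# which is the kind of surprise that is hard to spot downstream.
selected = df["a"]
print(type(selected).__name__, selected.shape)

# Opting in to the existing flag rejects the duplicated label.
rejected = False
try:
    df.set_flags(allows_duplicate_labels=False)
except pd.errors.DuplicateLabelError:
    rejected = True
print("rejected:", rejected)
```

Making this the default would mean the first line raises unless the user explicitly opts out.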
Alternative Solutions
Keep the current default
Additional Context
No response
I'm not sure what my opinion is on this, but open to discussions.
Currently, we disallow by setting an attribute in flags (see here), which IMO is the wrong API and we should rather have a parameter in the index constructor, like Index(..., allow_duplicates=False) instead. Then it would be easier to discuss if the parameter flag should be False or True.
To add, the flag-based approach doesn't allow us to decide if we want label duplicates in the DataFrame constructor, which doesn't seem right. E.g. we'd want
for precise control in the constructor. Also, a decision has to be made on whether non-duplicate labels also means non-duplicate label indexing, e.g. should we disallow df.loc[["a", "a"]] when we disallow duplicate labels?
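To make the two points above concrete, here is a rough sketch. `make_frame` and its `allow_duplicate_index` / `allow_duplicate_columns` parameters are hypothetical names invented for illustration; they are not a pandas API and not necessarily the signature proposed in this thread. Only `set_flags`, `Index.is_unique`, and `.loc` are existing pandas calls.

```python
import pandas as pd


def make_frame(data, index=None, columns=None, *,
               allow_duplicate_index=True, allow_duplicate_columns=True):
    """Hypothetical helper (not a pandas API) with one switch per axis."""
    df = pd.DataFrame(data, index=index, columns=columns)
    if not allow_duplicate_index and not df.index.is_unique:
        raise ValueError("duplicate row labels are not allowed")
    if not allow_duplicate_columns and not df.columns.is_unique:
        raise ValueError("duplicate column labels are not allowed")
    return df


# Duplicate row labels are tolerated while column labels must be unique.
df = make_frame([[1, 2], [3, 4]], index=["a", "a"], columns=["x", "y"],
                allow_duplicate_columns=False)

# The existing flag cannot express this: it rejects duplicates on either axis.
try:
    df.set_flags(allows_duplicate_labels=False)
    flag_rejected = False
except pd.errors.DuplicateLabelError:
    flag_rejected = True

# The open indexing question: with default settings, repeating a label in
# .loc repeats the row, so the result carries duplicate labels.
unique = pd.DataFrame({"x": [1, 2]}, index=["a", "b"])
repeated = unique.loc[["a", "a"]]
print(flag_rejected, list(repeated.index))
```

Whether the last selection should raise when duplicates are disallowed is exactly the decision left open above.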