Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix (visual.ts): avoid adding overlapping categories from data view #87

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

dm-p
Copy link
Contributor

@dm-p dm-p commented Nov 9, 2022

In the event that a categorical column is used in multiple data roles, this ensures that when we enumerate source and destination properties, any duplicates are discarded and only one occurrence is eligible for the node file.

Should ideally resolve #85 (or get us closer to root cause if not)

In the event that a categorical column is used in multiple data roles, this ensures that when we enumerate source and destination properties, any duplicates are discarded and only one occurrence is eligible for the node file.
@lmeyerov
Copy link
Contributor

lmeyerov commented Nov 9, 2022

@dm-p can you confirm, at least in the original #85 case, that the expected results now come out?

@dm-p
Copy link
Contributor Author

dm-p commented Nov 9, 2022

I could not reproduce #85 for the same data assignments as you, including switching-up the order of categories. This PR resolves what was discovered through investigation and is what I believe is causing the values to be swapped around (as the order in which duplicate fields would get added would have caused an overlap).

Alternatively, if you can send me your .pbix with the reproduced issue, I can test the developer visual against this specific occurence and confirm.

This change has been tested to confirm that only one instance of a category's values from the data view will be passes through to the node file per field if this situation is replicated (and again could not reproduce #85 on this setup).

@lmeyerov
Copy link
Contributor

lmeyerov commented Nov 9, 2022

Gotcha, I will try

The suspicious thing to me is because of the switchup, it sounds like somewhere we're assuming an indexing order to get a column, vs doing an appropriate lookup. So it's actually an interesting bug where even if we eliminate what we consider a bad user input, the situation is revealing a risky indexing operation. So maybe there is a legit scenario where it'd trigger too, and we've just been lucky.

@lmeyerov
Copy link
Contributor

lmeyerov commented Nov 9, 2022

FYI, a fun way to viz these may be via https://github.com/graphistry/pygraphistry#group-in-a-box-layout

using something like partition_key='top_motif_id'

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

column names mixed up
2 participants