Use a column to store categories, rather than a mapping #69
We briefly touched on this PR today. We'd like to move it forward, since all interchange protocol implementers seem to be in favor of it. @vnlitvinov plans to revisit his proposal from above and either add it here or open an alternative PR. Then we can try to finalize this and update the implementations in the various libraries.
I'm not sure I quite follow; the |
Force-pushed from 91ed593 to 902ba7c.
@shwina I missed your comment, so I went ahead with the original plan we discussed at the last meeting and added a commit doing so. If you don't like the idea, feel free to remove my commit. As for it not being a mapping... it somewhat is :) It maps an integer index to a category value.
After discussion on Thursday, @shwina and others were happy with the changes @vnlitvinov pushed. With one change to make: rename |
Signed-off-by: Vasily Litvinov <[email protected]>
Co-authored-by: Keith Kraus <[email protected]>
This change was made in the spec part, but not the rest of this PR. I'm inclined to push a change to |
I tried this - since the tests no longer pass with this PR, and we now have an actual Pandas implementation, it may be better to completely remove this early prototype in |
Discussed in today's call: everyone is happy with deleting this prototype code.
Now that we have four real-world implementations in cuDF, Vaex, Modin and Pandas, we no longer need this prototype. Having it here makes it harder to merge other PRs (see gh-69), so let's remove it now.
Everyone is happy and this is now a pretty small/straightforward diff. So merging. I'll follow up with all the implementers to ensure we will actually propagate these changes to all implementations.
Thanks @shwina, @vnlitvinov, @kkraus14
This PR implements one of the changes mentioned in #41, i.e.,
Replacing the previous mapping with a child column now implies that the data buffer stores integer indices into the child column.
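To illustrate the description above, here is a minimal sketch of how a consumer might decode such a column. It uses plain Python lists as stand-ins for the data buffer and the child categories column. The names `decode_categorical`, `codes`, `categories` and the `-1` null sentinel are assumptions made for this example, not part of the spec.

```python
from typing import List, Optional, Sequence


def decode_categorical(
    codes: Sequence[int],
    categories: Sequence[str],
    null_code: int = -1,  # assumed sentinel for missing values; hypothetical
) -> List[Optional[str]]:
    """Look up each integer code in the categories column.

    `codes` stands in for the data buffer and `categories` for the
    child column that holds the category values.
    """
    return [None if code == null_code else categories[code] for code in codes]


# Child column holding the category values.
categories = ["low", "medium", "high"]

# Data buffer: integer indices into `categories`.
codes = [0, 2, 2, 1, -1, 0]

decoded = decode_categorical(codes, categories)

# The older mapping form can still be derived from the column,
# since position i in the column is the value for code i.
mapping = dict(enumerate(categories))
```

Deriving `mapping` from the column is what makes the two representations equivalent for lookup: the column is an integer-index-to-value mapping stored in the same form as any other column.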