Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for multi-column --key values #17

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

jsvine
Copy link

@jsvine jsvine commented Apr 2, 2021

These modifications allow users to pass multiple (comma-separated) columns as the --key, for scenarios in which rows are uniquely identified by a combination of columns — for instance, the county and the state. For instance:

csv-diff --key=state,county a.csv b.csv

An arbitrary number of columns can be used. These scenarios are fairly common, in my experience.

I aimed to make this implementation as simple as possible. As such, it doesn't handle one particular edge case: columns whose names contain a comma. My instinct is that this could be handled by adding a --key-sep option, in which the user could pass any arbitrary string to serve as a separator. E.g.,:

csv-diff --key="Column Name, With A Comma::Column 2" --key-sep="::" a.csv b.csv

... and then passing that argument to load_csv/load_json. But figured I'd raise the possibility here first before mucking around too much in the code.

This commit allows users to pass multiple (comma-separated) columns as
the --key, for use-cases in which rows are uniquely identified by a
combination of columns — for instance, the county *and* the state.
@jsvine
Copy link
Author

jsvine commented Apr 2, 2021

And I meant to say: Thanks for such an elegant and useful repo/tool! The code was a pleasure to read.

@jsvine
Copy link
Author

jsvine commented Apr 2, 2021

Ah, and while tinkering to scratch my own itch, I failed to recognize that something similar was proposed in #1!

This PR takes a slightly different approach (sticking with a single --key option, rather than multiple), based on my own personal preferences. No offense taken if you opt for the other one.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant