Added functionality to convert data to CSV/TSV and JSON and vice versa. #329

Closed
wants to merge 25 commits into from


john-thuo1
Contributor

@john-thuo1 john-thuo1 commented Oct 12, 2024

Contributor Checklist

  • [✔️] This pull request is on a separate branch and not the main branch.

Description

This pull request adds functionality to convert data between JSON and CSV/TSV formats, in both directions. The changes include:

  • Extension of the convert_to_json and convert_to_csv_or_tsv methods in convert.py.
  • Updates to the CLI in main.py to support new conversion commands.
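
As a rough illustration of the kind of round trip described above, here is a minimal sketch using only the Python standard library. It is not the code in convert.py: the function names `json_to_delimited` and `delimited_to_json`, the `word`/`translation` column names, and the assumption that the input is a flat JSON object of key/value pairs are all hypothetical.

```python
import csv
import io
import json


def json_to_delimited(json_text: str, delimiter: str = ",") -> str:
    """Convert a flat JSON object ({key: value}) to CSV or TSV text.

    Assumes a flat mapping; the real Scribe-Data files may be nested.
    """
    data = json.loads(json_text)
    buffer = io.StringIO(newline="")
    writer = csv.writer(buffer, delimiter=delimiter, lineterminator="\n")
    writer.writerow(["word", "translation"])  # hypothetical header names
    for key, value in data.items():
        writer.writerow([key, value])
    return buffer.getvalue()


def delimited_to_json(text: str, delimiter: str = ",") -> str:
    """Convert two-column CSV or TSV text (with a header row) back to JSON."""
    reader = csv.reader(io.StringIO(text, newline=""), delimiter=delimiter)
    next(reader, None)  # skip the header row
    data = {row[0]: row[1] for row in reader if row}
    return json.dumps(data, ensure_ascii=False, indent=2)


# Round trip: JSON -> TSV -> JSON should reproduce the original mapping.
sample = '{"bonjour": "hello", "merci": "thank you"}'
tsv_text = json_to_delimited(sample, delimiter="\t")
round_trip = json.loads(delimited_to_json(tsv_text, delimiter="\t"))
```

Passing `delimiter="\t"` selects TSV and the default gives CSV, so a single pair of functions covers all four directions in this sketch.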

Example Commands:

  1. From JSON to CSV:
    scribe-data convert --lan French --data-type translations --input-file ./fli/French/translations.json --output-type csv --output-dir ./converted/
  2. From CSV to JSON:
    scribe-data convert --lan French --data-type translations --input-file ./converted/French/translations.csv --output-type json --output-dir ./converted/
  3. From JSON to TSV:
    scribe-data convert --lan French --data-type translations --input-file ./fli/French/translations.json --output-type tsv --output-dir ./converted/ 
  4. From TSV to JSON:
    scribe-data convert --lan French --data-type translations --input-file ./converted/French/translations.tsv --output-type json --output-dir ./converted/ 
    

I tested the changes by running the CLI commands above and verifying that the output matched the expected formats. I would like feedback on this approach before adding unit tests and SQLite conversion support.

Related issue


github-actions bot commented Oct 12, 2024

Thank you for the pull request!

The Scribe team will do our best to address your contribution as soon as we can. The following is a checklist for maintainers to make sure this process goes as well as possible. Feel free to address the points below yourself in further commits if you realize that actions are needed :)

If you're not already a member of our public Matrix community, please consider joining! We'd suggest using Element as your Matrix client, and definitely join the General and Data rooms once you're in. Also consider joining our bi-weekly Saturday dev syncs. It'd be great to have you!

Maintainer checklist

  • The commit messages for the remote branch should be checked to make sure the contributor's email is set up correctly so that they receive credit for their contribution

    • The contributor's name and icon in remote commits should be the same as what appears in the PR
    • If there's a mismatch, the contributor needs to make sure that the email they use for GitHub matches what they have for git config user.email in their local Scribe-Data repo
  • The linting and formatting workflow within the PR checks do not indicate new errors in the files changed

  • The CHANGELOG has been updated with a description of the changes for the upcoming release and the corresponding issue (if necessary)

@john-thuo1 john-thuo1 changed the title feat: Implement functionality to convert data to CSV/TSV and JSON Added functionality to convert data to CSV/TSV and JSON and vice versa. Oct 12, 2024
@andrewtavis andrewtavis self-requested a review October 12, 2024 14:51
@andrewtavis andrewtavis added the hacktoberfest-accepted Accepted as a part of Hacktoberfest label Oct 12, 2024
"-od",
"--output-dir",
type=Path,
required=True,
Member

There's no need for this to be required. Ideally the default directories would be used if this isn't included :)
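
One way to read the suggestion above, as a sketch only: drop `required=True` and give the argument a default so the parser falls back to it when the flag is omitted. The `DEFAULT_OUTPUT_DIR` value and the `build_parser` helper are hypothetical; the project's actual default directories are not shown in this thread.

```python
import argparse
from pathlib import Path

# Hypothetical fallback; the real default directory in Scribe-Data may differ.
DEFAULT_OUTPUT_DIR = Path("scribe_data_export")


def build_parser() -> argparse.ArgumentParser:
    """Build a parser where --output-dir is optional and has a default."""
    parser = argparse.ArgumentParser(prog="convert-sketch")
    parser.add_argument(
        "-od",
        "--output-dir",
        type=Path,
        default=DEFAULT_OUTPUT_DIR,  # used when the flag is not given
        help="Directory for converted files (optional).",
    )
    return parser


# Without the flag the default applies; with it, the user's path wins.
args_default = build_parser().parse_args([])
args_custom = build_parser().parse_args(["-od", "./converted/"])
```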

@andrewtavis andrewtavis requested a review from mhmohona October 12, 2024 23:12
@andrewtavis
Member

@mhmohona, if you have time, it'd be great if you could take a first look at the changes to convert.py :) Let me know if I should check it from the start!

@andrewtavis
Member

@john-thuo1, is there a way that we can get the commit history here cleaned up a bit? It's hard for us to review this as I can't tell what the changes are that are in here and already on main. What might make sense would be to make a new branch from the current version of main and add your changes to that? Then open a new PR?

@john-thuo1
Contributor Author

> @john-thuo1, is there a way that we can get the commit history here cleaned up a bit? It's hard for us to review this as I can't tell what the changes are that are in here and already on main. What might make sense would be to make a new branch from the current version of main and add your changes to that? Then open a new PR?

Alright. Apologies for that. Still getting the hang of cherry-picking.

@andrewtavis
Member

Hey it's hard stuff, @john-thuo1! Thanks for opening #338 :)

@john-thuo1 john-thuo1 deleted the decouple_convert branch October 13, 2024 15:08