user can upload/download the datasets in different formats (like CSV/excel/avro/arrow/JSON/Parquet etc) using UI #2752

Closed
Tracked by #31
sdhaker2 opened this issue Apr 22, 2023 · 9 comments
Labels
status: stale (Indicates that there is no activity on an issue or pull request)
type: enhancement (Indicates new feature requests)

Comments

@sdhaker2

Is your feature request related to a problem? Please describe.
Currently, there is no way for users to upload or download datasets from the UI. Non-technical users struggle to upload or download datasets and configure them accordingly.

Describe the solution you'd like
I would like to add a feature that makes it simple for users to upload and download datasets in different formats from the UI. During upload, the user could map columns to annotation, prediction, labels, and text, similar to what we do when logging datasets, or provide a link to a dataset on Hugging Face. The user could then download/export datasets in different formats (CSV, Excel, Avro, Arrow, JSON, Parquet, etc.). I want to contribute to this feature.
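
To make the column-assignment step concrete, here is a minimal sketch using only Python's standard library. It is not Argilla's API: `COLUMN_MAPPING`, `map_rows`, and the sample column names are hypothetical, and it assumes the uploaded file is a CSV with a header row.

```python
import csv
import io

# Hypothetical mapping chosen by the user in an upload form:
# dataset field -> column name in the uploaded file.
COLUMN_MAPPING = {
    "text": "review",
    "annotation": "gold_label",
    "prediction": "model_label",
}


def map_rows(csv_text, mapping):
    """Rename the user-selected CSV columns to the dataset's field names."""
    reader = csv.DictReader(io.StringIO(csv_text))
    # Fail early if the user picked a column that the file does not contain.
    missing = set(mapping.values()) - set(reader.fieldnames or [])
    if missing:
        raise ValueError(f"columns not found in file: {sorted(missing)}")
    return [
        {field: row[column] for field, column in mapping.items()}
        for row in reader
    ]


# Illustrative upload content (made-up sample rows).
uploaded = (
    "review,gold_label,model_label\n"
    "Great product,positive,positive\n"
    "Too slow,negative,positive\n"
)
records = map_rows(uploaded, COLUMN_MAPPING)
```

The resulting `records` list holds one dict per row, keyed by `text`, `annotation`, and `prediction`, which is the shape a logging step would need.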

Describe alternatives you've considered
N.A.

Additional context
N.A.

sdhaker2 added the type: enhancement label Apr 22, 2023
@sdhaker2
Author

This is similar to #1870 and #1888, which have not seen any activity in the past 2-3 months.

@nataliaElv
Member

Hi @sdhaker2! This is definitely something we want to work on. While we work on making this available through the UI, you could check out this repo: https://github.com/argilla-io/argilla-streamlit
This Streamlit app has a data manager to import/export data in some of the formats that you mention in the issue. You could deploy it in a Hugging Face Space, for example.

@github-actions

This issue is stale because it has been open for 90 days with no activity.

github-actions bot added the status: stale label Jul 28, 2023
@davidberenstein1957
Member

#1888

github-actions bot removed the status: stale label Aug 8, 2023

github-actions bot commented Nov 7, 2023

This issue is stale because it has been open for 90 days with no activity.

github-actions bot added the status: stale label Nov 7, 2023

github-actions bot commented Dec 7, 2023

This issue was closed because it has been inactive for 30 days since being marked as stale.

github-actions bot closed this as completed Dec 7, 2023
@camilleborrett

Hi @nataliaElv! I would also find this feature (also requested in #1888) very useful.

If you have been working on this, do you have any idea when it will be officially available?

The workaround you posted above no longer seems to be available. I get a 404 error for both the GitHub repo https://github.com/argilla-io/argilla-streamlit and the Hugging Face Space https://huggingface.co/spaces/argilla/data-manager.

@nataliaElv
Member

Hi @camilleborrett ! Exactly, that workaround doesn't apply anymore, but here's something that you can do to easily import your data:

This currently works only with public datasets, but we're working to make it work for private datasets as well in the next release. Also, we expect to ship in that release the functionality to export your data to the hub as a public or private dataset.

Do you think this would work for you?

@camilleborrett

Thanks a lot for your quick response, @nataliaElv! That's very cool :)
It doesn't really fit my use case though: I work in a sector that uses sensitive/confidential data, which means we cannot upload data to the Hugging Face Hub (either publicly or privately). A feature for handling data directly from the interface, mainly for exporting annotated data, would be very useful. We have a tech team that uploads data to Argilla, but our non-technical annotators would benefit from being able to export their annotations themselves in a format they can use (i.e. CSV/XLSX), without always having to go through the tech team.
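
For illustration, the kind of local export described above can be sketched with Python's standard library alone, so no data leaves the machine. This is not an Argilla feature or API: `export_records` is a hypothetical helper, and it assumes the annotated records have already been retrieved as a list of flat dicts with identical keys.

```python
import csv
import io
import json


def export_records(records, fmt):
    """Serialize a list of flat dicts to a CSV or JSON string."""
    if fmt == "json":
        return json.dumps(records, ensure_ascii=False, indent=2)
    if fmt == "csv":
        buffer = io.StringIO()
        # Use the keys of the first record as the header row.
        fieldnames = list(records[0].keys()) if records else []
        writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
        writer.writeheader()
        writer.writerows(records)
        return buffer.getvalue()
    raise ValueError(f"unsupported format: {fmt}")


# Illustrative records (made-up sample data).
annotated = [
    {"text": "Great product", "annotation": "positive"},
    {"text": "Too slow", "annotation": "negative"},
]
csv_output = export_records(annotated, "csv")
json_output = export_records(annotated, "json")
```

The CSV string can be written to a file and opened directly in a spreadsheet application. Writing native XLSX is not covered here, since it requires a third-party library.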


4 participants