Update Hugging Face Datasets library name and link #1

Open
wants to merge 1 commit into
base: master
4 changes: 2 additions & 2 deletions README.md
@@ -16,7 +16,7 @@ This repository contains the main and challenge data for QuAIL reading comprehen

See the blog post describing the project and updates post-paper-publication: https://text-machine-lab.github.io/blog/2020/quail

- The data is available in [NLP Datasets library from Hugging Face](https://huggingface.co/nlp/viewer/?dataset=quail), as well as in this repository. We provide two formats: xml and jsonlines. The json format is what is used in our [leaderboard](http://text-machine.cs.uml.edu/lab2/projects/quail/); you can get the files [here](quail_v1.3/json) or in the [codalab data sheet](https://worksheets.codalab.org/worksheets/0xa8cd6ea812c04be7b728f44b2e0a56fc). The xml format is more human-readable, and there are two versions of each file:
+ The data is available in [Datasets library from Hugging Face](https://huggingface.co/datasets/quail), as well as in this repository. We provide two formats: xml and jsonlines. The json format is what is used in our [leaderboard](http://text-machine.cs.uml.edu/lab2/projects/quail/); you can get the files [here](quail_v1.3/json) or in the [codalab data sheet](https://worksheets.codalab.org/worksheets/0xa8cd6ea812c04be7b728f44b2e0a56fc). The xml format is more human-readable, and there are two versions of each file:

* [randomized](quail_v1.3/xml/randomized): the order of questions and answers is randomized. This is the version recommended for use in training/testing models, and the order is the same as the version provided in the Hugging Face Datasets collection
* [ordered](quail_v1.3/xml/ordered): the questions are ordered by type, and the correct answers are listed first. This is the more human-readable version for data exploration.
@@ -37,5 +37,5 @@ Leaderboard submission instructions: https://worksheets.codalab.org/worksheets/0

***********************

- Update 30.10.2020: QuAIL v.1.3: the data is now available in [NLP Datasets](https://huggingface.co/nlp/viewer/?dataset=quail), and we switched to jsonlines to match the format to the online HuggingFace data viewer. Question order within texts and answer order within questions changed (beware of Python's random compatibility across systems and python versions [[REF]](https://stackoverflow.com/questions/8786084/reproducibility-of-python-pseudo-random-numbers-across-systems-and-versions)).
+ Update 30.10.2020: QuAIL v.1.3: the data is now available in [Hugging Face Datasets](https://huggingface.co/datasets/quail), and we switched to jsonlines to match the format to the online HuggingFace data viewer. Question order within texts and answer order within questions changed (beware of Python's random compatibility across systems and python versions [[REF]](https://stackoverflow.com/questions/8786084/reproducibility-of-python-pseudo-random-numbers-across-systems-and-versions)).
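
Reviewer note: the README text in this diff says the data ships as jsonlines, that is, one JSON object per line. For anyone checking the files locally rather than through the Hugging Face viewer, a minimal sketch of reading such a file with the Python standard library might look like the following. The helper name `read_jsonl` is hypothetical, and the sketch assumes nothing about which fields each record contains.

```python
import json
from pathlib import Path


def read_jsonl(path):
    """Return a list with one parsed object per non-empty line of a JSON Lines file.

    `read_jsonl` is an illustrative helper, not part of the QuAIL repository.
    """
    records = []
    with Path(path).open(encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:  # tolerate blank lines, e.g. a trailing newline
                records.append(json.loads(line))
    return records
```

It could be pointed at any file under `quail_v1.3/json`; the exact file names there are not shown in this diff, so check the directory listing before relying on a particular path.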