diff --git a/README.md b/README.md
index 92d3952..312bc68 100755
--- a/README.md
+++ b/README.md
@@ -16,7 +16,7 @@ This repository contains the main and challenge data for QuAIL reading comprehen
 See the blog post describing the project and updates post-paper-publication: https://text-machine-lab.github.io/blog/2020/quail
-The data is available in [NLP Datasets library from Hugging Face](https://huggingface.co/nlp/viewer/?dataset=quail), as well as in this repository. We provide two formats: xml and jsonlines. The json format is what is used in our [leaderboard](http://text-machine.cs.uml.edu/lab2/projects/quail/); you can get the files [here](quail_v1.3/json) or in the [codalab data sheet](https://worksheets.codalab.org/worksheets/0xa8cd6ea812c04be7b728f44b2e0a56fc). The xml format is more human-readable, and there are two versions of each file:
+The data is available in [Datasets library from Hugging Face](https://huggingface.co/datasets/quail), as well as in this repository. We provide two formats: xml and jsonlines. The json format is what is used in our [leaderboard](http://text-machine.cs.uml.edu/lab2/projects/quail/); you can get the files [here](quail_v1.3/json) or in the [codalab data sheet](https://worksheets.codalab.org/worksheets/0xa8cd6ea812c04be7b728f44b2e0a56fc). The xml format is more human-readable, and there are two versions of each file:
 * [randomized](quail_v1.3/xml/randomized): the order of questions and answers is randomized. This is the version recommended for use in training/testing models, and the order is the same as the version provided in the Hugging Face Datasets collection
 * [ordered](quail_v1.3/xml/ordered): the questions are ordered by type, and the correct answers are listed first. This is the more human-readable version for data exploration.
@@ -37,5 +37,5 @@ Leaderboard submission instructions: https://worksheets.codalab.org/worksheets/0
 ***********************
-Update 30.10.2020: QuAIL v.1.3: the data is now available in [NLP Datasets](https://huggingface.co/nlp/viewer/?dataset=quail), and we switched to jsonlines to match the format to the online HuggingFace data viewer. Question order within texts and answer order within questions changed (beware of Python's random compatibility across systems and python versions [[REF]](https://stackoverflow.com/questions/8786084/reproducibility-of-python-pseudo-random-numbers-across-systems-and-versions)).
+Update 30.10.2020: QuAIL v.1.3: the data is now available in [Hugging Face Datasets](https://huggingface.co/datasets/quail), and we switched to jsonlines to match the format to the online HuggingFace data viewer. Question order within texts and answer order within questions changed (beware of Python's random compatibility across systems and python versions [[REF]](https://stackoverflow.com/questions/8786084/reproducibility-of-python-pseudo-random-numbers-across-systems-and-versions)).