Remove config names as yaml keys #4367
I included the change from #4302 directly in this PR. This way the datasets will be updated right away in the CI (the CI is only triggered when a dataset card is changed).
The documentation is not available anymore as the PR was closed or merged.
Alright, it's ready now :) Here is an example: datasets/datasets/ade_corpus_v2/README.md, lines 1 to 78 in 76d9a14.
CI failures are only related to dataset cards missing some content.
Many datasets have dots in their config names. However, this causes issues with the YAML tags of the dataset cards, since we can't have dots in YAML keys.
To fix this, I removed the separation of tags per config name completely, and now have a single flat YAML for all configurations. Dataset search doesn't use this info anyway. I removed all the config names used as YAML keys and moved them under a new `config:` key.
This is related to #2362 (internal https://github.com/huggingface/moon-landing/issues/946).
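To illustrate the flattening described above, here is a minimal sketch using plain Python dicts as stand-ins for the parsed YAML. The helper name `flatten_metadata`, the sample tag values, and the config names `v1.0` and `v1.1` are hypothetical and not taken from this PR; the actual conversion may differ in details such as value ordering.

```python
from typing import Dict, List, Union

# A tag value is either already flat (a list) or split per config name (a dict).
TagValue = Union[List[str], Dict[str, List[str]]]


def flatten_metadata(metadata: Dict[str, TagValue]) -> Dict[str, List[str]]:
    """Hypothetical helper: merge per-config tag values into one flat mapping
    and record the config names under a single "config" key."""
    flat: Dict[str, List[str]] = {}
    config_names: List[str] = []
    for tag, value in metadata.items():
        if isinstance(value, dict):
            # Old style: config names (possibly containing dots) used as keys.
            merged: List[str] = []
            for config_name, items in value.items():
                if config_name not in config_names:
                    config_names.append(config_name)
                for item in items:
                    if item not in merged:  # keep first-seen order, drop duplicates
                        merged.append(item)
            flat[tag] = merged
        else:
            # Already flat: copy as-is.
            flat[tag] = list(value)
    if config_names:
        flat["config"] = config_names
    return flat


# Illustrative input only; the config names contain dots on purpose.
old_style = {
    "languages": ["en"],
    "task_categories": {
        "v1.0": ["text-classification"],
        "v1.1": ["text-classification", "token-classification"],
    },
}
new_style = flatten_metadata(old_style)
```

In this sketch the dotted config names end up as list values rather than keys, which is what sidesteps the restriction on dots in YAML keys.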
Also, removing the dots in the YAML keys would allow us to do as in #4302, which removes a hack that replaces all the dots by underscores in the YAML tags.
I also added a test in the CI that checks all the YAML tags to make sure that: