
Fix validation of datasets without metadata #342

Merged 5 commits from fix_metadata_dataset into master on Nov 24, 2020

Conversation

@magdalenafuentes (Collaborator, Author)

Fixes #341

codecov bot commented Nov 23, 2020

Codecov Report

Merging #342 (8de5515) into master (20094b1) will decrease coverage by 0.00%.
The diff coverage is 100.00%.

@@            Coverage Diff             @@
##           master     #342      +/-   ##
==========================================
- Coverage   98.58%   98.58%   -0.01%     
==========================================
  Files          28       28              
  Lines        2692     2691       -1     
==========================================
- Hits         2654     2653       -1     
  Misses         38       38              

@genisplaja (Collaborator)

Thanks for this PR!! I am actually checking it because this morning I tried to change line 144 of utils.py to `if 'metadata' in dataset_index and dataset_index['metadata'] is not None:`, just as this PR does, because I thought it was a good idea to keep the same dict structure for all the indexes even when there is no general metadata file, but it failed. Now I'm rerunning it because I think it should work!
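The guard discussed above can be sketched as follows. This is a minimal illustration of the pattern, not mirdata's actual utils.py code; the function name and index layout are hypothetical stand-ins.

```python
# Hypothetical helper illustrating the guarded lookup from the discussion
# above; not mirdata's exact code.

def has_general_metadata(dataset_index):
    """Return True only when the index carries a real metadata entry.

    Treats a missing 'metadata' key and an explicit None value the same,
    so indexes for datasets without a general metadata file pass either way.
    """
    return "metadata" in dataset_index and dataset_index["metadata"] is not None


index_with_metadata = {"version": "1.0", "metadata": ["some_metadata.csv"]}
index_null_metadata = {"version": "1.0", "metadata": None}
index_no_metadata = {"version": "1.0"}

print(has_general_metadata(index_with_metadata))  # True
print(has_general_metadata(index_null_metadata))  # False
print(has_general_metadata(index_no_metadata))    # False
```

The `is not None` check matters because a bare `if dataset_index['metadata']:` would also reject an empty-but-present metadata list, and a bare key lookup would raise `KeyError` on indexes that omit the field.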

@magdalenafuentes (Collaborator, Author)

Great! Thanks :) let me know and if it does I'll merge

@genisplaja (Collaborator)

I don't know if you have tried it, but in my case it isn't working yet (I am not removing the metadata key from the index). I don't know if the problem is in dumping and loading the JSON index file (converting None to null and vice versa), but I have debugged the code and tried some alternatives, and when running the code in this PR without deleting the metadata key the error is still there.

Maybe the best option is to remove the metadata key directly from the datasets that have no general metadata, and to specify in CONTRIBUTING.md, in this same PR, that datasets with no general metadata must not include the key in the index.
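The None/null round trip suspected above is itself lossless, which can be checked directly. This is a quick sanity check of the standard-library behavior, not the PR's test code:

```python
import json

# json serializes Python None as null and parses null back to None,
# so the dump/load cycle alone does not corrupt a null metadata entry.
index = {"version": "1.0", "metadata": None}
restored = json.loads(json.dumps(index))

print(restored)                      # {'version': '1.0', 'metadata': None}
print(restored["metadata"] is None)  # True
```

Since the round trip preserves None, the failure would have to come from how the loaded value is checked downstream rather than from the JSON serialization itself.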

@magdalenafuentes (Collaborator, Author)

Mmmmm... that's strange. It is working for me locally, both with metadata=null and with the key removed.

Is it not working for test_full_dataset on orchset and mridangam_stroke either?

@magdalenafuentes (Collaborator, Author)

Both orchset and mridangam_stroke pass the full test locally for me.

@genisplaja (Collaborator) commented Nov 23, 2020

Mmmmm... that's strange. It is working for me locally both with metadata=null and without.

Is it not working for the test_full_dataset either on orchset and mridangam_stroke?

I haven't tried orchset, but I tried it for mridangam_stroke and it broke. Yeah, that's weird, because I set an assert to check that the content of the metadata key was None, and it was!

@genisplaja (Collaborator)

By the way, if I remove the metadata field from the indexes, it works perfectly!

@rabitt (Collaborator) commented Nov 24, 2020

Quick comment: from a code-structure perspective, it's better if any null top-level fields are removed, with the exception of version.

@genisplaja (Collaborator)

Ok! Understood. So I guess we can leave the patch as @magdalenafuentes proposed, and that will solve the issue. It's working for me now!

@magdalenafuentes magdalenafuentes merged commit bfe0ccc into master Nov 24, 2020
@rabitt rabitt deleted the fix_metadata_dataset branch December 17, 2020 23:54
Successfully merging this pull request may close these issues.

Datasets with no general metadata file are not passing test_full_dataset.py
4 participants