fixup openfmri datasets metadata #17

Open
12 of 13 tasks
yarikoptic opened this issue Feb 12, 2018 · 0 comments

Currently (some may have been fixed upstream) we hit the following gotchas while parsing metadata from openfmri datasets (with just the bids parser, before enabling any custom ones):

  • 9
[INFO   ] Aggregate metadata for dataset /mnt/btrfs/datasets-meta6-1-redo1/datalad/crawl/openfmri/ds000009
[WARNING] Failed to load participants info due to: 'ascii' codec can't encode character u'\u2019' in position 72: ordinal not in range(128) [csv.py:next:108]. Skipping the rest of file
  • 30 - same as 9. FOI: took 7:10.73 (about 7 min) to aggregate! Compressed sizes: 223kB for ds- and 128kB for cn-
[INFO   ] Aggregate metadata for dataset /mnt/btrfs/datasets-meta6-1-redo1/datalad/crawl/openfmri/ds000030
[ERROR  ] Failed to get dataset metadata (bids): 'ascii' codec can't decode byte 0xe2 in position 3325: ordinal not in range(128) [ascii.py:decode:26]
[ERROR  ] Metadata extraction failed (see previous error message, set datalad.runtime.raiseonerror=yes to fail immediately) [aggregate_metadata(/mnt/btrfs/datasets-meta6-1-redo1/datalad/crawl/openfmri/ds000030)]
aggregate_metadata(error): /mnt/btrfs/datasets-meta6-1-redo1/datalad/crawl/openfmri/ds000030 [Metadata extraction failed (see previous error message, set datalad.runtime.raiseonerror=yes to fail immediately)]
  • 53
[INFO   ] Aggregate metadata for dataset /mnt/btrfs/datasets-meta6-1-redo1/datalad/crawl/openfmri/ds000053
[ERROR  ] Failed to get dataset metadata (bids): 'ascii' codec can't decode byte 0xc2 in position 1188: ordinal not in range(128) [ascii.py:decode:26]
[ERROR  ] Metadata extraction failed (see previous error message, set datalad.runtime.raiseonerror=yes to fail immediately) [aggregate_metadata(/mnt/btrfs/datasets-meta6-1-redo1/datalad/crawl/openfmri/ds000053)]
aggregate_metadata(error): /mnt/btrfs/datasets-meta6-1-redo1/datalad/crawl/openfmri/ds000053 [Metadata extraction failed (see previous error message, set datalad.runtime.raiseonerror=yes to fail immediately)]
  • 117
[INFO   ] Aggregate metadata for dataset /mnt/btrfs/datasets-meta6-1-redo1/datalad/crawl/openfmri/ds000117
[ERROR  ] Failed to get dataset metadata (bids): 'ascii' codec can't decode byte 0xe2 in position 1585: ordinal not in range(128) [ascii.py:decode:26]
[ERROR  ] Metadata extraction failed (see previous error message, set datalad.runtime.raiseonerror=yes to fail immediately) [aggregate_metadata(/mnt/btrfs/datasets-meta6-1-redo1/datalad/crawl/openfmri/ds000117)]
aggregate_metadata(error): /mnt/btrfs/datasets-meta6-1-redo1/datalad/crawl/openfmri/ds000117 [Metadata extraction failed (see previous error message, set datalad.runtime.raiseonerror=yes to fail immediately)]
  • 140 - description gets dropped only because the README is large, since it includes the output of bids-validator. Doing nothing about that for now
[INFO   ] Aggregate metadata for dataset /mnt/btrfs/datasets-meta6-1-redo1/datalad/crawl/openfmri/ds000140
[INFO   ] Removed metadata field(s) due to blacklisting and max size settings: set(['description'])
  • 164
[INFO   ] Aggregate metadata for dataset /mnt/btrfs/datasets-meta6-1-redo1/datalad/crawl/openfmri/ds000164
[ERROR  ] Failed to get dataset metadata (bids): 'ascii' codec can't decode byte 0xc3 in position 1244: ordinal not in range(128) [ascii.py:decode:26]
[ERROR  ] Metadata extraction failed (see previous error message, set datalad.runtime.raiseonerror=yes to fail immediately) [aggregate_metadata(/mnt/btrfs/datasets-meta6-1-redo1/datalad/crawl/openfmri/ds000164)]
aggregate_metadata(error): /mnt/btrfs/datasets-meta6-1-redo1/datalad/crawl/openfmri/ds000164 [Metadata extraction failed (see previous error message, set datalad.runtime.raiseonerror=yes to fail immediately)]
  • 201
[INFO   ] Aggregate metadata for dataset /mnt/btrfs/datasets-meta6-1-redo1/datalad/crawl/openfmri/ds000201
[WARNING] Failed to load participants info due to: "delimiter" must be string, not unicode [csv.py:__init__:79]. Skipping the rest of file
  • 214
[INFO   ] Aggregate metadata for dataset /mnt/btrfs/datasets-meta6-1-redo1/datalad/crawl/openfmri/ds000214
[ERROR  ] Failed to get dataset metadata (bids): 'ascii' codec can't decode byte 0xe2 in position 1412: ordinal not in range(128) [ascii.py:decode:26]
[ERROR  ] Metadata extraction failed (see previous error message, set datalad.runtime.raiseonerror=yes to fail immediately) [aggregate_metadata(/mnt/btrfs/datasets-meta6-1-redo1/datalad/crawl/openfmri/ds000214)]
aggregate_metadata(error): /mnt/btrfs/datasets-meta6-1-redo1/datalad/crawl/openfmri/ds000214 [Metadata extraction failed (see previous error message, set datalad.runtime.raiseonerror=yes to fail immediately)]
  • 216
[INFO   ] Aggregate metadata for dataset /mnt/btrfs/datasets-meta6-1-redo1/datalad/crawl/openfmri/ds000216
[ERROR  ] Failed to get dataset metadata (bids): 'ascii' codec can't decode byte 0xc3 in position 1232: ordinal not in range(128) [ascii.py:decode:26]
[ERROR  ] Metadata extraction failed (see previous error message, set datalad.runtime.raiseonerror=yes to fail immediately) [aggregate_metadata(/mnt/btrfs/datasets-meta6-1-redo1/datalad/crawl/openfmri/ds000216)]
aggregate_metadata(error): /mnt/btrfs/datasets-meta6-1-redo1/datalad/crawl/openfmri/ds000216 [Metadata extraction failed (see previous error message, set datalad.runtime.raiseonerror=yes to fail immediately)]
  • 218 - it is in a somewhat screwy state... For now manually unannexed and git-added the top-level text files. Fixed the participants.tsv header so it no longer has a trailing tab
[INFO   ] Aggregate metadata for dataset /mnt/btrfs/datasets-meta6-1-redo1/datalad/crawl/openfmri/ds000218
[WARNING] Could not determine file-format, assuming TSV
  • 221 - a unicode whitespace was used to separate fields in Authors. Sent a patch upstream as well
[INFO   ] Aggregate metadata for dataset /mnt/btrfs/datasets-meta6-1-redo1/datalad/crawl/openfmri/ds000221
[ERROR  ] Failed to get dataset metadata (bids): No JSON object could be decoded [decoder.py:raw_decode:382]
[ERROR  ] Metadata extraction failed (see previous error message, set datalad.runtime.raiseonerror=yes to fail immediately) [aggregate_metadata(/mnt/btrfs/datasets-meta6-1-redo1/datalad/crawl/openfmri/ds000221)]
aggregate_metadata(error): /mnt/btrfs/datasets-meta6-1-redo1/datalad/crawl/openfmri/ds000221 [Metadata extraction failed (see previous error message, set datalad.runtime.raiseonerror=yes to fail immediately)]
  • 223 - just a single column in participants.tsv -- useless
[INFO   ] Aggregate metadata for dataset /mnt/btrfs/datasets-meta6-1-redo1/datalad/crawl/openfmri/ds000223
[WARNING] Could not determine file-format, assuming TSV
  • 224
[INFO   ] Aggregate metadata for dataset /mnt/btrfs/datasets-meta6-1-redo1/datalad/crawl/openfmri/ds000224
[ERROR  ] Failed to get dataset metadata (bids): 'ascii' codec can't decode byte 0xe2 in position 1805: ordinal not in range(128) [ascii.py:decode:26]
[ERROR  ] Metadata extraction failed (see previous error message, set datalad.runtime.raiseonerror=yes to fail immediately) [aggregate_metadata(/mnt/btrfs/datasets-meta6-1-redo1/datalad/crawl/openfmri/ds000224)]
aggregate_metadata(error): /mnt/btrfs/datasets-meta6-1-redo1/datalad/crawl/openfmri/ds000224 [Metadata extraction failed (see previous error message, set datalad.runtime.raiseonerror=yes to fail immediately)]
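
Most of the failures above are the same thing: non-ASCII bytes run through the default 'ascii' codec, plus the odd trailing tab in a participants.tsv header. Below is a minimal sketch of how such files could be read defensively. It is not the actual bids parser code; it assumes the files are UTF-8 and is written for Python 3, whereas the logs above come from Python 2, whose csv module wants byte strings (hence the "delimiter" complaint for 201). The names `read_text_utf8`, `load_participants` and `load_dataset_description` are made up for illustration.

```python
import csv
import io
import json


def read_text_utf8(path):
    """Read a file as text, decoding explicitly as UTF-8.

    Assumption: the file is UTF-8. Passing the encoding explicitly avoids
    falling back to a locale default such as ASCII.
    """
    with io.open(path, encoding="utf-8") as f:
        return f.read()


def load_participants(path):
    """Parse a participants.tsv into a list of dicts (one per data row).

    Empty trailing header fields, e.g. from a trailing tab in the header
    line, are dropped so they do not turn into a bogus empty column.
    """
    with io.open(path, encoding="utf-8", newline="") as f:
        rows = list(csv.reader(f, delimiter="\t"))
    if not rows:
        return []
    header = rows[0]
    while header and not header[-1].strip():
        header.pop()
    # zip() stops at the shorter sequence, so extra trailing cells in a
    # data row are ignored together with the dropped header fields.
    return [dict(zip(header, row)) for row in rows[1:]]


def load_dataset_description(path):
    """Load a JSON file (e.g. dataset_description.json) decoded as UTF-8."""
    return json.loads(read_text_utf8(path))
```

This would not help with 221, where the JSON itself was malformed and needed fixing in the dataset.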