Upgrade bids.py metadata extractor #94
Comments
I wholeheartedly support this initiative! FWIW, regarding getting content: for use/integration with datalad-fuse I am thinking of adding a config variable to metalad to bypass getting content by metalad, since it would be available (only the needed portions of the file) via fuse. It might need, though, a mode which would first query datalad-fuse on whether it can access that file's data, and get it only when it can't (e.g. a file on some fancy special remote that datalad-fuse has no clue how to access via fsspec).
Oh wow, didn't know about datalad-fuse and now I do, great! I'm guessing that functionality (check whether possible to get partial content via fuse, otherwise |
Maybe ;-) In principle it could be a mode of operation of datalad-fuse, so that if it fails to access a file via fsspec, it would just get it in full. Then metalad wouldn't need to deal with that.
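A minimal sketch of that fallback mode, under loud assumptions: `read_with_fallback`, `read_partial`, and `get_full` are hypothetical names for illustration, not datalad-fuse or metalad API. `read_partial` stands in for an fsspec-backed partial read, and `get_full` for whatever retrieves the complete file.

```python
from typing import Callable


def read_with_fallback(
    path: str,
    read_partial: Callable[[str, int], bytes],
    get_full: Callable[[str], None],
    size: int = 1024,
) -> bytes:
    """Return the first `size` bytes of `path`.

    Try the cheap partial read first (hypothetical stand-in for fsspec
    access); if it raises OSError, fetch the whole file and read locally.
    """
    try:
        return read_partial(path, size)
    except OSError:
        # Partial access is not possible (e.g. an unsupported special
        # remote), so fall back to retrieving the file in full.
        get_full(path)
        with open(path, "rb") as f:
            return f.read(size)
```

With this shape the caller (here, the extractor) never needs to know which path was taken, which is the point made above about metalad not having to deal with it.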
I thought about the same but still hope we could avoid that. |
Here's a sample of output from the BOLD5000 dataset that I ran locally:
yields:
It contains all the fields extracted by the existing bids extractor, and additionally:
The extra info from Lastly, the above all runs on the light-weight datalad dataset, i.e. does not require local access to annexed content (assuming a |
New No PR yet since I am uncertain about:
|
TODO for @jsheunis:
|
Now that metalad 0.3.0 is released, you could add your bids extractor to the datalad-metalad repo, if you want. Feel free to create a PR against |
Thanks. AFAIK it probably still makes sense to keep the |
This PR is in response to #94. It adds a dataset-level extractor for BIDS datasets, called bids_dataset, that:
- uses gen4 metadata handling with datalad-metalad (and introduces this dependency, datalad-metalad>=0.3.1)
- ensures the required files dataset_description.json and participants.tsv are available locally before proceeding with the extraction process
- does not require locally available file content, other than the above mentioned or README text file content, which it makes part of the extraction output (automatically running get where applicable)
- is compatible with pybids>=0.15.1 and BIDS v1.6.0
- does not change the existing bids extractor in any way
- extracts extra information about the BIDS dataset (compared to the existing bids extractor), including information about subjects, sessions, runs, tasks, entities, and variables.
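The "ensure required files first" step might look roughly like the sketch below. This is not the PR's code: `ensure_required_files`, `extract_dataset_level`, and the `fetch` callable are hypothetical, with `fetch` standing in for whatever retrieves annexed content. Only the standard library is used, so pybids is deliberately left out.

```python
import csv
import json
from pathlib import Path

# The two files the extractor needs locally before it can proceed.
REQUIRED_FILES = ("dataset_description.json", "participants.tsv")


def ensure_required_files(dataset_root, fetch):
    """Make sure each required file has local content.

    `fetch` is a hypothetical callable that retrieves one file. In an
    annexed dataset an unretrieved file is a dangling symlink, for which
    Path.exists() returns False.
    """
    root = Path(dataset_root)
    for name in REQUIRED_FILES:
        target = root / name
        if not target.exists():
            fetch(target)
        if not target.exists():
            raise FileNotFoundError(f"required BIDS file unavailable: {name}")
    return [root / name for name in REQUIRED_FILES]


def extract_dataset_level(dataset_root, fetch):
    """Collect a few dataset-level fields from the two required files."""
    desc_path, participants_path = ensure_required_files(dataset_root, fetch)
    description = json.loads(desc_path.read_text(encoding="utf-8"))
    with participants_path.open(newline="", encoding="utf-8") as f:
        rows = list(csv.DictReader(f, delimiter="\t"))
    return {
        "name": description.get("Name"),
        "bids_version": description.get("BIDSVersion"),
        "subjects": [row["participant_id"] for row in rows],
    }
```

Everything else in the dataset stays untouched, which matches the stated goal of not requiring locally available file content beyond these files.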
It would be useful if the bids extractor (and for some points, eventually all other extractors in this extension) could:
`datalad.metadata.*`)
`datalad get`ting any file content that might be necessary for further extraction (currently the extraction starts by getting all required file content...)

I've made a start at this. I'm working on this within the context of the catalog: likely many of our future users will be working with BIDS data and would want to extract BIDS metadata and have it rendered in the catalog. So I have an idea of the BIDS-related metadata that would be useful in the catalog, but I'm keen to get input from other @datalad/developers if there are features that you think will be useful to include.