-
Notifications
You must be signed in to change notification settings - Fork 127
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add --skiptop flag to prevent writing top level bids files #344
Conversation
This is useful in batch processing to prevent potential race conditions. The batch processing example has also been modified to use this flag.
Codecov Report
@@ Coverage Diff @@
## master #344 +/- ##
==========================================
+ Coverage 74.24% 74.28% +0.04%
==========================================
Files 35 35
Lines 2671 2687 +16
==========================================
+ Hits 1983 1996 +13
- Misses 688 691 +3
Continue to review full report at Codecov.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We better avoid extending the cmdline list of options. I suggested a possible solution for that
heudiconv/cli/run.py
Outdated
parser.add_argument('--skiptop', action='store_true', | ||
help='Skip creating of top-level bids files. ' | ||
'Useful when running in batch mode to prevent ' | ||
'possible race conditions.') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
our --help is already LONG enough and I expect that only more options would come to alter BIDS handling behavior. What if we "convert" --bids
option into something more useful?
Especially since I think we are no longer using positional args, I think it should be easy to do:
Here is a prototype showing that we could use `args='*'` mode:
~/proj/repronim/reproin > cat /tmp/argparse_.py
__version__ = '1'
from argparse import ArgumentParser
docstr = ("""Example:
heudiconv -d rawdata/{subject} -o . -f heuristic.py -s s1 s2 s3""")
parser = ArgumentParser(description=docstr)
parser.add_argument('--version', action='version', version=__version__)
parser.add_argument('-b', '--bids', nargs='*',
choices=['notop', 'nostriptimes'],
help="""\
flag for output into BIDS structure. Additional options which
could be specified: notop - no top level metadata-files,
nostriptimes - do not strip times from side-car files""")
out = parser.parse_args()
print(out)
~/proj/repronim/reproin > python /tmp/argparse_.py --help
usage: argparse_.py [-h] [--version]
[-b [{notop,nostriptimes} [{notop,nostriptimes} ...]]]
Example: heudiconv -d rawdata/{subject} -o . -f heuristic.py -s s1 s2 s3
optional arguments:
-h, --help show this help message and exit
--version show program's version number and exit
-b [{notop,nostriptimes} [{notop,nostriptimes} ...]], --bids [{notop,nostriptimes} [{notop,nostriptimes} ...]]
flag for output into BIDS structure. Additional
options which could be specified: notop - no top level
metadata-files, nostriptimes - do not strip times from
side-car files
~/proj/repronim/reproin > python /tmp/argparse_.py
Namespace(bids=None)
~/proj/repronim/reproin > python /tmp/argparse_.py -b
Namespace(bids=[])
~/proj/repronim/reproin > python /tmp/argparse_.py -b notop
Namespace(bids=['notop'])
~/proj/repronim/reproin > python /tmp/argparse_.py -b notop -b nostriptimes
Namespace(bids=['nostriptimes'])
~/proj/repronim/reproin > python /tmp/argparse_.py -b notop nostriptimes
Namespace(bids=['notop', 'nostriptimes'])
~/proj/repronim/reproin > python /tmp/argparse_.py -b notop nostriptimes nonsense
usage: argparse_.py [-h] [--version]
[-b [{notop,nostriptimes} [{notop,nostriptimes} ...]]]
argparse_.py: error: argument -b/--bids: invalid choice: 'nonsense' (choose from 'notop', 'nostriptimes')
so we can discriminate when it is not specified (None
-- current False
) for when specified but without options ([]
-- current True). So, if instead of carrying around a bool bids
we just carry bids
a list of options with an empty list for False
, ['no-options']
if no options were provided, and otherwise just a full list of options, we could later add other options and remain compatible while testing if bids:
and check for options if bids and 'notop' not in bids:
where needed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok I restructured it with your suggestion. Internally, I set args.bids to either True of False and add args.bids_options, but from the point of view of the user it works as you suggested.
the flag for skipping the top directory is now called notop.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
THANK YOU!
- what about
--bids help
instead of yet another option?
heudiconv/cli/run.py
Outdated
@@ -222,6 +253,8 @@ def get_parser(): | |||
help='Additional queue arguments passed as ' | |||
'single string of Argument=Value pairs space ' | |||
'separated.') | |||
parser.add_argument('--help-bids', action='store_true', dest='help_bids', | |||
help='Display bids specific help.') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why not to just add "help" option to --bids
and react on it? part of the goal in my suggestion was not to expand the cmdline API options list more ;-)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah yes that's a good ideal. Will do.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Refactored as per @mgxd's suggestion and just included all the help in the argparse help field
heudiconv/convert.py
Outdated
@@ -80,7 +80,7 @@ def conversion_info(subject, outdir, info, filegroup, ses): | |||
|
|||
def prep_conversion(sid, dicoms, outdir, heuristic, converter, anon_sid, | |||
anon_outdir, with_prov, ses, bids, seqinfo, min_meta, | |||
overwrite, dcmconfig): | |||
overwrite, dcmconfig, bids_options): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
that is what I was hoping to avoid -- needing to eventually add/adjust signature of every function which now gets bids
but might eventually need to get access to bids_options
...
but may be making it more obvious at the cost of some duplication would be better than my more obscure suggestion... Let's see what others thing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm I see what you're saying. I wanted to keep bids as a boolean because I think it's more obvious to people and prevents us having to check every usage of args.bids. (e.g., if bids is an empty list, we would want to calculate bids stuff, but if args.bids:
would no longer work.) Therefore a change is required in either case, and I feel the bids
+ bids_options
method is more explicit and clear. I'm curious what others thing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I'd like to avoid additional parameters to the signatures whenever possible. Though if we're changing the variable type + intention, it may be worthwhile to rename it to something more readable - I like bids_options
or bids_opts
. Most of the current checks can be ported rather easily
if args.bids_opts is not None: # if args.bids
if ['notopdirs'] in args.bids_opts: # special case
...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Refactored to use bids_options internally
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for taking this on @stilley2, we get excited whenever a new contributor submits a PR! I like @yarikoptic's suggestion of extending the --bids
argument to handle these BIDS-specific cases
If you run into any problems/issues, don't hesitate to ask either of us!
heudiconv/convert.py
Outdated
@@ -80,7 +80,7 @@ def conversion_info(subject, outdir, info, filegroup, ses): | |||
|
|||
def prep_conversion(sid, dicoms, outdir, heuristic, converter, anon_sid, | |||
anon_outdir, with_prov, ses, bids, seqinfo, min_meta, | |||
overwrite, dcmconfig): | |||
overwrite, dcmconfig, bids_options): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I'd like to avoid additional parameters to the signatures whenever possible. Though if we're changing the variable type + intention, it may be worthwhile to rename it to something more readable - I like bids_options
or bids_opts
. Most of the current checks can be ported rather easily
if args.bids_opts is not None: # if args.bids
if ['notopdirs'] in args.bids_opts: # special case
...
heudiconv/cli/run.py
Outdated
@@ -98,9 +103,20 @@ def process_extra_commands(outdir, args): | |||
return | |||
|
|||
|
|||
def help_bids(): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This can be added to the argument's desc
field instead
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added bids options information to the help
field (as far as I can tell desc
is only parser level).
heudiconv/cli/run.py
Outdated
@@ -112,6 +128,19 @@ def main(argv=None): | |||
random.seed(args.random_seed) | |||
import numpy | |||
numpy.random.seed(args.random_seed) | |||
# Ensure only supported bids options are passed | |||
allowed_options = [option for option, _ in BIDS_OPTIONS] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
argparse's choices
will check if each input is valid, so this isn't needed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
Internal functions now check whether bids_options is None instead of checking the boolean bids. Also place bids options related help in the help field of the argparse "bids" argument.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking good! This is almost ready, we just need to add a test confirming the intended notop
behavior
a possible idea to followup with #349 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks good to me - @yarikoptic any other comments?
I think not |
This is largely a bug fix. Metadata and order of `_key-value` fields in BIDS could change from the result of converting using previous versions, thus minor version boost. 14 people contributed to this release -- thanks [everyone](https://github.com/nipy/heudiconv/graphs/contributors)! Enhancement - Use [etelemetry](https://pypi.org/project/etelemetry) to inform about most recent available version of heudiconv. Please set `NO_ET` environment variable if you want to disable it ([#369][]) - BIDS: - `--bids` flag became an option. It can (optionally) accept `notop` value to avoid creation of top level files (`CHANGES`, `dataset_description.json`, etc) as a workaround during parallel execution to avoid race conditions etc. ([#344][]) - Generate basic `.json` files with descriptions of the fields for `participants.tsv` and `_scans.tsv` files ([#376][]) - Use `filelock` while writing top level files. Use `HEUDICONV_FILELOCK_TIMEOUT` environment to change the default timeout value ([#348][]) - `_PDT2` was added as a suffix for multi-echo (really "multi-modal") sequences ([#345][]) - Calls to `dcm2niix` would include full output path to make it easier to discern in the logs what file it is working on ([#351][]) - With recent [datalad]() (>= 0.10), created DataLad dataset will use `--fake-dates` functionality of DataLad to not leak data conversion dates, which might be close to actual data acquisition/patient visit ([#352][]) - Support multi-echo EPI `_phase` data ([#373][] fixes [#368][]) - Log location of a bad .json file to ease troubleshooting ([#379][]) - Add basic pypi classifiers for the package ([#380][]) Fixed - Sorting `_scans.tsv` files lacking valid dates field should not cause a crash ([#337][]) - Multi-echo files detection based number of echos ([#339][]) - BIDS - Use `EchoTimes` from the associated multi-echo files if `EchoNumber` tag is missing ([#366][] fixes [#347][]) - Tolerate empty ContentTime and/or ContentDate in DICOMs ([#372][]) and place "n/a" if value is missing ([#390][]) - Do not crash and store original .json file is "JSON pretification" fails ([#342][]) - ReproIn heuristic - tollerate WIP prefix on Philips scanners ([#343][]) - allow for use of `(...)` instead of `{...}` since `{}` are not allowed ([#343][]) - Support pipolar fieldmaps by providing them with `_epi` not `_magnitude`. "Loose" BIDS `_key-value` pairs might come now after `_dir-` even if they came first before ([#358][] fixes [#357][]) - All heuristics saved under `.heudiconv/` under `heuristic.py` name, to avoid discrepancy during reconversion ([#354][] fixes [#353][]) - Do not crash (with TypeError) while trying to sort absent file list ([#360][]) - heudiconv requires nipype >= 1.0.0 ([#364][]) and blacklists `1.2.[12]` ([#375][]) * tag 'v0.6.0': (60 commits) Version boost to 0.6.0 DOC: populate detailed changelog for 0.6.0 and tune up formatting in previous one Fix miscellaneous typos in ReproIn heuristic file. BF: fix check for the sbatch (SLURM) not being available ENH: make test-compare-two-versions take any two worktrees, and just show diff if results already known Update heudiconv/convert.py apply @mgxd 's suggestions, adding a warning and a timeout environment variable need str typecast Use empty string not None Empty acq_time results in empty cell not 'n/a' DOC: Clarify tarball session handling remove repetitive import statement respond to review - add explicit py2 check - change file saving strategy - use logger instead of print fix remaning py2 errors MNT: Add Python support metadata to package fix some python2/3 incompatibilities add return data (accidently removed return) make content unicode test that load_json provides filename if invalid explicitly name invalid json ...
[0.6.0] - 2019-12-16 This is largely a bug fix. Metadata and order of `_key-value` fields in BIDS could change from the result of converting using previous versions, thus minor version boost. 14 people contributed to this release -- thanks [everyone](https://github.com/nipy/heudiconv/graphs/contributors)! Enhancement - Use [etelemetry](https://pypi.org/project/etelemetry) to inform about most recent available version of heudiconv. Please set `NO_ET` environment variable if you want to disable it ([#369][]) - BIDS: - `--bids` flag became an option. It can (optionally) accept `notop` value to avoid creation of top level files (`CHANGES`, `dataset_description.json`, etc) as a workaround during parallel execution to avoid race conditions etc. ([#344][]) - Generate basic `.json` files with descriptions of the fields for `participants.tsv` and `_scans.tsv` files ([#376][]) - Use `filelock` while writing top level files. Use `HEUDICONV_FILELOCK_TIMEOUT` environment to change the default timeout value ([#348][]) - `_PDT2` was added as a suffix for multi-echo (really "multi-modal") sequences ([#345][]) - Calls to `dcm2niix` would include full output path to make it easier to discern in the logs what file it is working on ([#351][]) - With recent [datalad]() (>= 0.10), created DataLad dataset will use `--fake-dates` functionality of DataLad to not leak data conversion dates, which might be close to actual data acquisition/patient visit ([#352][]) - Support multi-echo EPI `_phase` data ([#373][] fixes [#368][]) - Log location of a bad .json file to ease troubleshooting ([#379][]) - Add basic pypi classifiers for the package ([#380][]) Fixed - Sorting `_scans.tsv` files lacking valid dates field should not cause a crash ([#337][]) - Multi-echo files detection based number of echos ([#339][]) - BIDS - Use `EchoTimes` from the associated multi-echo files if `EchoNumber` tag is missing ([#366][] fixes [#347][]) - Tolerate empty ContentTime and/or ContentDate in DICOMs ([#372][]) and place "n/a" if value is missing ([#390][]) - Do not crash and store original .json file is "JSON pretification" fails ([#342][]) - ReproIn heuristic - tolerate WIP prefix on Philips scanners ([#343][]) - allow for use of `(...)` instead of `{...}` since `{}` are not allowed ([#343][]) - Support pipolar fieldmaps by providing them with `_epi` not `_magnitude`. "Loose" BIDS `_key-value` pairs might come now after `_dir-` even if they came first before ([#358][] fixes [#357][]) - All heuristics saved under `.heudiconv/` under `heuristic.py` name, to avoid discrepancy during reconversion ([#354][] fixes [#353][]) - Do not crash (with TypeError) while trying to sort absent file list ([#360][]) - heudiconv requires nipype >= 1.0.0 ([#364][]) and blacklists `1.2.[12]` ([#375][]) * tag 'v0.6.0': Boost perspective release date in changelog to today ENH(TST): Fix version to older pytest to ease backward compatibility testing RF: use tmpdir not tmp_path fixture FIX: minor typo in CHANGELOG.md
This is useful in batch processing to prevent potential race
conditions. The batch processing example has also been
modified to use this flag.
See #340