ENH: Adds "AcquisitionTime" to the seqinfo #487

Merged (5 commits, Jan 13, 2021)

Conversation

@pvelasco (Contributor) commented Jan 6, 2021

This commit adds the "AcquisitionTime" to the seqinfo, so that it is available for the heuristic classification.

Use case:
At least on Siemens scanners, for anatomical images (MPRAGE, TSE), when you select the images to be normalized (i.e., corrected for coil sensitivity bias), the scanner also lets you save the original images as a separate series.

If that is selected, we will have two series of DICOM files corresponding to the same run, with the only difference being the intensity normalization. If we were to extract both series in a BIDS compliant way, we should probably use the label rec- to distinguish between them (e.g., rec-normalized vs. rec-original).

However, it seems that most BIDS Apps (e.g., MRIQC and fMRIPrep) process all T1w and T2w images in the anat folder as distinct inputs. For example, fMRIPrep averages all T1w/T2w images and then does the cortical segmentation. If we had normalized/original pairs of images stored as two different BIDS files, this would be a problem.

To avoid this situation, our heuristic file could save only one of the image types (say, the normalized one). However, if the user chose not to normalize the images, no BIDS files would be extracted.

By adding the "AcquisitionTime" to the seqinfo, one can write a heuristic file that checks whether two series have the same acquisition time and, if one has NORM in the image_type but the other doesn't, concludes that they come from the same run and decides accordingly. (In my case, I only save the normalized image in the subject's BIDS folder, while keeping both images in sourcedata for backup.) If only one image type is present for a given run, it is saved. So, regardless of the normalization setting, we end up with only one NIfTI file per run in the BIDS output (well, unless there is more than one part- or echo-).
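For illustration, a minimal heuristic sketch of this logic (a sketch only: the key template, the create_key helper, and the 'mprage' filter are illustrative, and the only seqinfo fields assumed are protocol_name, image_type, series_id, date, and the new time):

def create_key(template, outtype=('nii.gz',), annotation_classes=None):
    return template, outtype, annotation_classes

def infotodict(seqinfo):
    # Keep one T1w per scanner run; prefer the NORM'ed series when both exist.
    t1w = create_key('sub-{subject}/anat/sub-{subject}_run-{item:02d}_T1w')
    info = {t1w: []}

    # Series acquired in the same run share the same AcquisitionTime.
    runs = {}
    for s in seqinfo:
        if 'mprage' in s.protocol_name.lower():
            runs.setdefault((s.date, s.time), []).append(s)

    for series in runs.values():
        normalized = [s for s in series if 'NORM' in s.image_type]
        chosen = normalized[0] if normalized else series[0]
        info[t1w].append(chosen.series_id)
    return info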

@yarikoptic (Member)

wouldn't those images have the same series_uid (thus allowing to "group" them, removing the need for AcquisitionTime) and have a different image_type (and we even already have .is_derived and .is_motioncorrected) to assist with deciding whether to add _rec etc. entities? We already do something like that in https://github.com/nipy/heudiconv/blob/master/heudiconv/heuristics/reproin.py#L676 .

Also note: if we are to add a field to seqinfo, we should append it. Some heuristics might use positional indexing, so we should not displace the fields we already have, but rather extend them, as in the toy example below.
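A toy illustration of why appending matters (field lists abbreviated and purely illustrative):

from collections import namedtuple

# Layout an existing heuristic may rely on by position:
OldSeqInfo = namedtuple('SeqInfo', ['series_id', 'dim1', 'image_type'])

# Appending 'time' keeps old positional indices valid; inserting it in the
# middle would silently change what s[2] means.
NewSeqInfo = namedtuple('SeqInfo', ['series_id', 'dim1', 'image_type', 'time'])

s = NewSeqInfo('16-t1_mprage', 256, ('ORIGINAL', 'PRIMARY', 'M', 'ND'), '131526.045000')
assert s[2] == s.image_type  # still the same field as before the addition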

@codecov bot commented Jan 6, 2021

Codecov Report

Merging #487 (2768e7a) into master (d8e5c5f) will increase coverage by 0.02%.
The diff coverage is 100.00%.


@@            Coverage Diff             @@
##           master     #487      +/-   ##
==========================================
+ Coverage   76.01%   76.03%   +0.02%     
==========================================
  Files          38       38              
  Lines        3010     3013       +3     
==========================================
+ Hits         2288     2291       +3     
  Misses        722      722              
Impacted Files                  Coverage Δ
heudiconv/dicoms.py             81.14% <ø> (ø)
heudiconv/utils.py              89.91% <ø> (ø)
heudiconv/tests/test_main.py    100.00% <100.00%> (ø)

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data

@pvelasco (Contributor, Author) commented Jan 6, 2021

It looks like Siemens creates a different .series_uid for the normalized and original series. Also, the flags .is_derived and .is_motioncorrected are the same for both series (False in both cases).

As far as I can tell, unless we check the AcquisitionTime, there is no way of telling if two series of DICOMs, one with NORM in image_type and the other without it, correspond to the same scanner run or to a different run.

Here is my dicominfo.tsv (including AcquisitionTime (time)):

total_files_till_now	example_dcm_file		series_id			dcm_dir_name		series_files	unspecified	dim1	dim2	dim3	dim4	TR	TE	protocol_name		is_motion_corrected	is_derived	patient_id	study_description	referring_physician_name	series_description	sequence_name	image_type					accession_number	patient_age	patient_sex	date		time		series_uid
240	tmp+16+t1_mprage_sag_p2_.9mm_01+00001.dcm	16-t1_mprage_sag_p2_.9mm	16+t1_mprage_sag_p2_.9mm	240			256	256	240	1	2.3	2.32	t1_mprage_sag_p2_.9mm	False			False		jnxivpsm	Investigators^Dev	Dev				t1_mprage_sag_p2_.9mm	*tfl3d1_16ns	('ORIGINAL', 'PRIMARY', 'M', 'ND')		jwshh7L4		None		None		20190703	131526.045000	1.3.12.2.1107.5.2.43.166018.2019070313205172302209312.0.0.0
480	tmp+17+t1_mprage_sag_p2_.9mm_01+00001.dcm	17-t1_mprage_sag_p2_.9mm	17+t1_mprage_sag_p2_.9mm	240			256	256	240	1	2.3	2.32	t1_mprage_sag_p2_.9mm	False			False		jnxivpsm	Investigators^Dev	Dev				t1_mprage_sag_p2_.9mm	*tfl3d1_16ns	('ORIGINAL', 'PRIMARY', 'M', 'ND', 'NORM')	jwshh7L4		None		None		20190703	131526.045000	1.3.12.2.1107.5.2.43.166018.2019070313152451395509099.0.0.0

If it weren't for the AcquisitionTime, I could not tell if these were the same run, or two runs with identical parameters (dimensions, etc.) including the same protocol_name, but one for which the user selected "normalize" (without saving the originals) and the other without normalization.
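For example, reading one file from each series directly with pydicom shows the same pattern (paths illustrative):

import pydicom

# One file from the original series and one from the intensity-normalized series.
orig = pydicom.dcmread('16+t1_mprage_sag_p2_.9mm/00001.dcm', stop_before_pixels=True)
norm = pydicom.dcmread('17+t1_mprage_sag_p2_.9mm/00001.dcm', stop_before_pixels=True)

print(orig.SeriesInstanceUID == norm.SeriesInstanceUID)    # False: different series UIDs
print(orig.AcquisitionTime == norm.AcquisitionTime)        # True: same scanner run
print('NORM' in orig.ImageType, 'NORM' in norm.ImageType)  # False True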

@yarikoptic (Member)

Hm, weird. IMHO there must be some UID which is the same between the two! I will look around later for similar cases I have data for, or is this some phantom/shareable data?

Commit: To preserve backwards compatibility
@yarikoptic (Member)

Let's ask the expert: @neurolabusc Chris -- is there a better way (besides AcquisitionTime) to identify the group of a "base" image plus all its NORM'ed ones? Apparently (Siemens in our case) a unique SOPInstanceUID is assigned to the NORM'ed ones. Shouldn't there be some UID that is shared by all (the base and any derivative) but unique to that particular data acquisition?

@neurolabusc

It is really up to the manufacturer to decide these details, and therefore I think you will need bespoke solutions for GE, Siemens, and Philips. I can see the logic in referring to both the raw T1 scan and the T1 scan whose image intensity is corrected for field inhomogeneity as original. In my mind, diffusion images like ADC, MD, Trace, etc. are all derived, as each is generated from a set of different scans. Choosing a different series UID ensures that tools do not collapse these into one image; e.g., on a Siemens console a user can choose to save both the homogeneity-normalized and the raw scan, but one wants to be able to distinguish them. To understand the danger, consider some Siemens fieldmap sequences where the same Series UID and image instance number are applied to the first and second echo: this choice caused widespread data loss, as many archiving systems saved some slices from the first echo and other slices from the second echo, discarding the "duplicates".

I think dcm2niix is faithfully doing its job here. If the user selects to save both the raw and the normalized T1 scan on the console, both are saved. It is unclear what their intention is here; it is really the role of tools like heudiconv to apply user-specific heuristics.

@@ -89,6 +89,7 @@ def create_seqinfo(mw, series_files, series_id):
         patient_age=dcminfo.get('PatientAge'),
         patient_sex=dcminfo.get('PatientSex'),
         date=dcminfo.get('AcquisitionDate'),
+        time=dcminfo.get('AcquisitionTime'),
         series_uid=dcminfo.get('SeriesInstanceUID')
Review comment on the diff (Member):

place time last here too?

Reply (Contributor, Author):

You're right. Will do shortly.

@yarikoptic (Member)

Thank you @neurolabusc!
@pvelasco - it seems your approach is the only "easy" way forward we have ATM! Please fix up the ordering, adjust some basic unit test to ensure that we have time in that (last) position (we already have some tests that operate on sample DICOMs), and let's get this merged.
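A minimal sketch of such a check, assuming SeqInfo stays a namedtuple exported from heudiconv.utils and the new field is appended last (the test name is illustrative):

from heudiconv.utils import SeqInfo

def test_time_is_last_seqinfo_field():
    # Heuristics that index seqinfo entries positionally must not be affected
    # by the newly added field, so it has to come last.
    assert SeqInfo._fields[-1] == 'time'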

@pvelasco (Contributor, Author) commented Jan 8, 2021

Not sure if this is what you had in mind...

0fb0c2f

(Note: @yarikoptic, I'm not sure why the CI fails for Python 3.5. The error doesn't make sense to me. Maybe it can be forced to be run again?)

@yarikoptic (Member)

FTR
=================================== FAILURES ===================================
__________________________________ test_cache __________________________________
heudiconv/tests/test_main.py:287: in test_cache
    with open(cachedir / 'dicominfo.tsv', 'r') as f:
E   TypeError: invalid file: local('/tmp/pytest-of-travis/pytest-0/test_cache0/.heudiconv/S01/info/dicominfo.tsv')
------------------------------ Captured log call -------------------------------
main.py                    264 INFO     Running heudiconv version 0.9.0 latest 0.9.0
dicoms.py                  179 INFO     Analyzing 1 dicoms
dicoms.py                  303 INFO     Generated sequence info for 1 studies with 1 entries total
parser.py                  198 WARNING  Heuristic is missing an `infotoids` method, assigning empty method and using provided subject id S01. Provide `session` and `locator` fields for best results.
parser.py                  223 INFO     Study session for StudySessionInfo(locator=None, session=None, subject='S01')
main.py                    291 INFO     Need to process 1 study sessions
main.py                    335 INFO     PROCESSING STARTS: {'outdir': '/tmp/pytest-of-travis/pytest-0/test_cache0/', 'session': None, 'subject': 'S01'}
convert.py                  95 INFO     Processing 1 pre-sorted seqinfo entries
convert.py                 205 INFO     Doing conversion using dcm2niix
convert.py                 449 INFO     Converting /tmp/pytest-of-travis/pytest-0/test_cache0/S01/run001 (1 DICOMs) -> /tmp/pytest-of-travis/pytest-0/test_cache0/S01 . Converter: dcm2niix . Output types: ('nii.gz',)
nodes.py                   442 INFO     [Node] Setting-up "convert" in "/tmp/dcm2niixx0f5igt3/convert".
nodes.py                   739 INFO     [Node] Running "convert" ("nipype.interfaces.dcm2nii.Dcm2niix"), a CommandLine Interface with command:
dcm2niix -b n -z y -x n -t n -m n -f run001_heudiconv064 -o /tmp/pytest-of-travis/pytest-0/test_cache0/S01 -s n -v n /tmp/dcm2niixx0f5igt3/convert
subprocess.py               68 INFO     stdout 2021-01-08T14:53:29.259014:Compression will be faster with 'pigz' installed
subprocess.py               68 INFO     stdout 2021-01-08T14:53:29.259014:Chris Rorden's dcm2niiX version v1.0.20181125  GCC5.3.1 (64-bit Linux)
subprocess.py               68 INFO     stdout 2021-01-08T14:53:29.259014:Found 1 DICOM file(s)
subprocess.py               68 INFO     stdout 2021-01-08T14:53:29.259014:Convert 1 DICOM as /tmp/pytest-of-travis/pytest-0/test_cache0/S01/run001_heudiconv064 (64x64x35x1)
subprocess.py               68 INFO     stdout 2021-01-08T14:53:29.283601:Conversion required 0.027644 seconds (0.027616 for core code).
nodes.py                   533 INFO     [Node] Finished "convert".
main.py                    354 INFO     PROCESSING DONE: {'outdir': '/tmp/pytest-of-travis/pytest-0/test_cache0/', 'session': None, 'subject': 'S01'}
=============== 1 failed, 50 passed, 4 skipped in 167.25 seconds ===============
The command "coverage run `which py.test` -s -v heudiconv" exited with 1.

For some reason it doesn't give me an option to restart (maybe because we need to finally migrate to the new Travis setup?). And the failure is likely a "legit" incompatibility between Python 3.5 and pytest's tmpdir. I have committed a "suggestion" to wrap it into str explicitly -- let's see how that one goes ;)
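The gist of that suggestion (helper name illustrative): on Python 3.5, open() does not accept path-like objects such as the py.path.local that pytest's tmpdir yields, so the path has to be converted explicitly:

def read_dicominfo(cachedir):
    # cachedir is a py.path.local; str() keeps this working on Python 3.5,
    # where open() cannot take path-like objects directly.
    with open(str(cachedir / 'dicominfo.tsv'), 'r') as f:
        return f.read().splitlines()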
