Feature suggestion: support DICOM file list as input for conversion #252

Closed
fedorov opened this issue Dec 15, 2018 · 17 comments

Comments

@fedorov

fedorov commented Dec 15, 2018

Currently, if I understand correctly, dcm2niix accepts as input the name of a folder containing the files to be converted. In some situations, the expectation that all input files (and no other DICOM files) sit in a single folder is not consistent with how other systems organize files. As a specific example, we would like to investigate integration of dcm2niix with 3D Slicer, and in 3D Slicer no assumption can be made about how DICOM files are sorted (the user may choose not to copy the files at all from the location they were imported from). It would be convenient if the list of files to be used in the conversion could be specified in a configuration file of some sort. Otherwise, we will need to copy input files to a temporary folder just for the purposes of the conversion. Could this feature request be considered?

@neurolabusc
Collaborator

My suggestion would be to use the same mechanism that MRIcroGL and FSLeyes use to allow the user to select a specific series to load. The screenshot below shows the graphical interface of the latest MRIcroGL when a user drops a DICOM folder onto the executable. A dialog opens up displaying the series available in the selected folder, allowing the user to choose which one they want to load.
These user interfaces exploit the command line -n parameter. You first call -n -1 to have the executable report the available series. You then run -n x where x is the series number you want to convert (e.g. -n 23 opens series 23). Note you can use -n several times in one call if you want to convert multiple series. Just provide a suitable output name (-f) so you can load them, for example by converting the files to your temporary directory. The source code for FSLeyes and MRIcroGL provides samples of this approach.
The other approach would be to use or extend @benjaminirving's dcm2niixbatch.

$ dcm2niix -n -1 -f %p /sag
Chris Rorden's dcm2niiX version v1.0.20181210  GCC6.1.0 (64-bit MacOS)
Found 12 DICOM file(s)
	23	/sag/sag_sag_desc_35sl
 /sag/sagdesc35/MR.1.3.12.2.1107.5.2.32.35131.2014031013003590507090477
	21	/sag/sag_sag_int_36sl
 /sag/sagint36/MR.1.3.12.2.1107.5.2.32.35131.2014031012593442716690029
	19	/sag/sag_sag_asc_36sl
 /sag/sagasc36/MR.1.3.12.2.1107.5.2.32.35131.2014031012583942473089577
	20	/sag/sag_sag_desc_36sl
 /sag/sagdesc36/MR.1.3.12.2.1107.5.2.32.35131.2014031012590727219489809
	24	/sag/sag_sag_int_35sl
 /sag/sagint35/MR.1.3.12.2.1107.5.2.32.35131.2014031013010980862390698
	22	/sag/sag_sag_asc_35sl
 /sag/sagasc35/MR.1.3.12.2.1107.5.2.32.35131.2014031013000537156690252
Conversion required 0.030093 seconds (0.029588 for core code).

[Screenshot: dcm2niix_select (MRIcroGL series selection dialog)]
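
The two-step workflow described in this comment (report with -n -1, then convert with -n x) could be driven from Python roughly as follows. This is a minimal sketch, not code from FSLeyes or MRIcroGL: the helper names `list_series_cmd` and `convert_series_cmd` are hypothetical, the paths are placeholders, and only flags mentioned in this thread (-n, -f, -o, -z) are used.

```python
def list_series_cmd(folder, name_format="%p"):
    """Build the call that asks dcm2niix to report the series in a folder."""
    # -n -1 selects reporting mode instead of converting anything.
    return ["dcm2niix", "-n", "-1", "-f", name_format, folder]


def convert_series_cmd(folder, series_numbers, out_dir, name_format="%p"):
    """Build the call that converts only the requested series numbers."""
    cmd = ["dcm2niix"]
    for number in series_numbers:
        # -n may be repeated, once per series to convert.
        cmd += ["-n", str(number)]
    # -z n skips compression; -o sends the output to (for example) a temp dir.
    cmd += ["-z", "n", "-f", name_format, "-o", out_dir, folder]
    return cmd


if __name__ == "__main__":
    # Hypothetical folder and output directory, for illustration only.
    print(" ".join(list_series_cmd("/sag")))
    print(" ".join(convert_series_cmd("/sag", [23, 21], "/tmp/out")))
```

Either list can be handed to `subprocess.run` once dcm2niix is on the PATH; passing arguments as a list avoids shell quoting problems with unusual folder names.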

@fedorov
Author

fedorov commented Dec 16, 2018

You then run -n x where x is the series number you want to convert

Interesting, thank you for those suggestions! What if I have multiple series that have the same SeriesNumber in the same folder?

@fedorov
Author

fedorov commented Dec 16, 2018

The other approach would be to use or extend @benjaminirving's dcm2niixbatch

Looking at that file, it seems that it takes input as pointers to directories, not files. Am I mistaken?

@neurolabusc
Collaborator

The example below shows two common cases of multiple instances of the same series number:

  1. There are two copies of series 9, acquired on different dates.
  2. There are two copies of series 4, reflecting two echoes of a fieldmap. Note that the requested file naming of protocol and date-time (-f %p_%t) does not disambiguate these, so dcm2niix adds the _e2 disambiguation postfix to the second echo.

Note as well that when you run dcm2niix in reporting mode (-n -1) it provides three tab separated values for each series:

  1. Series number
  2. Disambiguated output filename with specified file name (-f) and output directory (-o)
  3. Representative single DICOM image from this series.
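
The three values listed above could be read back from the report with a small parser along these lines. It is a sketch under assumptions: the function name `parse_series_report` is hypothetical, and the exact line layout is inferred from this thread. The description says the values are tab separated, while the transcripts here show the representative file on a following line (possibly an artifact of how the page was rendered), so the sketch accepts both layouts.

```python
def parse_series_report(text):
    """Parse 'dcm2niix -n -1' output into a list of series records."""
    records = []
    for line in text.splitlines():
        fields = [f.strip() for f in line.split("\t") if f.strip()]
        if len(fields) >= 2 and fields[0].isdigit():
            # Series number, disambiguated output name, optional example file.
            records.append({
                "series": int(fields[0]),
                "output": fields[1],
                "example": fields[2] if len(fields) > 2 else None,
            })
        elif (records and records[-1]["example"] is None
              and line[:1].isspace() and line.strip()):
            # Indented continuation line holding the representative DICOM file.
            records[-1]["example"] = line.strip()
    return records
```

Because several records can share a series number, a caller that needs one specific acquisition would select on the `output` name rather than on `series` alone.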

When you run dcm2niix with the -n 9 parameter it will convert ALL instances of series 9. I am not sure how FSLeyes handles this, but MRIcroGL uses the disambiguated output filename to load the desired series from the temp directory. Since the conversion is done without compression (-z n), it is very quick, and the temp directory can be cleared afterwards. There are more sophisticated solutions that could resolve this, though I suspect they would need pull requests to be ideal. One would be to use the representative DICOM image and single mode (-s y); the other would be to provide the Series Instance UID. The main issue with either of these approaches is that they would have to overcome complications associated with Siemens Field Maps. Specifically, a single Series Instance UID is given to all echoes and reconstruction types (real, imaginary, magnitude, phase), and instance numbers are repeated across these types. Unfortunately, the DICOM standard does not require instance numbers to be unique, so while the Siemens generation of instance numbers is misleading and useless, it is technically legal. This has caused a huge amount of data loss, as many systems delete what appear to be replicated DICOM images, resulting in incomplete volumes. It is very hard to preemptively disambiguate these images, since they share the same values for all the standard attributes used to distinguish images. There may be other situations that create this confusion, but in my domain the Siemens Field Maps are the most common. Therefore, while the MRIcroGL approach is a bit clumsy, it is brutally effective in the real world. Any other method will need to be carefully tested in the real world for corner cases. The DICOM standard is complex and each vendor has evolved their interpretation over the years.

$ dcm2niix  -n -1 -f %p_%t ~/fld
Chris Rorden's dcm2niiX version v1.0.20181210  GCC6.1.0 (64-bit MacOS)
	9	/fld/BOLD_3mm_matched_0.55esp_20181108150819
 /fld/fld/9_BOLD_3mm_matched_0.55esp/0001.dcm
	9	/fld/ax_asc_36sl_20140310133834
 /fld/fld/axasc36/MR.1.3.12.2.1107.5.2.32.35131.2014031012525641770887330
slices not stacked: echo varies (TE 7.11, 4; echo 2, 1). Use 'merge 2D slices' option to force stacking
	4	/fld/B0map_onesizefitsall_v1_20181108150819_e2_ph
 /fld/fld/4_B0map_onesizefitsall_v1/0045_e2_ph.dcm
	4	/fld/B0map_onesizefitsall_v1_20181108150819_e1_ph
 /fld/fld/4_B0map_onesizefitsall_v1/0001_ph.dcm
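
The repeated instance numbers described above are easy to check for before any tool treats such files as duplicates. The sketch below is illustrative only: `find_instance_collisions` is a hypothetical helper, and it assumes the caller has already read the Series Instance UID and instance number from each file with a DICOM library of their choice.

```python
from collections import defaultdict


def find_instance_collisions(headers):
    """Return files that share the same (series UID, instance number).

    `headers` is an iterable of (path, series_uid, instance_number) tuples
    extracted beforehand from the DICOM files.
    """
    groups = defaultdict(list)
    for path, series_uid, instance_number in headers:
        groups[(series_uid, instance_number)].append(path)
    # Keep only the keys claimed by more than one file.
    return {key: paths for key, paths in groups.items() if len(paths) > 1}
```

A non-empty result means instance numbers alone cannot tell the images apart, so files in those groups should not be discarded as replicas.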

I agree, the current implementation of dcm2niixbatch would require you to segment your data, but it does illustrate flexibility if you wanted to make a fork or pull request with more sophisticated file handling. There is certainly room to improve the -n -1 method, but pull requests to improve this method will need to be coordinated with MRIcroGL and FSLeyes which both rely on it.

@fedorov
Author

fedorov commented Dec 16, 2018

Chris, as I tried to explain in the initial issue, in our situation one cannot assume anything about how DICOM files are organized in folders. It is a completely valid situation for one folder to contain multiple patients and multiple studies. Furthermore, it is also limiting that -n can only be used up to 16 times (and in any case, there are limits on command line length in Windows): for example, some vendors save individual time frames of a DCE acquisition as separate series, and there can be upwards of 100 frames.

will need to be coordinated with MRIcroGL and FSLeyes which both rely on it

What I suggested is not to change the functionality of any of the existing options, but to add a new option for specifying a text or JSON file that contains a flat list of the input DICOM files to be used for conversion. Otherwise, the only option for using dcm2niix in our situation is to temporarily copy all of the files to be considered for conversion into a separate directory. Sounds like this is what we would need to do if we want to integrate it.

@neurolabusc
Collaborator

neurolabusc commented Dec 16, 2018

Why don't you look at my latest commit on the development branch? It includes a new function textDICOM(). You call this by using the single file (-s y) parameter and passing the name of the text file. So the minimal call would be something like dcm2niix -s y ~/DICOMs.txt (though in practice you will probably want to include -f ?, -o ?, -z n, -b n).

A sample text file might look like this:

/Users/rorden/fld/a.dcm
/Users/rorden/am/b.dcm

Only the specified images are converted - other images in the folders are ignored. All the images are converted as a single NIfTI file. The software should automatically order the slices, series, etc correctly. There are a couple caveats:

  • This code assumes you are providing images that can be stacked together (e.g. the same X and Y dimensions, etc). For multi-volume data you will want to make sure the number of slices per volume are the same for all volumes, etc.
  • This code assumes that you have specified legitimate DICOM images or meta-data. Note the commented line isDICOMfile in textDICOM() - if you use that it will check that the inputs include a legitimate DICOM header (and will therefore reject DICOM meta-data that has an image but does not include the appropriate Part 10 header).
  • DICOM files that have repeated slice positions and instance numbers might be loaded in a random order. The only example I can think of are Siemens Field Maps. Since these are not disambiguated in the header you will want to do this yourself (perhaps a future release could put echo number into an unused DimensionIndexValues which would allow some sorting).
  • Since DICOM files often have arbitrary file extensions, this code demands that your text file has the extension '.txt'. Either Windows or Unix EOLNs are acceptable.
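
For a caller such as a Slicer module, producing the list file and the matching call could look roughly like this. It is a sketch under assumptions: the helper names are hypothetical, absolute paths are written because the sample above uses them (whether relative paths work is not stated here), and the flags are the ones suggested in this comment.

```python
import os


def format_dicom_list(dicom_paths):
    """Return the list-file contents: one absolute path per line."""
    return "".join(os.path.abspath(path) + "\n" for path in dicom_paths)


def write_dicom_list(dicom_paths, list_path):
    """Write the list file, enforcing the '.txt' extension noted above."""
    if not list_path.endswith(".txt"):
        raise ValueError("the list file must use the '.txt' extension")
    # newline="\n" writes Unix line endings; Windows ones are also accepted.
    with open(list_path, "w", newline="\n") as handle:
        handle.write(format_dicom_list(dicom_paths))
    return list_path


def text_list_cmd(list_path, out_dir, name_format="%p"):
    """Build the single-file-mode (-s y) call that reads the list file."""
    return ["dcm2niix", "-s", "y", "-z", "n", "-b", "n",
            "-f", name_format, "-o", out_dir, list_path]
```

The list only communicates which files to read; checking that the images can be stacked (same dimensions, same slice count per volume) remains the caller's job, as the caveats above point out.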

@neurolabusc
Collaborator

As a separate comment, is there an easy way to determine how to group the DCE images together if they use separate series numbers? What mechanism does Slicer use to concatenate these files? Can you share an example? If there is a clear method, one should be able to update dcm2niix to automatically detect and combine these images for the user.

@neurolabusc
Collaborator

neurolabusc commented Dec 17, 2018

@fedorov could you test the latest commit on the development branch? It should stack a DCE sequence from a single session as a single volume even if the series number varies. The output order will match the series number. I do not have access to many DCE examples, so please tell me if I missed any variations. I downloaded samples from The Cancer Imaging Archive (TCIA), which includes DCE sequences from Siemens, Philips and GE. When it detects DCE images that will be stacked, it generates a message letting you know that you can use -m o if you want to turn off this feature (in which case each series will be saved as a separate NIfTI volume). If you choose to include the series number in the output filename, the number used will be the first in the stacked series (e.g. series 7 in the example below).

$ ./dcm2niix -i y -f %v_%t_%s_%p /QIN-Breast-DCE-MRI-BC10/
Chris Rorden's dcm2niiX version v1.0.20181210  GCC8.2.0 (64-bit MacOS)
Found 3841 DICOM file(s)
DCE volumes stacked despite varying series number (use '-m o' to turn off merging).
slices stacked despite varying acquisition numbers (if this is not desired recompile with 'mySegmentByAcq')
Convert 3840 DICOM as /QIN-Breast-DCE-MRI-BC10/Siemens_19970727134543_7_twist_20s_dyn_TRA_(h20_ex_B17) (320x320x120x32)
Conversion required 7.660269 seconds (5.351059 for core code).

$ ./dcm2niix -f %v_%t_%s_%p -m o /QIN-Breast-DCE-MRI-BC10/
Chris Rorden's dcm2niiX version v1.0.20181210  GCC8.2.0 (64-bit MacOS)
Found 3841 DICOM file(s)
Convert 120 DICOM as /QIN-Breast-DCE-MRI-BC10/Siemens_19970727134543_61_twist_20s_dyn_TRA_(h20_ex_B17) (320x320x120x1)
...
Convert 120 DICOM as /QIN-Breast-DCE-MRI-BC10/Siemens_19970727134543_59_twist_20s_dyn_TRA_(h20_ex_B17) (320x320x120x1)
Conversion required 6.675303 seconds (6.209471 for core code).
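
A calling program can confirm whether the series were stacked by reading the dimensions at the end of each "Convert" line. The sketch below is hedged accordingly: `parse_convert_line` is a hypothetical helper, and the pattern is inferred from the log lines shown in this thread, so other dcm2niix versions may word these messages differently.

```python
import re

# Matches e.g. "Convert 120 DICOM as /path/name (320x320x120x1)".
# The dimensions are taken from the final parenthesised group, since the
# output name itself may contain parentheses.
_CONVERT_LINE = re.compile(r"^Convert (\d+) DICOM as (.+) \((\d+(?:x\d+)+)\)\s*$")


def parse_convert_line(line):
    """Return (file_count, output_path, dims) or None if the line differs."""
    match = _CONVERT_LINE.match(line)
    if match is None:
        return None
    count, output, dims = match.groups()
    return int(count), output, tuple(int(d) for d in dims.split("x"))
```

In the first run above the final dimension is 32 (one 4D file), while with -m o each output ends in x1 (one 3D volume per series).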

@fedorov
Author

fedorov commented Dec 17, 2018

@neurolabusc thank you so much for working on this!

I will comment once I get a chance to test.

As a separate comment, is there an easy way to determine how to organize the DCE images together if they use separate series numbers? What mechanism does slicer use to concatenate these files? Can you share an example?

No, I don't have an easy way. The current implementation was driven by specific datasets... The way Slicer handles multivolume parsing is to apply multiple parsing strategies, and give users freedom to choose from multiple results.

I think you found exactly the same dataset that I used for Slicer loading support. I don't have any separate collection of a variety of DCE examples either.

At this moment, I was only considering dcm2niix for parsing scalar volumes. The main motivation was the unreliable support of multiframe images, see the discussion in https://discourse.slicer.org/t/dicom-multiframe-support/4806. Going forward, it would be interesting to also revisit dcm2niix for multivolume, but that would require more work due to how things are implemented, so I am not sure if/when I will get to it. It would indeed be quite attractive to use dcm2niix instead, since it is a much more widespread representation and tool AND its parsing seems to be much faster, looking at your example output!

@fedorov
Author

fedorov commented Dec 17, 2018

What mechanism does slicer use to concatenate these files? Can you share an example?

I realized I probably could answer your question better ... I will point to the code and give more details a bit later, need to run now.

@neurolabusc
Collaborator

Great. Since slicer already has a bunch of Python code, you might also want to evaluate @icometrix's dicom2nifti. This might be able to provide a clean solution without needing to communicate via text files.

@fedorov
Author

fedorov commented Dec 17, 2018

I have more confidence in dcm2niix (looking through the dicom2nifti issues, some of them are rather worrying, such as icometrix/dicom2nifti#19). Text files would be used only for communicating lists of files, so I don't see a problem with that.

@neurolabusc
Collaborator

neurolabusc commented Dec 17, 2018

OK. At the moment dcm2niix handles multi-frame from Philips, Siemens (Vida XA10A, though no tool can rescue Vida data saved on the console as mosaics or de-identified) and Bruker. I think these are all the enhanced formats seen in practice. I have never tested the generic reference enhanced DICOM example images, as each vendor came up with their own creative definition of enhanced DICOM.

@fedorov
Author

fedorov commented Dec 17, 2018

Knowingly or not (speaking from experience), you can also handle legacy enhanced MF!

@neurolabusc
Collaborator

Closing issue. Requested feature is integrated in development branch, will be included in next release.

@fedorov
Author

fedorov commented Jan 10, 2019

Thanks a lot Chris! 👍

@neurolabusc
Collaborator

@fedorov I am not sure if you ever implemented this feature. I have changed the behavior a little bit. Previously, all DICOMs had to belong to the same series, and only a single output NRRD/NIfTI image was generated. The new behavior is that if multiple series are specified, multiple NRRD/NIfTI images will be created. This turns out to be a handy way to live with the macOS App Sandbox restrictions.

yarikoptic added a commit to neurodebian/dcm2niix that referenced this issue Apr 6, 2021
* tag 'v1.0.20210317': (23 commits)
  CI: Travis updates submodules always from remote.
  Update dcm_qa_nih submodule.
  Remove diagnostic message
  help should describe accession number (rordenlab#496)
  Describe and provide kludge for mangled Canon DICOMs (rordenlab#495)
  Update Philips notes (rordenlab#493)
  Update dcm_qa submodule.
  Support Canon Enhanced DICOM (rordenlab#491)
  Use first recognized manufacturer tag (rordenlab#487)
  If mosaics where ANY volume is non-planar, save ALL volumes as 3D(rordenlab#481)
  Add notes
  Kludge to separate vNav files as 3D (rordenlab#481)
  Tell user to override merging (-m o) to separate coils, specific with fmrib usage that disrupts Siemens DCE processing (rordenlab#187)
  Forbid semicolon in filenames as Linux uses this for command concatenation and Windows function GetOpenFileName will truncate (rordenlab#425)
  report totalReadoutTime once
  EstimatedEffectiveEchoSpacing -> EffectiveEchoSpacing (rordenlab#480)
  GE Round factor for Total Readout Time
  Update explicit naming of DICOMs (rordenlab#252), GE file naming changes (rordenlab#476)
  Update issue templates
  GE TotalReadouTime, BEP009 fixes (rordenlab#476)
  ...