[WIP] Parse ADNI XML metadata #587

Merged: 23 commits merged into aramis-lab:dev from parse-adni-xml-metadata on Mar 30, 2022

Conversation

@NicolasGensollen (Member) commented on Mar 1, 2022

This PR continues the work started in #431 for adding metadata from XML files to the output of the ADNI-to-BIDS converter. (Note: I did a lot of refactoring, so I opened a new PR since the diff would be huge anyway.)

  • Refactor adni_json.py: shorter functions (especially parse_xml_file), more explicit variable and function names, docstrings, and all functions private except create_json_metadata (a rough sketch follows this list)
  • Change output column names to lowercase (this seems to better match the BIDS specs for tabular files)
  • Add a JSON file to better describe the column names?
  • Add tests
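
For orientation, here is a rough, hypothetical sketch of the layout described in the first item above. Apart from adni_json.py, parse_xml_file, and create_json_metadata, every name, type hint, and helper body below is illustrative, not taken from the actual code:

    # Hypothetical outline only: the helper body is a placeholder,
    # not the converter's real implementation.
    import xml.etree.ElementTree as ET
    from pathlib import Path
    from typing import Dict, Iterable, List

    def _parse_xml_file(xml_path: Path) -> Dict[str, str]:
        """Parse a single ADNI XML file into a flat dictionary of metadata."""
        root = ET.parse(xml_path).getroot()
        # Select the attributes of interest; here we simply flatten all leaf elements.
        return {elem.tag: (elem.text or "") for elem in root.iter() if len(elem) == 0}

    def create_json_metadata(xml_paths: Iterable[Path]) -> List[Dict[str, str]]:
        """Public entry point: collect the metadata of every XML file."""
        return [_parse_xml_file(path) for path in xml_paths]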

@pep8speaks commented on Mar 1, 2022

Hello @NicolasGensollen! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found:

There are currently no PEP 8 issues detected in this Pull Request. Cheers! 🍻

Comment last updated at 2022-03-30 08:20:43 UTC

@ghisvail (Collaborator) commented on Mar 1, 2022

Back when the first PR was proposed, I was wondering whether we could leverage the new pandas.read_xml to parse the XML metadata instead of raw xml.etree manipulations.

My thinking is: could we just let pandas do the actual XML parsing, and only maintain the code that selects and combines the metadata attributes we want? The hypothesis is that this would greatly simplify the resulting code and the future review of this PR.
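
For illustration, a minimal sketch of the pandas.read_xml approach suggested here (pandas >= 1.3 assumed); the xpath and the renaming step are hypothetical placeholders, not the actual ADNI schema or converter code:

    # Sketch only: let pandas parse the XML, keep only the selection/renaming logic.
    import pandas as pd

    def read_image_metadata(xml_path: str) -> pd.DataFrame:
        # parser="etree" uses the standard library and avoids the lxml dependency;
        # the xpath below is a placeholder for the elements of interest.
        df = pd.read_xml(xml_path, xpath=".//imagingProtocol", parser="etree")
        # Normalize column names (e.g. to lowercase) before further selection.
        return df.rename(columns=str.lower)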

@omar-rifai (Contributor) commented on Mar 1, 2022

Thanks @NicolasGensollen for picking this up. Also, for the record, the two main elements blocking the merge of #431 were (in order of priority):

  • Defining which metadata to include (cf. discussion #293) and where they should be written in the BIDS hierarchy
  • Eventually parallelizing with an appropriate library (not a priority)

@omar-rifai (Contributor) commented on Mar 8, 2022

@NicolasGensollen, before finalizing, can you make sure to add @emaheux as a co-author of the first commit, as he provided the draft code for this? Thanks!

@NicolasGensollen (Member Author):

@omar-rifai Of course! I can also add you as the author of another one, since I started from your work!

@NicolasGensollen force-pushed the parse-adni-xml-metadata branch 2 times, most recently from d7784cb to 8b2147f, on March 11, 2022 at 10:14.
@omar-rifai (Contributor):

Hello @NicolasGensollen, let us know if this is finalized so that we can review and merge. Thanks!

@NicolasGensollen marked this pull request as ready for review on March 23, 2022 at 15:11.
@NicolasGensollen (Member Author):

Hi @omar-rifai
I believe this is ready for review.
I re-ran the converter this afternoon (see folder data_ci_2022_xml, the output is in ref).

@omar-rifai (Contributor) left a comment:

Looks good, thanks! I'll merge after the rebase.


.. note::
    Use multiprocessing / multithreading for parsing the files?
    Not sure we will get a huge performance boost by doing that tbh.
Contributor:

What is the rationale behind this comment? Threading seems to be designed for that purpose.

Member Author:

As discussed, we can remove the comment.

Member Author:

Done in d8b7d59.
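
For reference, a hypothetical sketch of the threaded parsing discussed in this thread, using only the standard library; parse_one stands in for the converter's own per-file parser and is not the real signature:

    # Sketch only: threads mostly overlap file I/O, so the speed-up may indeed
    # be modest for CPU-bound XML parsing, as the note above suggests.
    from concurrent.futures import ThreadPoolExecutor
    from pathlib import Path
    from typing import Callable, Dict, List

    def parse_all_xml(xml_dir: Path, parse_one: Callable[[Path], Dict]) -> List[Dict]:
        """Parse every XML file found in `xml_dir` using a small thread pool."""
        paths = sorted(xml_dir.glob("*.xml"))
        with ThreadPoolExecutor(max_workers=4) as pool:
            return list(pool.map(parse_one, paths))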

@omar-rifai (Contributor):

Probably related to the test itself, but I'm getting a TypeError on test_get_existing_scan_dataframe:

    def test_get_existing_scan_dataframe(tmp_path):
        """Test function `_get_existing_scan_dataframe`."""
        from clinica.iotools.converters.adni_to_bids.adni_json import _get_existing_scan_dataframe
        from pandas.testing import assert_frame_equal
        subj_path = tmp_path / "sub-01"
        subj_path.mkdir()
>       with pytest.warns(match="No scan tsv file for subject sub-01 and session foo"):
E       TypeError: warns() missing 1 required positional argument: 'expected_warning'
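
The likely cause is that, in the pytest version used here, pytest.warns() requires the expected warning class as a positional argument, so passing only match raises this TypeError. A minimal, self-contained example of the fixed call (the warning message is taken from the test above; _emit_missing_scan_warning is a stand-in, not the converter's code):

    import warnings

    import pytest

    def _emit_missing_scan_warning():
        warnings.warn("No scan tsv file for subject sub-01 and session foo", UserWarning)

    def test_warns_usage():
        # Pass the expected warning class explicitly; `match` alone is not
        # enough on older pytest versions.
        with pytest.warns(UserWarning, match="No scan tsv file for subject sub-01"):
            _emit_missing_scan_warning()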

@omar-rifai (Contributor):

@NicolasGensollen, can you also please run make format (again?) so that it passes the lint checks?

@NicolasGensollen (Member Author):

> can you also please run make format (again?)

Done in 6034162.

> I'm getting a TypeError on test_get_existing_scan_dataframe

Oops, missed your comment on Friday.
Did you manage to run the tests since then?
Running pytest -v works on my end.

@omar-rifai (Contributor):

> > can you also please run make format (again?)
>
> Done in 6034162.
>
> > I'm getting a TypeError on test_get_existing_scan_dataframe
>
> Oops, missed your comment on Friday. Did you manage to run the tests since then? Running pytest -v works on my end.

Yes, it seems to work now, thanks! I also fixed some issues with the data_ci, and the converters succeed now. However, it seems that there is still an issue with merge_tsv in iotools. Can you check that everything is merged? I had a similar issue last week from a recent change in dev.

@omar-rifai merged commit 7d9232d into aramis-lab:dev on Mar 30, 2022.
@NicolasGensollen deleted the parse-adni-xml-metadata branch on March 30, 2022 at 12:35.