ENH: Split ICA into three steps #857

SophieHerbst · 2024-02-23T12:58:32Z

I wanted to re-run the apply_ICA step after changing the thresholds for automatic component rejection in my config.

I had three surprises/ observations:

mismatch between documentation website and actual nme of the step: 07 versus 08
the output prints each run (raw), while I expected the ICA being applied to epochs given the order in the pipeline
nothing was computed (Computation unnecessary (cached) …)

@hoechenberger how do I rerun this step? If the other points are just a matter of changing the doc, I can do it with a few how-to pointers from you

(versions: latest main of mne, pipeline, and mne-bids)

(mne) sh254795@is152869:/neurospin/meg/meg_tmp/TimeInWM_Izem_2019/BIDS_anonymized$ mne_bids_pipeline --config=/neurospin/meg/meg_tmp/TimeInWM_Izem_2019/BIDS_anonymized/code/config_sophie.py --steps=preprocessing/apply_ica
┌────────┬ Welcome aboard MNE-BIDS-Pipeline! 👋 ────────────────────────────────────────────────────────────────
│13:51:23│ 📝 Using configuration: /neurospin/meg/meg_tmp/TimeInWM_Izem_2019/BIDS_anonymized/code/config_sophie.py
└────────┴ 
┌────────┬ init/_01_init_derivatives_dir ───────────────────────────────────────────────────────────────────────
│13:51:23│ ✅ Output directories already exist …
└────────┴ done (6s)
┌────────┬ init/_02_find_empty_room ────────────────────────────────────────────────────────────────────────────
│13:51:29│ ✅ sub-155 run-01 Computation unnecessary (cached) …
└────────┴ done (2s)
┌────────┬ preprocessing/_08a_apply_ica ────────────────────────────────────────────────────────────────────────
│13:51:36│ ✅ sub-155 Computation unnecessary (cached) …
│13:51:36│ ✅ sub-155 run-04 Computation unnecessary (cached) …
│13:51:36│ ✅ sub-155 run-05 Computation unnecessary (cached) …
│13:51:36│ ✅ sub-155 run-06 Computation unnecessary (cached) …
│13:51:36│ ✅ sub-155 run-07 Computation unnecessary (cached) …
│13:51:36│ ✅ sub-155 run-08 Computation unnecessary (cached) …
│13:51:36│ ✅ sub-155 run-rest Computation unnecessary (cached) …
│13:51:36│ ✅ sub-155 run-noise Computation unnecessary (cached) …
│13:51:41│ ✅ sub-155 run-01 Computation unnecessary (cached) …
│13:51:41│ ✅ sub-155 run-03 Computation unnecessary (cached) …
│13:51:41│ ✅ sub-155 run-02 Computation unnecessary (cached) …
└────────┴ done (12s)

The text was updated successfully, but these errors were encountered:

larsoner · 2024-02-23T13:05:59Z

This first cached line is for the epochs

┌────────┬ preprocessing/_08a_apply_ica ────────────────────────────────────────────────────────────────────────
│13:51:36│ ✅ sub-155 Computation unnecessary (cached) …

The thresholds affect fitting ICA, so you need to rerun that step and the applying step

SophieHerbst · 2024-02-23T13:12:00Z

Thanks for the reply @larsoner. I don't understand why the thresholds affect the fitting.
I meant those parameters:
ica_ctps_ecg_threshold = 0.05
ica_eog_threshold = 2.0

hoechenberger · 2024-02-23T13:14:34Z

The thresholds affect fitting ICA, so you need to rerun that step and the applying step

Isn't that a bug then? Shouldn't we auto-rerun ICA fitting if config values related to the ICA step change?

I don't understand why the thresholds affect the fitting.

They don't, but fitting and ExG artifact detection are performed in the same step (run-ica), hence that step needs to be re-run. Maybe it should be split into two steps, a) fitting and b) artifact detection

SophieHerbst · 2024-02-23T13:16:20Z

I thought that the whole point of having two separate files for run and apply ICA was to run the ICA once and be able to adjust the thresholds for eog/ecg rejection without recomputing ICA

larsoner · 2024-02-23T13:18:42Z

SophieHerbst · 2024-02-23T13:18:53Z

that is how it was in 'the old pipeline': https://github.com/brainthemind/CogBrainDyn_MEG_Pipeline/blob/master/06a-apply_ica.py
Did it get lost?

larsoner · 2024-02-23T13:21:51Z

Isn't that a bug then? Shouldn't we auto-rerun ICA fitting if config values related to the ICA step change?

To detect this in the general case this would require traversing the whole processing tree. In other words, if a user selects a single step, don't treat it like "just run this" but instead like a stopping step, "run everything up to this". Like if a user changed their filter settings but only asks to run source estimation, quite a few preceding steps actually need to be rerun.

hoechenberger · 2024-02-23T13:24:10Z

I thought that the whole point of having two separate files for run and apply ICA was to run the ICA once and be able to adjust the thresholds for eog/ecg rejection without recomputing ICA

The report generated through run-ica is supposed to help you determine which components to remove in the next step (apply-ica)

To assist you, we run the automated artifact detection

The results end up together with all components in the ica+components report, e.g.
http://mne.tools/mne-bids-pipeline/stable/examples/ERP_CORE/sub-015_ses-ERN_task-ERN_proc-ica+components_report.html

The apply-ica step then looks into the ICA component CSV file to figure which components you would like removed

Like I said above, I think your point of separating the artifact detection step from the fitting step is fair. I think we should do, in three steps:

fit ICA
detect artifacts
apply ICA

so you could fiddle with artifact detection settings (step 2) without having to re-fit ICA (step 1)

hoechenberger · 2024-02-23T13:25:27Z

To detect this in the general case this would require traversing the whole processing tree. In other words, if a user selects a single step, don't treat it like "just run this" but instead like a stopping step, "run everything up to this".

So would would have been a viable solution for @SophieHerbst now?
Simply re-run the entire preprocessing pipeline and rely on caching to skip e.g. filtering etc?

hoechenberger · 2024-02-23T13:28:02Z

So would would have been a viable solution for @SophieHerbst now? Simply re-run the entire preprocessing pipeline and rely on caching to skip e.g. filtering etc?

We don't have a "please run everything up to this step" functionality right now, do we? Do you think we could implement this somehow?

SophieHerbst · 2024-02-23T13:28:22Z

And why is the run-ica step before make_epochs?
As I see from the code it creates epochs and runs ICA on them.

hoechenberger · 2024-02-23T13:36:54Z

And why is the run-ica step before make_epochs? As I see from the code it creates epochs and runs ICA on them.

ICA creates an entirely separate set of epochs to allow for applying different filters before fitting (e.g., 1 Hz high-pass for ICA while you'll maybe want 0.1 Hz high-pass for your "actual" data)

SophieHerbst · 2024-02-23T13:39:43Z

I was just confused to see that it appears before epochs in the steps and that the output talks about the raw runs. But I get it now.

hoechenberger · 2024-02-23T13:47:34Z

I was just confused to see that it appears before epochs in the steps and that the output talks about the raw runs. But I get it now.

Yes so it iterates over the raw files in order to create epochs

it's actually a good sign this caught your attention – Eric just recently changed the order of things to what it is now to highlight the fact that ICA is not fitted to your "regular" epochs

but it appears this needs better documentation and / or logging output

SophieHerbst · 2024-02-23T14:05:05Z

but it appears this needs better documentation and / or logging output

And the steps in the documentation need to be renamed accordingly

larsoner · 2024-02-23T14:06:44Z

Like I said above, I think your point of separating the artifact detection step from the fitting step is fair. I think we should do, in three steps:

Yes this is doable. We could probably consider renaming the steps a bit so we don't have to keep renaming the steps that follow if possible (like with 05a1, 05a2 or something instead of 05a, 06a but only a 05a for SSP and no 06a at all)

I was just confused to see that it appears before epochs in the steps and that the output talks about the raw runs. But I get it now.

Yes so it iterates over the raw files in order to create epochs... it's actually a good sign this caught your attention...

Yes this is exactly why I reordered it, I initially also thought ICA was fit on the task epochs but it's not 🙂

So would would have been a viable solution for @SophieHerbst now? Simply re-run the entire preprocessing pipeline and rely on caching to skip e.g. filtering etc?

Yes personally nowadays I always just mne_bids_pipeline config.py anytime I change anything and rely on caching to skip stuff. Data on SSDs this only has ~10s overhead for ~20 subjects data. Maybe it's not as viable on NFS-stored data or something -- @SophieHerbst can you give some information on how long it takes mne_bids_pipeline config.py to run on your data without changing anything, i.e., to be a no-op that just says "cached" for every step? Maybe the mtime checks are expensive or something.

We don't have a "please run everything up to this step" functionality right now, do we? Do you think we could implement this somehow?

Yeah it would be possible in principle to implement this, I'll put this in a new issue

And the steps in the documentation need to be renamed accordingly

@SophieHerbst what would you like to work on, if anything? Looks like we have

Update docs
Reorder/reconfigure ICA steps as per ENH: Split ICA into three steps #857 (comment)

I'm happy to do any of these

SophieHerbst · 2024-02-23T15:04:22Z

@larsoner I am happy to give both a try but likely will need some supervision to get it right. Also, I might not get to it before next week, so if you have time to start go ahead.
For the documentation, can you point me to how the website is updated?

larsoner · 2024-02-23T15:10:54Z

Docs are all built from https://github.com/mne-tools/mne-bids-pipeline/tree/main/docs/source and also the contents of https://github.com/mne-tools/mne-bids-pipeline/blob/main/mne_bids_pipeline/_config.py , so depends on what you need to update. I would use git grep with a few words from the offending text to find the right file

larsoner · 2024-02-23T17:41:10Z

Upon reflecting a bit... I think we should probably change both docs/source/features/steps.md and docs/source/features/overview.md to be generated using Python code. We can use the module __doc__ (ideally the title, but could be some other private var). @SophieHerbst okay for me to do this? That way you don't have to manually update anything and we'll automatically stay up-to-date...

larsoner · 2024-02-23T17:49:39Z

Hah, looks like we already do that with steps.md. So we need to do it with features.md as well...

larsoner · 2024-02-23T19:58:08Z

Since the doc part is in #860 now (I hope), I re-titled this issue to be about splitting ICA into three steps:

ICA.fit (which is done on task epochs, but they're separate from the ones eventually used because filtering differs)
Artifact identification
ICA.apply

hoechenberger · 2024-02-23T20:24:48Z

ICA.fit (which is done on task epochs, but they're separate from the ones eventually used because filtering differs)

yes this is what I meant to say, sorry if there was any confusion!

SophieHerbst · 2024-02-28T14:08:01Z

@hoechenberger @larsoner do you think a new release could be done with the changes in ICA?
Then I could run with the stable version once more, hand over to the student, and hopefully not touch preprocessing for this dataset again.

larsoner · 2024-02-28T14:14:32Z

I'd like to get #863 in, then I don't see why not 👍

hoechenberger · 2024-02-28T14:36:43Z

somebody would need to implement the "ICA splitting" first 😅

larsoner · 2024-02-28T14:40:34Z

I can tackle that this week I think, I don't think it'll be too bad

SophieHerbst · 2024-03-01T14:41:44Z

sorry, I didn't have the time to contribute to this. let me know I can test anything

SophieHerbst changed the title ~~Confused about pipeline steps and re-run~~ Pipeline stepnames and re-run Feb 23, 2024

larsoner closed this as completed Feb 23, 2024

larsoner reopened this Feb 23, 2024

larsoner mentioned this issue Feb 23, 2024

ENH: Add "run up to this step" option #858

Open

larsoner changed the title ~~Pipeline stepnames and re-run~~ ENH: Split ICA into three steps Feb 23, 2024

hoechenberger mentioned this issue Feb 26, 2024

Review ICA report logic #864

Closed

larsoner mentioned this issue Feb 28, 2024

ENH: Split ICA into multiple steps #865

Merged

1 task

larsoner closed this as completed in #865 Mar 1, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

ENH: Split ICA into three steps #857

ENH: Split ICA into three steps #857

SophieHerbst commented Feb 23, 2024 •

edited

Loading

larsoner commented Feb 23, 2024

SophieHerbst commented Feb 23, 2024

hoechenberger commented Feb 23, 2024

SophieHerbst commented Feb 23, 2024

larsoner commented Feb 23, 2024

SophieHerbst commented Feb 23, 2024

larsoner commented Feb 23, 2024 •

edited

Loading

hoechenberger commented Feb 23, 2024

hoechenberger commented Feb 23, 2024

hoechenberger commented Feb 23, 2024

SophieHerbst commented Feb 23, 2024

hoechenberger commented Feb 23, 2024 •

edited

Loading

SophieHerbst commented Feb 23, 2024

hoechenberger commented Feb 23, 2024 •

edited

Loading

SophieHerbst commented Feb 23, 2024

larsoner commented Feb 23, 2024 •

edited

Loading

SophieHerbst commented Feb 23, 2024

larsoner commented Feb 23, 2024

larsoner commented Feb 23, 2024

larsoner commented Feb 23, 2024

larsoner commented Feb 23, 2024

hoechenberger commented Feb 23, 2024

SophieHerbst commented Feb 28, 2024

larsoner commented Feb 28, 2024

hoechenberger commented Feb 28, 2024

larsoner commented Feb 28, 2024

SophieHerbst commented Mar 1, 2024

ENH: Split ICA into three steps #857

ENH: Split ICA into three steps #857

Comments

SophieHerbst commented Feb 23, 2024 • edited Loading

larsoner commented Feb 23, 2024

SophieHerbst commented Feb 23, 2024

hoechenberger commented Feb 23, 2024

SophieHerbst commented Feb 23, 2024

larsoner commented Feb 23, 2024

SophieHerbst commented Feb 23, 2024

larsoner commented Feb 23, 2024 • edited Loading

hoechenberger commented Feb 23, 2024

hoechenberger commented Feb 23, 2024

hoechenberger commented Feb 23, 2024

SophieHerbst commented Feb 23, 2024

hoechenberger commented Feb 23, 2024 • edited Loading

SophieHerbst commented Feb 23, 2024

hoechenberger commented Feb 23, 2024 • edited Loading

SophieHerbst commented Feb 23, 2024

larsoner commented Feb 23, 2024 • edited Loading

SophieHerbst commented Feb 23, 2024

larsoner commented Feb 23, 2024

larsoner commented Feb 23, 2024

larsoner commented Feb 23, 2024

larsoner commented Feb 23, 2024

hoechenberger commented Feb 23, 2024

SophieHerbst commented Feb 28, 2024

larsoner commented Feb 28, 2024

hoechenberger commented Feb 28, 2024

larsoner commented Feb 28, 2024

SophieHerbst commented Mar 1, 2024

SophieHerbst commented Feb 23, 2024 •

edited

Loading

larsoner commented Feb 23, 2024 •

edited

Loading

hoechenberger commented Feb 23, 2024 •

edited

Loading

hoechenberger commented Feb 23, 2024 •

edited

Loading

larsoner commented Feb 23, 2024 •

edited

Loading