Artifact Subspace Reconstruction #9302
Conversation
…on (tested for equivalence)
…s back to default when pyriemann is not available
need to fix asr
Todo: ASR object docstrings, asr_calibrate docstring
Hi @DiGyt, I'm happy if my code ends up being useful. And in the long term it will be better maintained in MNE than I ever could on my own. Also happy to review the code next week. FWIW, here's what kept me from filing a PR in MNE-python:
Just a few comments. Now I am concerned about the online aspect, which is absolutely not covered by mne-python. If it can be valuable for offline processing, I think a convincing example on real data would be very useful to assess the relevance of ASR for MNE users. Thanks @DiGyt!
return mu, sig, alpha, beta

def yulewalk(order, F, M):
There is already Yule-Walker code in https://github.com/mne-tools/mne-python/blob/main/mne/time_frequency/ar.py
Great, thanks for pointing that out!
return out, zf

def ma_filter(N, X, Zi):
I think this should be done with https://docs.scipy.org/doc/scipy/reference/generated/scipy.signal.lfilter.html#scipy.signal.lfilter
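A minimal sketch of what that could look like, keeping the same (N, X, Zi) -> (X, Zf) calling convention as the hand-rolled ma_filter above (the function name here is hypothetical):

```python
import numpy as np
from scipy.signal import lfilter

def ma_filter_lfilter(N, X, Zi):
    """Order-N moving average along the last axis of X, carrying the filter state Zi."""
    b = np.ones(N) / N    # FIR (moving-average) numerator coefficients
    a = np.array([1.0])   # purely feed-forward, so the denominator is just 1
    Y, Zf = lfilter(b, a, X, axis=-1, zi=Zi)  # Zi has shape (..., N - 1)
    return Y, Zf
```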
return X, Zf

def geometric_median(X, tol=1e-5, max_iter=500):
could this be used to do an evoked using median and not mean of epochs?
Sure!
should we make a more general version of this function and add it to utils?
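A generic, numpy-only version could look roughly like this (Weiszfeld iterations; a sketch only, not necessarily what the PR implements):

```python
import numpy as np

def geometric_median(X, tol=1e-5, max_iter=500):
    """Geometric median of the rows of X (n_samples, n_features) via Weiszfeld iterations."""
    y = X.mean(axis=0)                                 # start from the arithmetic mean
    for _ in range(max_iter):
        dist = np.linalg.norm(X - y, axis=1)
        dist = np.maximum(dist, 1e-12)                 # avoid division by zero at data points
        w = 1.0 / dist
        y_new = (w[:, None] * X).sum(axis=0) / w.sum()
        if np.linalg.norm(y_new - y) < tol:
            return y_new
        y = y_new
    return y
```

Applied to epochs data, such a helper could indeed yield a median-based "evoked" that is more robust to outlier epochs than the mean.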
@@ -0,0 +1,38 @@
# Authors: Dirk Gütlin <[email protected]>
don't put a big file like mne/preprocessing/tests/data/matlab_asr_data.mat in the repo. It should be in the mne-testing repo
Yes, as mentioned in the initial commit, this is just a preliminary thing. In case we want to keep this file in the testing procedure, I'll move it to the testing data repo...
Totally agree with that. I have the exact same opinion on a lot of the MATLAB code. I actually liked some of your implementations better as they looked way more "pythonic". However, I wanted to play it safe and therefore chose the path of direct equivalence...

@agramfort @nbara I agree that the method was generally developed to work with real-time streaming data. However, knowing that ASR/rASR was included as the standard EEG cleaning plugin for EEGLAB, there seems to be a point in applying it to offline data (at least to the EEGLAB devs). My personal opinion of ASR is that it looks very good if you inspect it visually, meaning that it seems to catch and interpolate noise very well while at the same time leaving clean segments alone.

P.S.: If anyone knows about some cleaning algorithms that would be interesting to include, feel free to mention them here :)
From: #7479 (comment)

Soo, here is a link to my comparison of a few automated cleaning methods: https://digyt.github.io/automated_EEG_cleaning_comparison/

It's far from exhaustive or perfectly valid, as that would require more varied data and more computing power (definitely going beyond an online Google Colab script). Other limitations are named in the notebook. However, as far as this allows me to draw conclusions, I would say the following about Artifact Subspace Reconstruction:
So, in brief I would say that ASR is no "superhuman" cleaning algorithm, but from my point of view it definitely has its use cases, and could fill some gaps especially in the Python cleaning landscape.
Thanks a lot for sharing this blog post!

Exposing these methods with a consistent API would certainly have value. Maybe not to be done in MNE-Python directly.

Regarding your conclusions based on decoding scores, I would suggest looking at the error bars on the cross-validation scores. You seem to have tested on a small dataset and I suspect there is quite some variance in the scores.

…
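For illustration of that suggestion (hypothetical names, sketch only), the fold-to-fold spread can be reported next to the mean decoding score:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

def decoding_score_with_error_bars(X, y, cv=5):
    """Return mean and standard deviation of decoding accuracy across CV folds."""
    scores = cross_val_score(LogisticRegression(max_iter=1000), X, y, cv=cv)
    return scores.mean(), scores.std()
```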
Yes, exactly. In performance comparison Table A, I printed the standard deviations next to the means. The datasets used are definitely too few to make stable conclusions about significance for most of the comparisons. My plan was to apply this to more datasets (and especially noisier ones), but as I do this in my free time, good datasets are hard to find... I already had an eye on mne-hcp like you did for AutoReject. However, this is still work to be done.
I don't know, I really would petition you @agramfort to generally take up as much functionality as possible in MNE. I'm aware that you guys have a lot of code to manage already and that there is way too much work for too few active contributors (and definitely people like me should involve ourselves more in the maintenance work), but the problem I see is that most of these small toolboxes with no stable maintainer body just die out at some point, and their functionality (even though still relevant) vanishes with a growing number of compatibility bugs. Just take issue #94 with pyriemann (which is already relatively big) as an example. That's just my view on things. I know you probably have a better and more resource-aware perspective on this, still I think it should be considered.

Edit: Still, if you think MNE is growing too big for using/maintaining/testing, I'd imagine it to be awesome to have all that cleaning stuff in one place, with a consistent API (like having multiple cleaning algorithms available in AutoReject, or maybe even an mne-cleaning toolbox). Just as wishful thinking... 😃
Keep us posted on your progress. I am indeed a bit reluctant to put more maintenance work on MNE. Hence why I pushed for autoreject in a separate project and am considering moving mne.connectivity into a new separate package, mne-connectivity.

…
Our analysis of newborn EEG showed better artifact removal using ASR than with conventional methods like ICA (because the data length is shorter and contaminated with non-stereotypical artifacts).
@DiGyt sorry for the slow movement here. We talked briefly at the MNE-Python dev meeting today about this and other PRs that are in a similar situation (e.g., #11234) and had some ideas about how to move forward, with the goal being to balance discoverability with maintenance burden. The gist of the idea is something like:
I think if we had a policy like this in place a year ago, your code could have been used by people for the last year. And we could have made sure it continued to work with MNE-Python code changes by helping with CI/testing/releasing aspects, which are often the hardest for newer contributors to grasp (and maintain) anyway. WDYT?
Hey @larsoner, no need to be sorry at all. Thank you very much for looking at the topic again :)

I also thought about what to do with the ASRpy code. For now, there's a simple version available under:

However, I would love to integrate it somewhere with a larger community and a larger platform, so the MNE-incubator seems like the right place for this. So what would be the correct move now? Should I just close this PR and reopen it under mne-tools/mne-incubator?
Once the transition to some other package is complete, you could upload a new point release to PyPI with a
I would have said yes, but in #7479 (comment) you mentioned that there is an existing implementation in https://github.com/nbara/python-meegkit that works. I'm inclined to promote that instead of creating yet another version of the algorithm in mne-incubator. @nbara thoughts on this? For context, read my post above. I'm happy if …

To me, one missing "requirement" for python-meegkit would be PyPI + conda-forge accessibility -- I think it's important for adoption, and it allows us to trivially add it to the mne-installers that are now promoted in our installation guide. I'm happy to put the infrastructure work above into python-meegkit as long as @nbara's on board, though!

Another option would be for @nbara to migrate some or all python-meegkit code to mne-incubator. If we went this route and @nbara volunteered to co-maintain mne-incubator, that would be great!
Hi @larsoner and @DiGyt, sorry for the late response.
I'm definitely committed to keep maintaining MEEGkit, and I definitely planned to add it to PyPI down the line (just haven't gotten around to it, really), so I'm definitely up for that.
I'm not against it; the only thing that bugs me is that I've tried to avoid adding MNE as a dependency of MEEGkit [https://github.com/nbara/python-meegkit/blob/master/requirements.txt]. All the code in MEEGkit was meant to operate on numpy arrays rather than MNE's Epochs/Raw etc. So if we go this route and I migrate the ASR code to mne-incubator, I will have to maintain a separate, "numpy-only" version of the code. Conversely, do you think we can come up with an implementation where the core of the algorithm is numpy/scipy functions only, and add the MNE compatibility within …? Either way, let me know. I'm not strongly opinionated either way and will not let this stop us from moving forward.

EDIT: I just added meegkit to pypi: https://pypi.org/project/meegkit/
Definitely. We already do this in MNE-Python. We have functions like tfr_morlet (for objects) and tfr_array_morlet (for arrays), and the object one is a wrapper for the array one.
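Something along these lines (names and placeholder body are hypothetical; only the wrapping pattern is the point):

```python
import mne

def asr_clean_array(data, sfreq, cutoff=20):
    """Numpy-only core operating on a (n_channels, n_times) array (placeholder logic)."""
    # ... calibration and reconstruction would happen here ...
    return data

def asr_clean_raw(raw, cutoff=20):
    """MNE-aware wrapper: unpack the Raw object, call the array core, repack the result."""
    cleaned = asr_clean_array(raw.get_data(), raw.info["sfreq"], cutoff=cutoff)
    return mne.io.RawArray(cleaned, raw.info)
```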
Any reason why? We've tried to keep MNE dependencies small over the years. If it's because of conda pulling in a ton of dependencies, you can depend on mne-base.
Well, mostly because most if not all of the functionalities in MEEGkit do not require MNE. But also because this was used as part of a research project, with colleagues who did not use MNE either. I didn't know about mne-base though, thanks for pointing it out.
@nbara would you be interested in migrating some or all of your code to mne-incubator?
I'm ok with moving the ASR code to mne-incubator. I don't know if you meant moving all of meegkit (?), but this will take significantly more time, so I'd rather we did it piece by piece.
I have not yet looked at what all is available there... Without looking I had in mind: anything that we want to make easily available to the community that we think/hope would be widely used, to be released in the installers / mne-incubator. Makes sense to move things over bit by bit as use cases come up!
Closing PR here since I think we'll want ASR in mne-incubator.
Reference issue
Fixes #7479.
What does this implement/fix?
Introduces Artifact Subspace Reconstruction for cleaning raw EEG.
Additional information
Sorry for the long wait since my mention in issue #6290.
I wanted to make sure that this algorithm was as equivalent as possible to the original ASR implementation in MATLAB.
@nbara, first of all thank you very much for your implementation, this was a great thing to start with and I could use many functions without any need to change them. Due to your contribution to the code, I mentioned you as an author in the corresponding files (but without the e-mail address). Are you okay with that or should I change it in some way?
Side note @nbara:
Although your code looks really solid on a theoretical level, most of the operations diverge from the original implementation. You are probably aware of most of the conceptual differences (using different functions or loops etc.), but I also think I found smaller mistakes like dot products on matrices where actually the transposed matrices were required (I think this happened somewhere in asr_calibrate or one of its helper functions). Just wanted to tell you this, since I think this can change the data drastically...

Equivalence to the original
In this implementation, I made sure that asr_calibrate and asr_process are perfectly equivalent to the original ASR implementation. However, there were several steps that introduce a slight divergence of the data. This is mostly due to different solvers used in MATLAB/Python. These steps are:
- fit_eeg_distribution
- clean_windows
Everything else in these functions should be entirely equivalent, leading to an average channel correlation of r=0.94-0.97 between the original version and this implementation (as compared to r~0.0 with uncleaned data or the ASR implementation this was based on).
Concerning testing, I am not sure how we should optimally proceed. For now I test the correlation of cleaned data with data cleaned with the original ASR. This is obviously not perfect, as it's not a precise assertion and requires additional external data for testing (for now I added this test data directly to preprocessing/tests/data, assuming that it will be changed anyway).
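As a rough illustration of that test (hypothetical helper, not the actual test code in the PR), the assertion boils down to a per-channel correlation against the MATLAB-cleaned reference:

```python
import numpy as np

def assert_close_to_matlab(cleaned_py, cleaned_matlab, min_corr=0.9):
    """Check per-channel correlations between Python- and MATLAB-cleaned data."""
    corrs = [np.corrcoef(ch_py, ch_mat)[0, 1]
             for ch_py, ch_mat in zip(cleaned_py, cleaned_matlab)]
    assert np.mean(corrs) > min_corr
```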
riemannian ASR
The riemannian variant of Artifact Subspace Reconstruction is not yet available, as it requires pyriemann (and probably pymanopt) as dependencies, and the current version of pyriemann has compatibility issues with the newer scipy versions (see pyRiemann/pyRiemann#94 and pyRiemann/pyRiemann#98). But as most people report no big difference between ASR/rASR anyway, this should be okay for now. However, I left a method parameter and notes to later adapt the riemannian variant more easily.
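One possible shape for that parameter (a sketch under assumptions; the actual signature in the PR may differ) is to fall back to the classic variant when pyriemann is missing, as the commit message above suggests:

```python
def asr_calibrate(data, sfreq, method="euclid"):
    """Calibrate ASR; method='riemann' is reserved for the riemannian variant."""
    if method == "riemann":
        try:
            import pyriemann  # noqa: F401  optional dependency
        except ImportError:
            method = "euclid"  # fall back to the classic variant when pyriemann is unavailable
    # classic (euclidean) calibration would continue here
    ...
```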