Enable graceful degradation for feature extractor #278

Closed
sgratiy opened this issue Aug 22, 2019 · 5 comments

Comments

@sgratiy
Contributor

sgratiy commented Aug 22, 2019

Currently, feature extraction reports a failure unless features can be computed for every sweep type. For instance, if the short squares have no spiking sweeps, feature extraction fails and no features are extracted at all.

Request: Change the feature extractor to return the features that could be computed rather than failing the entire extractor output. E.g., if short squares fail, still report features for long squares and ramps.
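A minimal sketch of the requested behaviour, assuming each sweep type has its own extractor function. All names here (`extract_available_features`, `long_square_features`, `short_square_features`) are hypothetical stand-ins for illustration, not ipfx functions:

```python
def long_square_features(sweeps):
    # Hypothetical stand-in for a real long square analysis.
    return {"n_sweeps": len(sweeps)}


def short_square_features(sweeps):
    # Hypothetical stand-in that fails when no sweep has spikes.
    spiking = [s for s in sweeps if s["n_spikes"] > 0]
    if not spiking:
        raise ValueError("no spiking short square sweeps")
    return {"n_spiking_sweeps": len(spiking)}


def extract_available_features(extractors, sweeps_by_type):
    """Run each sweep-type extractor independently and keep what succeeds."""
    features, failures = {}, {}
    for sweep_type, extractor in extractors.items():
        try:
            features[sweep_type] = extractor(sweeps_by_type[sweep_type])
        except Exception as exc:  # a real pipeline would catch narrower errors
            features[sweep_type] = None
            failures[sweep_type] = f"{type(exc).__name__}: {exc}"
    return features, failures


extractors = {
    "long_square": long_square_features,
    "short_square": short_square_features,
}
sweeps_by_type = {
    "long_square": [{"n_spikes": 5}, {"n_spikes": 0}],
    "short_square": [{"n_spikes": 0}],
}
features, failures = extract_available_features(extractors, sweeps_by_type)
```

Here the short square failure is recorded in `failures` while the long square features are still returned.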

@tmchartrand
Collaborator

Just wanted to add here that this is a very high-priority fix from my perspective, and I know Anatoly has come up against this in his analysis too. As a first pass, simply filling in failed features with a missing value would work. It would also be important eventually to record the specific failure in the feature extractor output, since these failures ultimately need to be treated as additional cell-level QC criteria by any downstream analysis.

@gouwens
Collaborator

gouwens commented Sep 12, 2019

Could the scope of this be clarified with some more details? There really isn't "the feature extraction" here - this is a library that enables the analysis of ephys features. What functions or scripts specifically are being talked about here? And should this be a package-level responsibility or a user-level responsibility?

@gouwens
Collaborator

gouwens commented Sep 12, 2019

Also, I'd like to caution that missing values could be ambiguous here if not used with care. For example, sometimes a sweep doesn't have an adaptation index because there was only one spike, not because it failed QC.
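To illustrate the ambiguity, here is a rough sketch that returns a status alongside the value so that "not applicable" and "failed QC" stay distinguishable. The `Status` enum and this `adaptation_index` function are hypothetical, and the formula is a simplified stand-in (mean normalized difference of consecutive interspike intervals), not necessarily the one the library uses:

```python
import math
from enum import Enum


class Status(Enum):
    OK = "ok"
    NOT_APPLICABLE = "not_applicable"
    FAILED_QC = "failed_qc"


def adaptation_index(spike_times, passed_qc=True):
    """Return (value, status); the value is NaN unless status is OK.

    Assumes spike_times is strictly increasing.
    """
    if not passed_qc:
        return math.nan, Status.FAILED_QC
    if len(spike_times) < 3:
        # Fewer than two interspike intervals: the feature is undefined.
        return math.nan, Status.NOT_APPLICABLE
    isis = [b - a for a, b in zip(spike_times, spike_times[1:])]
    diffs = [(b - a) / (b + a) for a, b in zip(isis, isis[1:])]
    return sum(diffs) / len(diffs), Status.OK


single_spike = adaptation_index([0.1])
failed_sweep = adaptation_index([0.1, 0.2, 0.4], passed_qc=False)
good_sweep = adaptation_index([0.1, 0.2, 0.4])
```

Both `single_spike` and `failed_sweep` carry NaN as the value, but only the status tells a downstream user why.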

@tmchartrand
Collaborator

@gouwens, I've talked with Sergey about this, and the context is specifically the pipeline feature extractor.

In that context, the fix is important to me because it enables a complete analysis of a given feature across a dataset by accessing precalculated results in LIMS tables or the associated json results records. Currently any such analysis will be incomplete, because no features at all are recorded for cells that hit a potentially minor failure during feature extraction. The cells that remain may therefore also form a subset that is biased in subtle ways.

Regarding missing values: it seems to me that as long as we consider the QC fail to be a true indicator that we shouldn't trust this sweep, it's fine to use the same value (NaN) for that as for a case where the feature is simply not applicable, or the feature code failed due to some unforeseen edge case. Ideally the specific reasons could be recorded (maybe in the cell json and as LIMS tags?), but in all cases the result is just no information on that feature for the given sweep or cell.
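As a sketch of how the reasons might be recorded next to the missing values in a json record: the record layout and the `missing_reasons` key below are assumptions for illustration, not the pipeline's actual output format. One detail worth noting is that Python's `json` module writes NaN as a bare `NaN` token by default, which is not valid JSON, so this sketch stores `null` instead:

```python
import json
import math


def to_json_record(features, missing_reasons):
    """Serialize features, replacing NaN with null and keeping the reasons."""
    clean = {
        name: None if isinstance(value, float) and math.isnan(value) else value
        for name, value in features.items()
    }
    record = {"features": clean, "missing_reasons": missing_reasons}
    # allow_nan=False makes json.dumps raise if any NaN slipped through.
    return json.dumps(record, allow_nan=False, sort_keys=True)


record_text = to_json_record(
    {"adaptation_index": math.nan, "avg_rate": 12.5},
    {"adaptation_index": "failed_qc"},
)
parsed = json.loads(record_text)
```

A reader of the record sees no value for `adaptation_index` either way, but can still look up why it is missing.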

The only exception I could see would be if we wanted to calculate some features despite certain QC failures, to allow users of the output to assess the impact of specific QC criteria. In your example it seems like it would be reasonable to calculate the adaptation despite a QC fail due to high noise, say, or at least to allow a user to override the QC pass restriction. That's probably a different discussion though, and not one I feel strongly about.

@wbwakeman

This should be addressed by PR #449, which has been merged into the released version of ipfx. It is now available from PyPI: pip install --upgrade ipfx

Please use that updated version and let us know if there are still issues that need to be addressed.
