Clean up masking/preprocessing in CBMAEstimator hierarchy #195
Comments
Sounds good to me.
KernelEstimators can currently return lists of images. Ultimately, I know you also want to allow KernelTransformers to return Datasets by default (per #41), which I think would be a good replacement for the list-of-images output type.
That is one way of calling the transformer right now. When a Dataset is passed in, the masker from the Dataset is used automatically. When a DataFrame is passed in, then the mask needs to be a separate argument because there's no way of inferring it. The problem is that, in the CBMAEstimators' permutation-based FWE correctors, we're replacing the coordinates in the dataset with randomized ones. It's just faster and less memory-intensive to work with DataFrames at that point than Datasets.
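The dual calling convention described above can be sketched roughly as follows. This is an illustrative mock, not the real NiMARE classes: the `Dataset` and `KernelTransformer` stand-ins, and the "convolution" that just counts coordinates per study, are placeholders for the actual kernel logic.

```python
import pandas as pd

class Dataset:
    """Minimal Dataset stand-in: coordinates plus an attached masker."""
    def __init__(self, coordinates, masker):
        self.coordinates = coordinates
        self.masker = masker

class KernelTransformer:
    """Illustrative sketch (not the real NiMARE class) of the dual
    calling convention: a Dataset carries its own masker, while a
    bare DataFrame needs the mask passed explicitly."""

    def transform(self, data, mask=None):
        if hasattr(data, "masker"):
            # Dataset-like input: infer the mask from the Dataset itself.
            mask = data.masker
            coords = data.coordinates
        else:
            # DataFrame of coordinates: the mask cannot be inferred.
            if mask is None:
                raise ValueError("A mask is required when passing a DataFrame.")
            coords = data
        # Stand-in for the actual kernel convolution: one "image" per
        # study, represented here as the per-study coordinate count.
        return coords.groupby("id").size().to_dict()

coords = pd.DataFrame({"id": ["s1", "s1", "s2"],
                       "x": [0, 1, 2], "y": [0, 0, 1], "z": [0, 1, 0]})
kt = KernelTransformer()
print(kt.transform(Dataset(coords, masker="mni_mask")))  # mask inferred
print(kt.transform(coords, mask="mni_mask"))             # mask explicit
```

Passing raw DataFrames in the permutation loop then avoids rebuilding a full Dataset object for every set of randomized coordinates.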
I can do that.
Minor update: I changed the …
I think I can set the masker at initialization, although the masker will still need to be inferred from the dataset in cases where the kernel transformer isn't initialized with one. I still don't want to set the masking procedure at initialization, for the reasons above. Other than that, I think everything else that's been requested will be handled in #320.
I think the situation has developed enough that this issue no longer applies, and no single PR will be responsible for closing it, so I'm going to close. We can reopen if necessary.
There's some redundancy in the masking/preprocessing done inside the `CBMAEstimator` subclasses that makes maintenance more difficult than it could be. We should probably clean this up and push as much of this stuff as possible into the base `CBMAEstimator` class. I realize there's some heterogeneity in `fit()` signatures that makes things a bit more complex, but we can probably handle that with either mixins or by adding something like a `CBMAPairwiseEstimator` child that expects two inputs.

Another thing I'm not super clear on is why we have a lot of calls of the form `self.kernel_estimator.transform(self.dataset, mask=self.mask, masked=True)`. IMO the masking parameters should be set on the `KernelTransformer` at initialization time, as with other classes. I think we should apply the same principle as elsewhere: if the `KernelTransformer` isn't initialized with a masker, then it will always take it from the `Dataset`. Since the dataset is always passed through from the `CBMAEstimator`, I think the above call can then be reduced to `self.kernel_estimator.transform(self.dataset)`, which is a much cleaner API and will cut down on a lot of redundancy. (As a minor aside, shouldn't it be `CBMAEstimator.kernel_transformer` rather than `CBMAEstimator.kernel_estimator`, for consistency with the `KernelTransformer` class?)

This probably shouldn't be touched until we merge #163 (which I'm working on right now), but I'm bringing it up here for comment.