dicarlo.BashivanKar2019 #39

jjpr-mit · 2020-08-24T06:00:43Z

No description provided.

mschrimpf · 2020-08-24T13:46:39Z

@stothe2 do you think it makes more sense to have many (>12) separate assemblies (@kohitij-kar suggested this) or merge those assemblies into one, but then having many NaN entries (that can again be filtered out by later splitting into separate assemblies)?
My intuition is that we should keep an assembly more self-contained, i.e. have one or two assemblies for this paper, so that we don't flood the assembly lookup.

kohitij-kar · 2020-08-24T13:53:46Z

@mschrimpf and @stothe2 — Just to directly register the conversation, I am pasting it below. Also, I don’t understand assemblies etc very well. I don’t think of data that way anyways. So I actually never made any specific suggestions: just explained what I thought is the case with this dataset. All I mean is please don’t consider my opinion very strongly during making decisions for this. Please do what you think is best here.

Jon: Each monkey looked a a completely different set of synthetic images, is that correct? If I made synthetic into one single DataAssembly, it would be mostly NANs. So, would it be appropriate to make three separate assemblies?

I assume I need to package curvature and naturalistic as one assembly each, too, to give context to the synthetic. So that's five assemblies total.

Until now the data I've packaged had already been chopped into tidy rectangular blocks. Where there are differing numbers of reps in a single assembly, do I chop off the excess, or leave some NANs in?

Ko: Here are my thoughts on the issues you mention.

Yes I think it’s best to treat each monkey, each session as separate.
Naturalistic images are same for both monkeys. But the curvature stimulus was different dependent on what receptive field we had for the neurons. I would either leave that out for now (and push this to later discussions) or include them as separate for each monkey.
I would use nans and not chop off any data
The problem may be that assemblies etc had been discussed so far with our usual data collection in mind. This was a very different type of experiment. So the data that exists is not completely consistent with everything else (yet I believe there is a way to make it such).

mschrimpf · 2020-08-24T14:05:57Z

thanks Ko! This data is actually a really good case for us to more thoroughly define what we mean by an "assembly" -- i.e. whether it is a recording session or rather an entire experiment.
I'm leaning towards the latter because I think that users will mostly want to retrieve data for an experiment rather than for individual sessions. (And having all the data for a paper in one package feels more clean)

kohitij-kar · 2020-08-24T14:12:58Z

I agree. I will never advocate for an assembly per recording session. That’s too much. But I think as you will see that beyond that — things will quickly get more complicated as we do different types of experiments (which I believe will be a marker of the lab actually making SoTA experimental progress and not just churning out the same old stuff at more quantity). For now, assemblies per dataset grouped by purpose of collection, e.g paper or specific analysis, sounds reasonable to me.

stothe2 · 2020-08-24T16:24:17Z

@jjpr-mit Looking at the code you pushed, Jon, one thing that could be done to reduce the number of assemblies is to combine the sessions (at least). So, Monkey M/OHP/Session1/Nat - Monkey M/OHP/Session1/Synth (since the naturalistic stimuli in a given monkey and session were used for synthesis for same monkey and session), etc.

As for rest of the discussion (single vs multiple assemblies), I'm ambivalent. I see both as a short term solution (like Ko said, we need to start thinking of experiments people will be doing in the next ~2 years -- amygdala, muscimol, stimulation -- and come up with a more intuitive solution for the heterogeneous data).

jjpr-mit · 2020-08-24T17:02:22Z

So, Monkey M/OHP/Session1/Nat - Monkey M/OHP/Session1/Synth

@stothe2 Yes, I thought about combining each synth/nat pair, since they align on the neuroid axis. I'd have to combine the synth and nat StimulusSets to do that, though, and they've got completely different metadata, so I settled for separating the stages of the experiment.

stothe2 · 2020-08-24T20:16:13Z

I'd have to combine the synth and nat StimulusSets to do that, though, and they've got completely different metadata

Fair point. It shouldn't matter for the assembly (since you would only include those columns which are common to both StimulusSets, for example, image_id), but I guess you're now dealing with NaNs in the StimulusSet (like dicarlo.BOLD5000). Even so, I would prefer having them combined since it makes it obvious to the user that they're meant to go together (i.e., we used responses to these naturalistic images for mapping, and then used the resulting mapped model for synthesising these other images).

jjpr-mit · 2020-09-02T20:01:11Z

only include those columns which are common to both StimulusSets

The metadata from the StimulusSet are automatically merged with the assembly on load.

Trying to combine two StimulusSets with different metadata into one just so we can combine pairs of assemblies introduces complexity, and reducing complexity is the point of combining. So I think it defeats the purpose. Until we upgrade DataAssembly to handle heterogeneous non-aligned datasets, we can't accommodate combining these assemblies.

jjpr-mit · 2020-09-02T21:26:12Z

@stothe2 @mschrimpf @kohitij-kar
I'm trying to figure out the right way to collect non-aligned chunks of data together. The authors of xarray say that DataArray and Dataset are not intended for that:

https://stackoverflow.com/questions/53754243/xarray-hierarchical-data-organization

Finally, note that xarray (currently) does not have any version of a hierarchical data structure for non-aligned axes. Aligned axes are an intentional constraint of the data models for xarray.Dataset and xarray.DataArray. If you have sub-groups that are not aligned along common axes, you'll need to keep track of them in some separate data structure.
answered Dec 18 '18 at 18:52 shoyer

also:

Dataset groups #1092
Feature Request: Hierarchical storage and processing in xarray #4118

I don't think we should hold up this PR to refactor BrainIO with a hierarchical data solution.

mschrimpf · 2020-09-03T00:45:05Z

yes agree, I thought what @stothe2 proposed was to have one assembly per session (with a bunch of NaNs from non-overlap -- i.e. no refactoring, just nans)

jjpr-mit · 2021-01-20T16:52:15Z

The shapes of the source data:

jjpr-mit · 2021-01-20T16:55:10Z

How synthetic sessions will be arranged as a single assembly (black is NaNs):

The image is reduced to a single repetition from each session for clarity.

…y two assemblies.

Also fix tests for dicarlo.BashivanKar2019 as only two assemblies * master: Rust305 (#51) Fix Kuzovkin 2018 (#50) Inplace (#47) dicarlo.Seibert2019 (#48) add ImageNet stimulus set (#45) Created exceptions in fetch and packaging for PropertyAssembly class that do not merge responses with stimulus_set (#42) Update lookup.csv (#44) Update Rajalingham2020 lookup (#43) Update lookup.csv (#41) use image_file_name without .png if present (#40) # Conflicts: # tests/test_assemblies.py # tests/test_stimuli.py

… Added NaNs to tests.

stothe2 · 2021-01-28T21:08:08Z

yes agree, I thought what @stothe2 proposed was to have one assembly per session (with a bunch of NaNs from non-overlap -- i.e. no refactoring, just nans)

From what it looks like, there are two NeuronAssemblies (and corresponding StimulusSets) -- one for naturalistic and one for synthetic. This works too!

Lookup and tests for BashivanKar2019.

2014d55

jjpr-mit requested a review from mschrimpf August 24, 2020 06:01

mschrimpf requested a review from stothe2 August 24, 2020 13:35

jjpr-mit added 4 commits January 20, 2021 13:11

Revert packaging of dicarlo.BashivanKar2019 so I can repackage as onl…

c34ee73

…y two assemblies.

Consolidated assemblies for synthetic data (dicarlo.BashivanKar2019).…

ffa5cfb

… Added NaNs to tests.

Trigger Travis.

5f9c05b

stothe2 approved these changes Jan 28, 2021

View reviewed changes

jjpr-mit requested review from stothe2 and removed request for mschrimpf and stothe2 February 3, 2021 22:48

stothe2 approved these changes Feb 3, 2021

View reviewed changes

mschrimpf approved these changes Feb 4, 2021

View reviewed changes

jjpr-mit merged commit ff93098 into master Feb 4, 2021

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

dicarlo.BashivanKar2019 #39

dicarlo.BashivanKar2019 #39

jjpr-mit commented Aug 24, 2020

mschrimpf commented Aug 24, 2020

kohitij-kar commented Aug 24, 2020 •

edited

Loading

mschrimpf commented Aug 24, 2020

kohitij-kar commented Aug 24, 2020

stothe2 commented Aug 24, 2020

jjpr-mit commented Aug 24, 2020

stothe2 commented Aug 24, 2020

jjpr-mit commented Sep 2, 2020

jjpr-mit commented Sep 2, 2020

mschrimpf commented Sep 3, 2020

jjpr-mit commented Jan 20, 2021

jjpr-mit commented Jan 20, 2021

stothe2 commented Jan 28, 2021

dicarlo.BashivanKar2019 #39

dicarlo.BashivanKar2019 #39

Conversation

jjpr-mit commented Aug 24, 2020

mschrimpf commented Aug 24, 2020

kohitij-kar commented Aug 24, 2020 • edited Loading

mschrimpf commented Aug 24, 2020

kohitij-kar commented Aug 24, 2020

stothe2 commented Aug 24, 2020

jjpr-mit commented Aug 24, 2020

stothe2 commented Aug 24, 2020

jjpr-mit commented Sep 2, 2020

jjpr-mit commented Sep 2, 2020

mschrimpf commented Sep 3, 2020

jjpr-mit commented Jan 20, 2021

jjpr-mit commented Jan 20, 2021

stothe2 commented Jan 28, 2021

kohitij-kar commented Aug 24, 2020 •

edited

Loading