Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Enhancing NWB support #853

Closed
talmo opened this issue Jul 21, 2022 Discussed in #851 · 4 comments
Closed

Enhancing NWB support #853

talmo opened this issue Jul 21, 2022 Discussed in #851 · 4 comments
Assignees
Labels
enhancement New feature or request

Comments

@talmo
Copy link
Collaborator

talmo commented Jul 21, 2022

We should support the features mentioned by @bendichter in our NWB I/O.

  1. Make the adapter more flexible.

    def write(self, filename: str, labels: Labels):

    This should accept a couple more things to look like:

    write(
        self,
        filename: str,
        overwrite: bool = False,
        session_description: str = "sleap",
        identifier: Optional[str] = None,  # if None, use a GUID like str(uuid.uuid4())
        session_start_time: Optional[datetime.datetime] = None,  # if None, use datetime.datetime.now()
    ):
        if session_start_time is None:
            session_start_time = datetime.datetime.now(datetime.timezone.utc)
        if identifier is None:
            identifier = uuid.uuid4()
        if Path(filename).exists() and not overwrite:
            with NWBHDF5IO(filename, mode="r") as io:
                nwb_file = io.read()
        else:
                nwb_file = NWBFile(
                    session_description=session_description,
                    identifier=identifier,
                    session_start_time=session_start_time,
            )
        # ...
        with NWBHDF5IO(path, mode="w" if overwrite else "a") as io:
            io.write(nwb_file)
  2. Create a high-level export API as a instance-level method on Labels with signature:

    def export_nwb(self, filename: str, overwrite: bool = False, session_description: str = "sleap", identifier: Optional[str] = None, session_start_time: Optional[datetime.datetime] = None):
        # Call the adapter to write self

Discussed in #851

Originally posted by bendichter July 21, 2022
I see you have made a lot of progress in supporting NWB lately! It looks like you are using the ndx-pose extension as intended. A few points I want to bring up

  1. Add append feature

Currently, a user is not able to set required metadata such as the session_start_time, identifier, and session_description. Instead, placeholder values are used here:

nwb_file = NWBFile(
session_description="session_description",
identifier="identifier",
session_start_time=datetime.datetime.now(datetime.timezone.utc),
)

with no way to overwrite them. This creates NWB files that are valid, but contains incorrect data. There are two ways to solve this. The first is to allow a user to pass in this metadata to the write function. The second would be to allow a user to create an NWB file on their own with the proper metadata and then append the sleap data to that file.

The second approach, allowing a user to append to an existing NWB file, also solves a different problem. NWB files are meant to store all of the streams of data in a single file, e.g. behavior + optical physiology or electrophysiology. If I wanted to put sleap and ephys data in an NWB file, since this is currently written without an append option, I'd be forced to first write the sleap data and then append the ephys data. This is technically doable but becomes a problem if any other data writer or converter has the same limitation of not being able to append. Because of this, it is best practice to allow for appending to an existing NWB file.

  1. Adding sleap to NWB software documentation

We keep track of software that can read and write NWB here. Would you mind submitting a PR here next to the entry for DeepLabCut so our users know that sleap is part of the NWB ecosystem?

  1. NWB Conversion Tools

We are developing NWB Conversion Tools which provide automated conversions from a large number of proprietary formats. It would be great if we could use this library to convert data that has already been written to NWB. It seems like your adapter system with read and write for several formats could be helpful. Does sleap have a standard output format?

@talmo talmo added the enhancement New feature or request label Jul 21, 2022
@CodyCBakerPhD
Copy link

Sounds great, and it was nice meeting you all today!

I just wanted to mention some technical details about operating on an NWBFile in append mode that we've learned from many past experiences...

The approach of

    if Path(filename).exists() and not overwrite:
        with NWBHDF5IO(filename, mode="r") as io:
            nwb_file = io.read()
    else:
            nwb_file = NWBFile(
                session_description=session_description,
                identifier=identifier,
                session_start_time=session_start_time,
        )
    # ...
    with NWBHDF5IO(path, mode="w" if overwrite else "a") as io:
        io.write(nwb_file)

when Path(filename).exists() = True and overwrite = False, the modified nwb_file object after the # ... stages will be pointing to the now-closed io from when it was first read from with NWBHDF5IO(filename, mode="r") as io: and this could cause issues when you finally hit the final io.write in the new io from with NWBHDF5IO(path, mode="w" if overwrite else "a") as io: in the final step.

Instead, the way we've successfully performed appending is to do so from within the respective context that that nwb_file object that was read.

Here are some code snippets that have evolved over the years for this purpose...

The oldest, probably least fancy way: https://github.com/catalystneuro/neuroconv/blob/cc1c2fb9b175eeabe026066901d2a8344585343a/src/nwb_conversion_tools/nwbconverter.py#L168-L181

which is basically what you have now except everything is combined into a single context and the # ... steps are performed inside that context.

The newer/fancier way we have is through our own custom designed context for making the 'append' vs. 'make a fresh NWBFile' decision behind the scenes: https://github.com/catalystneuro/neuroconv/blob/main/src/neuroconv/nwbconverter.py#L173-L184 + https://github.com/catalystneuro/neuroconv/blob/main/src/neuroconv/tools/nwb_helpers.py#L118-L177

Of course another alternative is to simply not use a context at all and just remember to call io.close() either at the end of the export or in an except clause wrapper in the event that there was an error anywhere in the process. I personally prefer contexts for maximum safety, though.

Oh, and another design principle I'd strongly suggest is making whatever the steps are within that # ... block an external function like add_content_to_nwbfile_object(nwbfile: pynwb.NWBFile, ...) so that all the SLEAP/ndx-pose details can be added to an in-memory NWBFile object separate from all the I/O handling. Such a helper function is probably what we would integrate directly with the corresponding DataInterface in NeuroConv since we typically like to handle the integration of multiple data streams all in a single io.write() call at the end of our process.

Anyway, looking forward to meeting some of you at Janelia next week!
Cheers,
Cody

@CodyCBakerPhD
Copy link

CodyCBakerPhD commented Jul 22, 2022

Yet another detail that I just recalled - any call to NWBHDF5IO(filename, mode="r") will also want to include the keyword argument load_namespaces=True otherwise it might fail to open an existing file that has extensions used within it.

@talmo
Copy link
Collaborator Author

talmo commented Jul 22, 2022

Excellent, thanks for the pointers @CodyCBakerPhD!! We'll be mindful of how we work with the IO handles.

@roomrys roomrys added the fixed in future release Fix or feature is merged into develop and will be available in future release. label Jul 23, 2022
@roomrys roomrys mentioned this issue Jul 24, 2022
@talmo
Copy link
Collaborator Author

talmo commented Jul 25, 2022

Added in v1.2.5.

@talmo talmo closed this as completed Jul 25, 2022
@roomrys roomrys removed the fixed in future release Fix or feature is merged into develop and will be available in future release. label Aug 15, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

3 participants