You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
I have a number of different NWB files that were created with the same code. Of these, a particular dataset takes an oddly long time to load in, but only in matnwb -- pynwb loads these just fine. For comparison here and here are two datasets that were generated with the same code. I don't think filesize is a factor here, but FWIW, this dataset takes three to four times longer to load than this one or this one, which are both much bigger. May be worth noting that the conversion code was written in python, and that I've confirmed with other people on other systems that this only happens in matnwb.
I ran the MATLAB profiler on the nwbRead function to try and figure out what was going on -- figure that might prove helpful, so attached it here.
MATLAB Version: 23.2.0.2515942 (R2023b) Update 7
MATLAB License Number: 40702059
Operating System: Microsoft Windows 10 Pro Version 10.0 (Build 19045)
Java Version: Java 1.8.0_202-b08 with Oracle Corporation Java HotSpot(TM) 64-Bit Server VM mixed mode
MATLAB Version 23.2 (R2023b) License 40702059
Simulink Version 23.2 (R2023b) License 40702059
Bioinformatics Toolbox Version 23.2 (R2023b) License 40702059
Computer Vision Toolbox Version 23.2 (R2023b) License 40702059
Data Acquisition Toolbox Version 23.2 (R2023b) License 40702059
Deep Learning HDL Toolbox Version 23.2 (R2023b) License 40702059
Deep Learning Toolbox Version 23.2 (R2023b) License 40702059
GPU Coder Version 23.2 (R2023b) License 40702059
Image Acquisition Toolbox Version 23.2 (R2023b) License 40702059
Image Processing Toolbox Version 23.2 (R2023b) License 40702059
MATLAB Coder Version 23.2 (R2023b) License 40702059
MATLAB Compiler Version 23.2 (R2023b) License 40702059
MATLAB Compiler SDK Version 23.2 (R2023b) License 40702059
MATLAB Test Version 23.2 (R2023b) License 40702059
Mapping Toolbox Version 23.2 (R2023b) License 40702059
Medical Imaging Toolbox Version 23.2 (R2023b) License 40702059
Optimization Toolbox Version 23.2 (R2023b) License 40702059
Parallel Computing Toolbox Version 23.2 (R2023b) License 40702059
Partial Differential Equation Toolbox Version 23.2 (R2023b) License 40702059
Reinforcement Learning Toolbox Version 23.2 (R2023b) License 40702059
Signal Processing Toolbox Version 23.2 (R2023b) License 40702059
SimBiology Version 23.2 (R2023b) License 40702059
Simulink Coder Version 23.2 (R2023b) License 40702059
Simulink Compiler Version 23.2 (R2023b) License 40702059
Statistics and Machine Learning Toolbox Version 23.2 (R2023b) License 40702059
Symbolic Math Toolbox Version 23.2 (R2023b) License 40702059
System Identification Toolbox Version 23.2 (R2023b) License 40702059
The profile.pdf is nice but unfortunately it seems like the slowness doesn't just come from the MatNWB implementation itself, but the internal hdf5 lib provided by MATLAB. I will have to benchmark this locally and find the places that these hdf5 libs are used.
Okay it's like I suspected: The issue is the slowness of H5R.get_name which is painfully slow (it takes 0.11 seconds per call and it is called 1925 times in the example file). Unfortunately, afaik this is the only way to retrieve the full HDF5 path name from the raw reference data which is important for the ObjectView and RegionView objects to search through the in-memory NwbFile references. Does h5py do something different with references that avoids this problem?
You could cache both the ref data as well as a path to the file and do what types.untyped.DataStub does. That would require a bit of breakage in the classes themselves but may not be that big of a change.
What happened?
I have a number of different NWB files that were created with the same code. Of these, a particular dataset takes an oddly long time to load in, but only in matnwb -- pynwb loads these just fine. For comparison here and here are two datasets that were generated with the same code. I don't think filesize is a factor here, but FWIW, this dataset takes three to four times longer to load than this one or this one, which are both much bigger. May be worth noting that the conversion code was written in python, and that I've confirmed with other people on other systems that this only happens in matnwb.
I ran the MATLAB profiler on the nwbRead function to try and figure out what was going on -- figure that might prove helpful, so attached it here.
Profiler.pdf
Steps to Reproduce
Error Message
No response
Operating System
Windows
Matlab Version
MATLAB Version: 23.2.0.2515942 (R2023b) Update 7
MATLAB License Number: 40702059
Operating System: Microsoft Windows 10 Pro Version 10.0 (Build 19045)
Java Version: Java 1.8.0_202-b08 with Oracle Corporation Java HotSpot(TM) 64-Bit Server VM mixed mode
MATLAB Version 23.2 (R2023b) License 40702059
Simulink Version 23.2 (R2023b) License 40702059
Bioinformatics Toolbox Version 23.2 (R2023b) License 40702059
Computer Vision Toolbox Version 23.2 (R2023b) License 40702059
Data Acquisition Toolbox Version 23.2 (R2023b) License 40702059
Deep Learning HDL Toolbox Version 23.2 (R2023b) License 40702059
Deep Learning Toolbox Version 23.2 (R2023b) License 40702059
GPU Coder Version 23.2 (R2023b) License 40702059
Image Acquisition Toolbox Version 23.2 (R2023b) License 40702059
Image Processing Toolbox Version 23.2 (R2023b) License 40702059
MATLAB Coder Version 23.2 (R2023b) License 40702059
MATLAB Compiler Version 23.2 (R2023b) License 40702059
MATLAB Compiler SDK Version 23.2 (R2023b) License 40702059
MATLAB Test Version 23.2 (R2023b) License 40702059
Mapping Toolbox Version 23.2 (R2023b) License 40702059
Medical Imaging Toolbox Version 23.2 (R2023b) License 40702059
Optimization Toolbox Version 23.2 (R2023b) License 40702059
Parallel Computing Toolbox Version 23.2 (R2023b) License 40702059
Partial Differential Equation Toolbox Version 23.2 (R2023b) License 40702059
Reinforcement Learning Toolbox Version 23.2 (R2023b) License 40702059
Signal Processing Toolbox Version 23.2 (R2023b) License 40702059
SimBiology Version 23.2 (R2023b) License 40702059
Simulink Coder Version 23.2 (R2023b) License 40702059
Simulink Compiler Version 23.2 (R2023b) License 40702059
Statistics and Machine Learning Toolbox Version 23.2 (R2023b) License 40702059
Symbolic Math Toolbox Version 23.2 (R2023b) License 40702059
System Identification Toolbox Version 23.2 (R2023b) License 40702059
Code of Conduct
The text was updated successfully, but these errors were encountered: