I'm using the new to_dataframe() function that was implemented in #380.
One issue I'm seeing is that loading some of the waveform signals from https://physionet.org/content/mimic3wdb-matched/1.0/ with to_dataframe() uses a lot of memory. Specifically, on the machine I'm running on, which has 96 GB of memory, reading the record and calling to_dataframe() runs out of memory.
I would like to lazily load the signal data into a chunked dataframe, which would let me process the waveform signals in parts that fit into memory rather than loading everything at once.
I worked around this by reading the header of the record, getting the signal length, and then building my own chunking process using rdrecord() with sampfrom and sampto. That unblocks me, but before I close this, it might be worth discussing what the maintainers think the solution is, and whether there should be a documented solution or approach to this problem.
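For anyone hitting the same problem, here is a minimal sketch of that kind of workaround. It is not an official wfdb feature: `chunk_bounds` and `iter_dataframe_chunks` are hypothetical helper names, the chunk size is arbitrary, and it assumes `sampto` is treated as an exclusive end index by `rdrecord()`.

```python
from typing import Iterator, Optional, Tuple


def chunk_bounds(sig_len: int, chunk_size: int) -> Iterator[Tuple[int, int]]:
    """Yield (sampfrom, sampto) pairs that cover samples 0..sig_len in order."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    for start in range(0, sig_len, chunk_size):
        # The last chunk is clipped so it never runs past the end of the signal.
        yield start, min(start + chunk_size, sig_len)


def iter_dataframe_chunks(record_name: str, chunk_size: int,
                          pn_dir: Optional[str] = None):
    """Hypothetical helper: yield one DataFrame per chunk of the record.

    Only one chunk is held in memory at a time, provided the caller does
    not keep references to earlier chunks.
    """
    import wfdb  # imported here so chunk_bounds can be used without wfdb

    # Read only the header to learn the total number of samples.
    header = wfdb.rdheader(record_name, pn_dir=pn_dir)
    for sampfrom, sampto in chunk_bounds(header.sig_len, chunk_size):
        record = wfdb.rdrecord(record_name, sampfrom=sampfrom,
                               sampto=sampto, pn_dir=pn_dir)
        yield record.to_dataframe()
```

Usage would look something like `for df in iter_dataframe_chunks(name, 1_000_000): process(df)`, where `process` is whatever per-chunk work is needed. Results have to be reduced or written out per chunk, since concatenating all chunks would bring back the original memory problem.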
Thanks @thomasdziedzic-calmwave, let's keep this issue open. I think it would be good to try to address the problem directly, perhaps as an argument to to_dataframe().
Should we consider adopting Dask dataframes? My understanding is that they are better able to handle datasets that are too large for RAM: https://docs.dask.org/en/stable/