Describe the bug, including details regarding any error messages, version, and platform.
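import pyarrow as pa
import pyarrow.feather

def load_my_data():
    path = "mydata.arrow"
    # Memory-map the file and read the table from the mapped source.
    with pa.memory_map(path) as f:
        t = pyarrow.feather.read_table(f)
    return t

t = load_my_data()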
I have the above script. My arrow file is uncompressed. I was hoping that I would be able to load it without copying, by memory-mapping.
The above doesn't seem to do that. I see no mention of my file path in /proc/{pid}/smaps. Moreover, if I create a large list of load_my_data() results, my resident memory usage goes up and up until I OOM.
Is this expected?
Thanks
Component(s)
Python
This is my fault. It turns out the files are actually compressed (I thought the default compression=None in pyarrow.feather.write_feather meant "no compression", not lz4).
It would be useful to have a flag that makes the reader raise an error if the file can't be memory-mapped.
kou changed the title from "MemoryMappedFile / memory_map feather table read creates a copy?" to "[Python] MemoryMappedFile / memory_map feather table read creates a copy?" on Dec 7, 2024