AttributeError: 'NoneType' object has no attribute 'mmap' #398

Open
kvablack opened this issue Apr 24, 2024 · 0 comments

This seems to happen at shutdown in any data pipeline that has NumPy arrays. Here is the full stacktrace:

INFO:absl:Process 0 exiting.
INFO:absl:Processing complete for process with worker_index 0
INFO:absl:Grain pool is exiting.
INFO:absl:Shutting down multiprocessing system.
INFO:absl:Shutting down multiprocessing system.
Exception ignored in: <function SharedMemoryArray.__del__ at 0x7e3b780a8a60>
Traceback (most recent call last):
  File "/home/black/micromamba/envs/trainpi/lib/python3.10/site-packages/grain/_src/python/shared_memory_array.py", line 139, in __del__
AttributeError: 'NoneType' object has no attribute 'mmap'
/home/black/micromamba/envs/trainpi/lib/python3.10/multiprocessing/resource_tracker.py:224: UserWarning: resource_tracker: There appear to be 1 leaked shared_memory objects to clean up at shutdown
  warnings.warn('resource_tracker: There appear to be %d '

Even if it's not an actual problem, it's a bit annoying because it overwhelms the logging output when you have many workers.
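
I don't know what line 139 of shared_memory_array.py actually does, but the message looks like the usual pattern of an exception raised inside __del__ after something the destructor needs has already been torn down; CPython swallows exceptions from __del__ and prints the "Exception ignored in ..." line instead of propagating them. A minimal standalone illustration of the pattern (not Grain code, just an assumption about what's going on):

class Holder:
    def __init__(self):
        # Simulate backing storage that has already been torn down.
        self.shm = None

    def __del__(self):
        # Raises AttributeError: 'NoneType' object has no attribute 'mmap',
        # which CPython reports as
        # "Exception ignored in: <function Holder.__del__ ...>"
        # rather than letting it propagate.
        self.shm.mmap.close()

h = Holder()
del h

So the AttributeError itself is presumably harmless; the problem is just the noise it adds.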

Here's the simplest possible repro:

import logging

import grain.python as grain
import numpy as np

logging.basicConfig(level=logging.INFO)

if __name__ == "__main__":
    # Minimal data source: every record is a NumPy array, which seems to be
    # what ends up in a SharedMemoryArray once worker processes are involved.
    class DataSource:
        def __len__(self):
            return 10

        def __getitem__(self, idx):
            return np.zeros(1)

    source = DataSource()
    sampler = grain.IndexSampler(
        num_records=len(source),
        num_epochs=1,
        shard_options=grain.NoSharding(),
        shuffle=False,
    )
    # worker_count=1 so records are produced in a child process; the error
    # shows up when the multiprocessing machinery shuts down.
    loader = grain.DataLoader(
        data_source=source,
        sampler=sampler,
        worker_count=1,
    )

    # Drain the loader; the "Exception ignored" messages appear at exit.
    for batch in loader:
        pass
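
For what it's worth, a possible way to quiet the "Exception ignored" lines while keeping everything else is to install a sys.unraisablehook (Python 3.8+) that drops this specific AttributeError. This is only a sketch: it assumes the message is printed by the main process (which I haven't verified), and it does nothing about the resource_tracker warning, which comes from a separate process.

import sys

_default_unraisablehook = sys.unraisablehook

def _quiet_shared_memory_del(unraisable):
    # Drop only the AttributeError coming from SharedMemoryArray.__del__;
    # forward every other unraisable exception to the default hook.
    if (unraisable.exc_type is AttributeError
            and "SharedMemoryArray.__del__" in repr(unraisable.object)):
        return
    _default_unraisablehook(unraisable)

sys.unraisablehook = _quiet_shared_memory_del

Obviously the real fix belongs in Grain; this is just to keep the logs readable in the meantime.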