RuntimeError: cannot cache function 'square': no locator available for file #4908

Open
99991 opened this issue Nov 29, 2019 · 9 comments
Labels
bug caching Issue involving caching

99991 commented Nov 29, 2019

When trying to import a numba-cached function from a python egg package, I get the following error message about no cache file locator being found:

Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "<frozen importlib._bootstrap>", line 971, in _find_and_load
  File "<frozen importlib._bootstrap>", line 955, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 656, in _load_unlocked
  File "<frozen importlib._bootstrap>", line 626, in _load_backward_compatible
  File "/home/username/.local/lib/python3.6/site-packages/mytestpackage-1.2.3-py3.6.egg/mytestpackage/mymodule.py", line 3, in <module>
  File "/home/username/.local/lib/python3.6/site-packages/numba/decorators.py", line 194, in wrapper
    disp.enable_caching()
  File "/home/username/.local/lib/python3.6/site-packages/numba/dispatcher.py", line 679, in enable_caching
    self._cache = FunctionCache(self.py_func)
  File "/home/username/.local/lib/python3.6/site-packages/numba/caching.py", line 614, in __init__
    self._impl = self._impl_class(py_func)
  File "/home/username/.local/lib/python3.6/site-packages/numba/caching.py", line 349, in __init__
    "for file %r" % (qualname, source_path))
RuntimeError: cannot cache function 'square': no locator available for file '/home/username/.local/lib/python3.6/site-packages/mytestpackage-1.2.3-py3.6.egg/mytestpackage/mymodule.py'

The bug occurs at least with numba versions 0.45.1 and 0.46.1.
The bug appears to be operating system-agnostic.
Exporting a user cache directory does not help: export NUMBA_CACHE_DIR=/tmp.
I believe that the various caching file locators fail because they assume that the python files are located in plain directories, when they can in fact be in an egg, for example here:

def from_function(cls, py_func, py_file):
    if not (os.path.exists(py_file) or getattr(sys, 'frozen', False)):
        # Perhaps a placeholder (e.g. "<ipython-XXX>")
        # stop function exit if frozen, since it uses a temp placeholder
        return
    self = cls(py_func, py_file)
    try:
        self.ensure_cache_path()
    except OSError:
        # Cannot ensure the cache directory exists or is writable
        return
    return self

To produce a minimal package and reproduce this bug, run this code in a shell:

echo "UEsDBBQDAAAAADd9fU8AAAAAAAAAAAAAAAAOAAAAbXl0ZXN0cGFja2FnZS9QSwMEFAMAAAAAvXx9TwAAAAAAAAAAAAAAABwAAABteXRlc3RwYWNrYWdlL215dGVzdHBhY2thZ2UvUEsDBAoDAAAAAMVqfU9D34g0AgAAAAIAAAAnAAAAbXl0ZXN0cGFja2FnZS9teXRlc3RwYWNrYWdlL19faW5pdF9fLnB5IApQSwMEFAMAAAgAuXx9TxKRbY9SAAAAUwAAACcAAABteXRlc3RwYWNrYWdlL215dGVzdHBhY2thZ2UvbXltb2R1bGUucHkdxbEKgCAQAND9vuJo0ohmEYI+oi0azE4y0OrywM8vesuL6Tq5YJa0OoDxv89HLKoJRgUz20U3HXrndxomFtKwUcDnFsekqraAH6YinLFiixVeUEsDBBQDAAAIALB8fU9A2V6/YwAAAI0AAAAWAAAAbXl0ZXN0cGFja2FnZS9zZXR1cC5weVWMSwqFMAxF51lF6EihCO857lqkaJSibUoTBXev+Bl4h+dcToiZiyILjIUjCumalXkRvM1NLI4hDV32/ewnEoCLVoDnko/kTNyVRJ+DsZfZqEjg5Myv+TftA9+G+xSr2kINB1BLAQI/AxQDAAAAADd9fU8AAAAAAAAAAAAAAAAOACQAAAAAAAAAEIDtQQAAAABteXRlc3RwYWNrYWdlLwoAIAAAAAAAAQAYAICaOR/DptUBgPSbIcOm1QGAmjkfw6bVAVBLAQI/AxQDAAAAAL18fU8AAAAAAAAAAAAAAAAcACQAAAAAAAAAEIDtQSwAAABteXRlc3RwYWNrYWdlL215dGVzdHBhY2thZ2UvCgAgAAAAAAABABgAgJBTl8Km1QEAizQiw6bVAYCQU5fCptUBUEsBAj8DCgMAAAAAxWp9T0PfiDQCAAAAAgAAACcAJAAAAAAAAAAggKSBZgAAAG15dGVzdHBhY2thZ2UvbXl0ZXN0cGFja2FnZS9fX2luaXRfXy5weQoAIAAAAAAAAQAYAIBuvZ6vptUBAIs0IsOm1QGAbr2er6bVAVBLAQI/AxQDAAAIALl8fU8SkW2PUgAAAFMAAAAnACQAAAAAAAAAIICkga0AAABteXRlc3RwYWNrYWdlL215dGVzdHBhY2thZ2UvbXltb2R1bGUucHkKACAAAAAAAAEAGACA3I6SwqbVAQCLNCLDptUBgNyOksKm1QFQSwECPwMUAwAACACwfH1PQNlev2MAAACNAAAAFgAkAAAAAAAAACCApIFEAQAAbXl0ZXN0cGFja2FnZS9zZXR1cC5weQoAIAAAAAAAAQAYAADebIjCptUBgJVG28Km1QEA3myIwqbVAVBLBQYAAAAABQAFACgCAADbAQAAAAA=" | base64 -d > mytestpackage.zip
unzip mytestpackage.zip
cd mytestpackage
python setup.py install --user
cd /tmp
python -c "import mytestpackage.mymodule"

Or alternatively, if you do not trust this base64-encoded zip-file, you can create a package manually. The directory structure is:

mytestpackage
├── mytestpackage
│   ├── __init__.py
│   └── mymodule.py
└── setup.py

The content of the file mymodule.py is:

import numba

@numba.njit("f8(f8)", cache=True)
def square(x):
    return x * x

The content of the file setup.py is:

from setuptools import setup, find_packages

setup(
    name="mytestpackage",
    version="1.2.3",
    packages=find_packages(),
)

The file __init__.py is empty.
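As an alternative to typing the files by hand, the layout above could be generated with a short script. This is only a sketch: the helper name `write_test_package` is made up for illustration, and the file contents are copied from the listing above.

```python
import tempfile
from pathlib import Path

# File contents copied from the issue text above.
MYMODULE = '''import numba

@numba.njit("f8(f8)", cache=True)
def square(x):
    return x * x
'''

SETUP = '''from setuptools import setup, find_packages

setup(
    name="mytestpackage",
    version="1.2.3",
    packages=find_packages(),
)
'''


def write_test_package(root):
    """Create the mytestpackage tree under `root` and return the project path.

    Hypothetical helper, not part of numba or setuptools.
    """
    project = Path(root) / "mytestpackage"
    package = project / "mytestpackage"
    package.mkdir(parents=True, exist_ok=True)
    (package / "__init__.py").write_text("")        # empty, as stated above
    (package / "mymodule.py").write_text(MYMODULE)
    (project / "setup.py").write_text(SETUP)
    return project


if __name__ == "__main__":
    # Write into a fresh temporary directory and show where it landed.
    print(write_test_package(tempfile.mkdtemp()))
```

After running it, `cd` into the printed directory and continue with `python setup.py install --user` as in the shell steps above.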

Often it might be preferable to compile the functions ahead of time during setup with numba.pycc if the required features are available (e.g. no parallel=True (yet)).

Another workaround is to install the package with pip install ., which will not create an egg, instead of python setup.py install.

stuartarchibald added the bug and caching labels on Nov 29, 2019
prisae added a commit to emsig/emg3d that referenced this issue on Jan 3, 2020

@candalfigomoro

Another workaround is to add zip_safe=False to the setup.py file of the packages that use numba caching.

@trianta2

trianta2 commented Oct 29, 2020

Indeed, the issue seems to be this line:

if not os.path.exists(py_file):

os.path.exists returns False for a path inside an egg/zip.
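The behaviour can be checked without numba. The sketch below builds a throwaway archive (the name `demo-1.0.egg` is made up) and probes a path of the same shape as the `__file__` shown in the traceback:

```python
import os
import tempfile
import zipfile


def path_inside_archive_exists():
    """Return (archive_exists, member_path_exists, member_in_archive)."""
    tmpdir = tempfile.mkdtemp()
    egg = os.path.join(tmpdir, "demo-1.0.egg")  # hypothetical archive name
    with zipfile.ZipFile(egg, "w") as zf:
        zf.writestr("demo/mymodule.py", "x = 1\n")

    # Same shape as the module path reported for code imported from an egg:
    # <archive file>/<package>/<module>.py
    member_path = os.path.join(egg, "demo", "mymodule.py")

    with zipfile.ZipFile(egg) as zf:
        in_archive = "demo/mymodule.py" in zf.namelist()

    return os.path.exists(egg), os.path.exists(member_path), in_archive


if __name__ == "__main__":
    print(path_inside_archive_exists())
```

The archive itself exists and contains the member, but the filesystem has no entry for the combined path, so any locator that starts with an `os.path.exists(py_file)` check gives up.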

My use case is including an egg with a PySpark job, which will be copied to executors running my numba code. I don't have the option to pip install my package on these executor nodes.

I was considering monkey patching numba and inserting my own custom locator here:

_locator_classes = [_UserProvidedCacheLocator,

Ultimately I decided on some magic inside my own package, where I infer if my package was imported as a zip/egg, and if so I unzip the package, modify sys.path, and reload my package.
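A rough sketch of that kind of unzip-and-reload step is shown below. It is not the code actually used in that package: the function names are invented, it is written to be called from outside the package rather than from its own `__init__.py`, and it extracts the whole archive to a temporary directory.

```python
import importlib
import os
import sys
import tempfile
import zipfile


def archive_containing(module_file):
    """Return the zip/egg file that holds `module_file`, or None if it is on disk."""
    path = os.path.abspath(module_file)
    while True:
        if os.path.isfile(path) and zipfile.is_zipfile(path):
            return path
        parent = os.path.dirname(path)
        if parent == path:  # reached the filesystem root
            return None
        path = parent


def reload_from_extracted(package_name, target_dir=None):
    """If `package_name` was imported from a zip/egg, extract it and re-import.

    Hypothetical helper: the package must already be imported.
    """
    module = sys.modules[package_name]
    archive = archive_containing(module.__file__)
    if archive is None:
        return module  # already imported from a plain directory

    target_dir = target_dir or tempfile.mkdtemp(prefix="unzipped_")
    with zipfile.ZipFile(archive) as zf:
        zf.extractall(target_dir)

    # Drop the zip-imported package and its submodules so that the next
    # import resolves against the extracted copy.
    for name in list(sys.modules):
        if name == package_name or name.startswith(package_name + "."):
            del sys.modules[name]

    # Put the extracted copy ahead of the archive on the import path.
    sys.path.insert(0, target_dir)
    importlib.invalidate_caches()
    return importlib.import_module(package_name)
```

Once the package is re-imported from a real directory, its modules have `__file__` paths that exist on disk, which is what the cache locators check for.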

@abin-tiger

abin-tiger commented May 25, 2022

For most users working with Docker images, setting ENV NUMBA_CACHE_DIR=/tmp should help (even though it didn't work for the OP).

@gmarkall
Member

Just to confirm, because this never got a response from a maintainer before, this is still an issue with numba main. Following the OP's steps above (which were clear and concise, many thanks @99991), I get:

$ python -c "import mytestpackage.mymodule"
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/home/gmarkall/mambaforge/envs/numbadev/lib/python3.10/site-packages/mytestpackage-1.2.3-py3.10.egg/mytestpackage/mymodule.py", line 4, in <module>
  File "/home/gmarkall/numbadev/numba/numba/core/decorators.py", line 212, in wrapper
    disp.enable_caching()
  File "/home/gmarkall/numbadev/numba/numba/core/dispatcher.py", line 863, in enable_caching
    self._cache = FunctionCache(self.py_func)
  File "/home/gmarkall/numbadev/numba/numba/core/caching.py", line 601, in __init__
    self._impl = self._impl_class(py_func)
  File "/home/gmarkall/numbadev/numba/numba/core/caching.py", line 337, in __init__
    raise RuntimeError("cannot cache function %r: no locator available "
RuntimeError: cannot cache function 'square': no locator available for file '/home/gmarkall/mambaforge/envs/numbadev/lib/python3.10/site-packages/mytestpackage-1.2.3-py3.10.egg/mytestpackage/mymodule.py'

I'm going to add this to the 0.57RC milestone as I think it's very tight to get anything into 0.56, but I do think it's important to look at resolving this.

gmarkall added this to the Numba 0.57 RC milestone on May 25, 2022

@hancelpv

Hi @gmarkall, when is the numba 0.57 release expected to come out? I'm facing the same issue in one of my projects and want to plan my work around the release timeline.

Also, please let me know if there is a workaround/alternate solution to this issue.
Many thanks!

@gmarkall
Member

I'd guess we're at least 6 months out from 0.57.

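Workarounds from this issue seem to be:

  • Install the package with pip install . instead of python setup.py install, so that no egg is created.
  • Add zip_safe=False to the setup.py of packages that use numba caching.
  • Compile ahead of time with numba.pycc, if the required features are available.
  • Unzip the package at import time, or monkey patch a custom locator into the list of cache locators.
  • For Docker images, set ENV NUMBA_CACHE_DIR=/tmp.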

If you go with the monkey patch route, I would imagine the locator class that gets monkey patched into the list of cache locators would probably be suitable to add to Numba permanently as a PR - if you go with this, please do let me know and we can work towards a PR.

@sureskn3

sureskn3 commented Aug 17, 2023

It seems this numba caching issue also occurs when using the timezonefinder package under PySpark.
python: 3.6
spark: 2.4.8
timezonefinder: 5.2.0

Issue reference: jannikmi/timezonefinder#206 (comment)

@FloWuenne

I am having this problem when this Docker container (https://github.com/BioContainers/containers/edit/master/cellpose/2.2.2/Dockerfile), which sets ENV NUMBA_CACHE_DIR=/tmp, is converted to a Singularity container. The error appears in a GitHub CI run when trying to run code within that Singularity container. Strangely, it does not happen when I run the code in Singularity within a Gitpod code environment...

Does anyone have any suggestions on how to fix the Docker container so that it also works in hosted Singularity tests?

@tbrittoborges

tbrittoborges commented Sep 13, 2024

> I am having this problem when this Docker container (https://github.com/BioContainers/containers/edit/master/cellpose/2.2.2/Dockerfile) that uses ENV NUMBA_CACHE_DIR=/tmp is converted to a singularity container. It gives the following error message in a Github CI when trying to run code within that singularity container. Strangely, this does not happen, when I run the code in singularity within a Gitpod code environment...

For people coming from Google search:

  • You can add SINGULARITYENV_NUMBA_CACHE_DIR="$TMPDIR" to your singularity call
  • On Nextflow you can also use singularity.runOptions = '--no-mount tmp --writable-tmpfs'
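When the container invocation cannot be changed, a similar effect might be achieved from Python by choosing a writable cache directory before numba is first imported. This is a sketch under assumptions: the helper name is invented, and it relies on numba picking up NUMBA_CACHE_DIR from the environment when it is imported. It targets the unwritable-cache-directory case in containers, not the egg case from the original report.

```python
import os
import tempfile


def ensure_writable_numba_cache_dir(candidates=None):
    """Point NUMBA_CACHE_DIR at the first writable candidate directory.

    Hypothetical helper, not part of numba. Returns the directory chosen.
    """
    if candidates is None:
        candidates = [
            os.environ.get("NUMBA_CACHE_DIR"),  # keep a working setting
            os.environ.get("TMPDIR"),
            tempfile.gettempdir(),
        ]
    for directory in candidates:
        if directory and os.path.isdir(directory) and os.access(directory, os.W_OK):
            os.environ["NUMBA_CACHE_DIR"] = directory
            return directory

    # Nothing suitable was found: fall back to a fresh private directory.
    directory = tempfile.mkdtemp(prefix="numba_cache_")
    os.environ["NUMBA_CACHE_DIR"] = directory
    return directory


if __name__ == "__main__":
    # Call this before the first `import numba` in the process.
    print(ensure_writable_numba_cache_dir())
```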
