[MRG] FEA Make it possible to pass pre-inspected modules to threadpool_limit #95

Merged
merged 20 commits into from
Sep 24, 2021
9 changes: 8 additions & 1 deletion CHANGES.md
@@ -1,6 +1,13 @@
2.3.0 (in development)
3.0.0 (in development)
======================

- New object `threadpoolctl.ThreadpoolController` which holds controllers for all the
supported native libraries. The state of these libraries is accessible through the
`info` method (equivalent to `threadpoolctl.threadpool_info()`) and their number of
threads can be limited with the `limit` method, which can be used as a context
manager (equivalent to `threadpoolctl.threadpool_limits()`). This is especially useful
to avoid searching through all loaded shared libraries each time.

- Fixed an attribute error when using old versions of OpenBLAS or BLIS that are
missing version query functions.
https://github.com/joblib/threadpoolctl/pull/88
59 changes: 47 additions & 12 deletions README.md
@@ -113,27 +113,62 @@ that are loaded when importing Python packages:
'version': None}]
```

In the above example, `numpy` was installed from the default anaconda channel and
comes with the MKL and its Intel OpenMP (`libiomp5`) implementation while
`xgboost` was installed from pypi.org and links against GNU OpenMP (`libgomp`)
so both OpenMP runtimes are loaded in the same Python program.
In the above example, `numpy` was installed from the default anaconda channel and comes
with MKL and its Intel OpenMP (`libiomp5`) implementation, while `xgboost` was installed
from pypi.org and links against GNU OpenMP (`libgomp`), so both OpenMP runtimes are
loaded in the same Python program.

The state of these libraries is also accessible through the object-oriented API:

```python
>>> from threadpoolctl import ThreadpoolController, threadpool_info
>>> from pprint import pprint
>>> import numpy
>>> controller = ThreadpoolController()
>>> pprint(controller.info())
[{'architecture': 'Haswell',
'filepath': '/home/jeremie/miniconda/envs/dev/lib/libopenblasp-r0.3.17.so',
'internal_api': 'openblas',
'num_threads': 4,
'prefix': 'libopenblas',
'threading_layer': 'pthreads',
'user_api': 'blas',
'version': '0.3.17'}]

>>> controller.info() == threadpool_info()
True
```
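
As a small illustration of working with this output, the sketch below picks the thread
count of each BLAS library out of the list of dicts. The `info` list is copied by hand
from the example output above rather than produced by a live call, and the variable
names are only illustrative:

```python
# Hand-copied from the example output above; a live call to
# controller.info() would return a list of dicts with these keys.
info = [
    {
        "architecture": "Haswell",
        "filepath": "/home/jeremie/miniconda/envs/dev/lib/libopenblasp-r0.3.17.so",
        "internal_api": "openblas",
        "num_threads": 4,
        "prefix": "libopenblas",
        "threading_layer": "pthreads",
        "user_api": "blas",
        "version": "0.3.17",
    }
]

# Map each BLAS implementation to its current number of threads.
blas_threads = {
    module["internal_api"]: module["num_threads"]
    for module in info
    if module["user_api"] == "blas"
}
print(blas_threads)
```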

### Setting the Maximum Size of Thread-Pools

Control the number of threads used by the underlying runtime libraries
in specific sections of your Python program:

```python
from threadpoolctl import threadpool_limits
import numpy as np
>>> from threadpoolctl import threadpool_limits
>>> import numpy as np

>>> with threadpool_limits(limits=1, user_api='blas'):
...     # In this block, calls to the BLAS implementation (like OpenBLAS or MKL)
...     # are limited to a single thread. They can thus be used jointly
...     # with thread-parallelism.
...     a = np.random.randn(1000, 1000)
...     a_squared = a @ a
```
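
The "limit inside the block, restore afterwards" behaviour can be pictured with a toy
context manager. This is only a sketch of the general pattern, not threadpoolctl's
implementation: the `pool` dict and the `limit` helper are hypothetical stand-ins for
a native thread pool.

```python
from contextlib import contextmanager

# Hypothetical stand-in for a native library's thread pool state.
pool = {"num_threads": 4}


@contextmanager
def limit(pool, limits):
    """Temporarily cap pool["num_threads"], restoring the old value on exit."""
    previous = pool["num_threads"]
    pool["num_threads"] = limits
    try:
        yield pool
    finally:
        # Restore the original limit even if the block raised.
        pool["num_threads"] = previous


with limit(pool, 1):
    inside = pool["num_threads"]
after = pool["num_threads"]
print(inside, after)
```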

The threadpools can also be controlled via the object-oriented API, which is especially
useful to avoid searching through all the loaded shared libraries each time. It will
however not act on libraries loaded after the instantiation of the
`ThreadpoolController`:

```python
>>> from threadpoolctl import ThreadpoolController
>>> import numpy as np
>>> controller = ThreadpoolController()

with threadpool_limits(limits=1, user_api='blas'):
# In this block, calls to blas implementation (like openblas or MKL)
# will be limited to use only one thread. They can thus be used jointly
# with thread-parallelism.
a = np.random.randn(1000, 1000)
a_squared = a @ a
>>> with controller.limit(limits=1, user_api='blas'):
...     a = np.random.randn(1000, 1000)
...     a_squared = a @ a
```
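
The caveat about libraries loaded after instantiation follows from the controller
inspecting the process once and keeping the result. The toy model below illustrates
that snapshot behaviour only. `SnapshotController` and the `loaded` list are
hypothetical and do not reflect how threadpoolctl discovers libraries:

```python
class SnapshotController:
    """Toy stand-in that records the libraries visible at construction time."""

    def __init__(self, loaded_libraries):
        # Copy the list so that later additions are not seen by this instance.
        self._libs = list(loaded_libraries)

    def info(self):
        return list(self._libs)


loaded = ["libopenblas"]                   # libraries present at start
controller = SnapshotController(loaded)

loaded.append("libgomp")                   # simulates a later import

stale = controller.info()                  # still the old snapshot
fresh = SnapshotController(loaded).info()  # a new instance sees both
print(stale, fresh)
```

In this model, creating a new controller after the late import is what picks up the
extra library.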

### Known Limitations
4 changes: 2 additions & 2 deletions continuous_integration/install.sh
@@ -76,7 +76,7 @@ python -m pip install -q -r dev-requirements.txt
bash ./continuous_integration/build_test_ext.sh

python --version
python -c "import numpy; print('numpy %s' % numpy.__version__)" || echo "no numpy"
python -c "import scipy; print('scipy %s' % scipy.__version__)" || echo "no scipy"
python -c "import numpy; print(f'numpy {numpy.__version__}')" || echo "no numpy"
python -c "import scipy; print(f'scipy {scipy.__version__}')" || echo "no scipy"

python -m flit install --symlink
4 changes: 2 additions & 2 deletions continuous_integration/install_with_blis.sh
@@ -51,7 +51,7 @@ CFLAGS=-I$ABS_PATH/BLIS_install/include/blis LDFLAGS=-L$ABS_PATH/BLIS_install/li
bash ./continuous_integration/build_test_ext.sh

python --version
python -c "import numpy; print('numpy %s' % numpy.__version__)" || echo "no numpy"
python -c "import scipy; print('scipy %s' % scipy.__version__)" || echo "no scipy"
python -c "import numpy; print(f'numpy {numpy.__version__}')" || echo "no numpy"
python -c "import scipy; print(f'scipy {scipy.__version__}')" || echo "no scipy"

python -m flit install --symlink
8 changes: 4 additions & 4 deletions tests/_openmp_test_helper/nested_prange_blas.pyx
@@ -20,7 +20,7 @@ IF USE_BLIS:
ELSE:
from scipy.linalg.cython_blas cimport dgemm

from threadpoolctl import _ThreadpoolInfo, _ALL_USER_APIS
from threadpoolctl import ThreadpoolController


def check_nested_prange_blas(double[:, ::1] A, double[:, ::1] B, int nthreads):
@@ -43,12 +43,12 @@ def check_nested_prange_blas(double[:, ::1] A, double[:, ::1] B, int nthreads):
int prange_num_threads
int *prange_num_threads_ptr = &prange_num_threads

inner_info = [None]
inner_controller = [None]

with nogil, parallel(num_threads=nthreads):
if openmp.omp_get_thread_num() == 0:
with gil:
inner_info[0] = _ThreadpoolInfo(user_api=_ALL_USER_APIS)
inner_controller[0] = ThreadpoolController()

prange_num_threads_ptr[0] = openmp.omp_get_num_threads()

@@ -62,4 +62,4 @@ def check_nested_prange_blas(double[:, ::1] A, double[:, ::1] B, int nthreads):
&alpha, &B[0, 0], &k, &A[i * chunk_size, 0], &k,
&beta, &C[i * chunk_size, 0], &n)

return np.asarray(C), prange_num_threads, inner_info[0]
return np.asarray(C), prange_num_threads, inner_controller[0]
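
The helper stores the controller in a one-element list (`inner_controller = [None]`)
so that a value created inside the parallel region can be read after it ends. A rough
pure-Python analogue of that "mutable cell" pattern, using `threading` instead of
OpenMP, might look like the sketch below. The `run_workers` function is hypothetical
and is not part of the test suite:

```python
import threading


def run_workers(n_workers):
    # One-element list used as a shared mutable cell, like inner_controller.
    captured = [None]

    def worker(thread_id):
        # Only the first worker records a value, mirroring the
        # omp_get_thread_num() == 0 check in the Cython helper.
        if thread_id == 0:
            captured[0] = f"inspected by thread {thread_id}"

    threads = [
        threading.Thread(target=worker, args=(i,)) for i in range(n_workers)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return captured[0]


result = run_workers(4)
print(result)
```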