6.11.1: Support direct linkage to BLAS libraries

@honnibal released this 20 May 16:56

✨ New features and improvements

  • Thinc now vendors OpenBLAS's cblas_sgemm function and delegates matrix multiplications to it by default. The bundled function is single-threaded, which makes it easy to call Thinc from multiple processes. The default sgemm function can be overridden using the THINC_BLAS environment variable (see below).
  • thinc.neural.util.get_ops now understands device integers, e.g. 0 for GPU 0, as well as strings like "cpu" and "cupy".
  • Update the StaticVectors model to make use of spaCy v2.0's Vectors class.
  • New .gemm() method on the NumpyOps and CupyOps classes, allowing matrix and vector multiplication to be handled with a single function.
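
As a rough illustration of what a gemm-style call computes, the pure-Python sketch below multiplies two matrices and can optionally transpose either operand first. The function name gemm_reference and the trans1/trans2 flags are assumptions made for this example. This is not Thinc's implementation, and the exact signature of the .gemm() method should be checked against the Thinc source.

```python
def gemm_reference(x, y, trans1=False, trans2=False):
    """Return op(x) @ op(y) for matrices stored as nested lists.

    Hypothetical helper for illustration only: trans1/trans2
    transpose the first/second operand before multiplying.
    """
    a = [list(col) for col in zip(*x)] if trans1 else x
    b = [list(col) for col in zip(*y)] if trans2 else y
    if len(a[0]) != len(b):
        raise ValueError("inner dimensions do not match")
    # Each output cell is the dot product of a row of a and a column of b.
    return [[sum(p * q for p, q in zip(row, col)) for col in zip(*b)]
            for row in a]


weights = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]  # shape (3, 2)
identity = [[1.0, 0.0], [0.0, 1.0]]             # shape (2, 2)

product = gemm_reference(weights, identity)           # shape (3, 2)
gram = gemm_reference(weights, weights, trans1=True)  # shape (2, 2)
```

Folding the transposition into the call means the caller does not have to build transposed copies for the common "weights transposed times inputs" patterns.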

Customizing the matrix multiplication backend

Previous versions of Thinc have relied on numpy for matrix multiplications. When numpy is installed via wheel using pip (the default), numpy will usually be linked against a suboptimal matrix multiplication kernel. This made it difficult to ensure that Thinc was well optimized for the target machine.

To fix this, Thinc now provides its own matrix multiplications by bundling the source code for OpenBLAS's sgemm kernel within the library. To change the default BLAS library, set the THINC_BLAS environment variable at install time, giving the location of the shared library you want to link against:

THINC_BLAS=/opt/openblas/lib/libopenblas.so pip install thinc --no-cache-dir --no-binary thinc
export LD_LIBRARY_PATH=/opt/openblas/lib
# On OSX:
# export DYLD_LIBRARY_PATH=/opt/openblas/lib
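
Before building against a custom library, it can help to confirm that the file loads and exposes the cblas_sgemm symbol. The sketch below does this with Python's ctypes module. The helper name exports_cblas_sgemm is hypothetical, the path is the example location used above, and the premise that cblas_sgemm is the symbol Thinc needs from the replacement library is an assumption based on the description in these notes.

```python
import ctypes
import os


def exports_cblas_sgemm(lib_path):
    """Return True if the shared library loads and exposes cblas_sgemm."""
    if not os.path.isfile(lib_path):
        return False
    try:
        lib = ctypes.CDLL(lib_path)
    except OSError:
        # The file exists but could not be loaded as a shared library.
        return False
    # ctypes raises AttributeError for missing symbols, so hasattr works here.
    return hasattr(lib, "cblas_sgemm")


# Example path from the install command above; adjust for your machine.
blas_path = "/opt/openblas/lib/libopenblas.so"
print(blas_path, exports_cblas_sgemm(blas_path))
```

This check only applies to shared libraries; a static archive cannot be loaded this way.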

If you want to link against the Intel MKL instead of OpenBLAS, the easiest way is to install Miniconda. For instance, if you installed Miniconda to /opt/miniconda, the command to install Thinc linked against MKL would be:

THINC_BLAS=/opt/miniconda/numpy-mkl/lib/libmkl_rt.so pip install thinc --no-cache-dir --no-binary thinc
export LD_LIBRARY_PATH=/opt/miniconda/numpy-mkl/lib
# On OSX:
# export DYLD_LIBRARY_PATH=/opt/miniconda/numpy-mkl/lib

If the library file ends in a .a extension, it is linked statically; if it ends in .so, it is linked dynamically. If you use dynamic linking, make sure the library's directory is on your LD_LIBRARY_PATH at runtime.
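
The extension rule and the runtime path requirement can be sketched as two small checks. Both function names are hypothetical and are not part of Thinc. The sketch covers only the .a and .so cases described above and assumes a POSIX-style, colon-separated LD_LIBRARY_PATH.

```python
import os


def linkage_kind(lib_path):
    """Classify linkage from the file extension, following the rule above."""
    if lib_path.endswith(".a"):
        return "static"
    if lib_path.endswith(".so"):
        return "dynamic"
    return "unknown"


def on_runtime_path(lib_path, env=None):
    """Return True if the library's directory is listed in LD_LIBRARY_PATH."""
    env = os.environ if env is None else env
    lib_dir = os.path.dirname(os.path.abspath(lib_path))
    entries = env.get("LD_LIBRARY_PATH", "").split(":")
    return any(os.path.abspath(entry) == lib_dir for entry in entries if entry)


lib = "/opt/openblas/lib/libopenblas.so"
if linkage_kind(lib) == "dynamic" and not on_runtime_path(lib):
    print("Add", os.path.dirname(lib), "to LD_LIBRARY_PATH before importing thinc")
```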

🔴 Bug fixes

  • Fix pickle support for FeatureExtracter class.
  • Fix unicode error in Quora dataset loader.
  • Fix batch normalization bugs. Now supports batch "renormalization" correctly.
  • Models now reliably distinguish predict vs. train modes, using the convention drop=None. Previously, layers such as BatchNorm relied on having their predict() method called, which didn't work when they were called by layers that didn't implement a predict() method. We now set drop=None to make this more reliable.
  • Fix bug that caused incorrect data types to be produced by FeatureExtracter.
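
The drop=None convention described above can be illustrated with a toy example. The classes ToyDropout and Wrapper are hypothetical and only loosely modeled on the begin_update(X, drop=...) calling style; they are not Thinc classes, and returning None as the backprop callback in predict mode is an assumption of this sketch.

```python
import random


class ToyDropout:
    """Hypothetical layer that treats drop=None as predict mode."""

    def __init__(self, seed=0):
        self.rng = random.Random(seed)

    def begin_update(self, inputs, drop=0.0):
        if drop is None:
            # Predict mode: pass inputs through unchanged, nothing to backprop.
            return list(inputs), None
        keep = 1.0 - drop
        # Kept units are rescaled by 1/keep; dropped units are zeroed.
        mask = [1.0 / keep if self.rng.random() < keep else 0.0 for _ in inputs]
        outputs = [x * m for x, m in zip(inputs, mask)]

        def backprop(d_outputs):
            return [d * m for d, m in zip(d_outputs, mask)]

        return outputs, backprop


class Wrapper:
    """Hypothetical wrapper that has no predict() method of its own."""

    def __init__(self, layer):
        self.layer = layer

    def begin_update(self, inputs, drop=0.0):
        # Forwarding drop is enough to carry the mode to the inner layer.
        return self.layer.begin_update(inputs, drop=drop)


model = Wrapper(ToyDropout())
predicted, no_backprop = model.begin_update([1.0, 2.0], drop=None)
trained, backprop = model.begin_update([1.0, 2.0], drop=0.0)
```

Because the mode travels with the drop argument, the inner layer sees it even when the enclosing layer only implements begin_update().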

👥 Contributors

Thanks to @dvsrepo, @justindujardin, @alephmelo and @darkdreamingdan for the pull requests and contributions.