BUG: np.linalg.eigh(complex) give wrong eigenvectors with accelerate and netlib blas #24640

Open
husisy opened this issue Sep 5, 2023 · 8 comments

Comments

@husisy

husisy commented Sep 5, 2023

Describe the issue:

Reproducible only on macOS arm64 (Apple silicon M2); not reproducible on Ubuntu 22.04 (AMD R7).

Calling np.linalg.eigh on a complex Hermitian matrix returns wrong eigenvectors with Accelerate and netlib BLAS, but correct ones with OpenBLAS.

env    blas        eigh
env00  accelerate  wrong
env01  netlib      wrong
env02  openblas    correct

conda create -y -n env00 -c conda-forge "libblas=*=*accelerate" numpy

conda create -y -n env01 -c conda-forge "libblas=*=*netlib" numpy

# default if no blas specified
conda create -y -n env02 -c conda-forge "libblas=*=*openblas" numpy

Possibly related issue: #21950

Reproduce the code example:

import numpy as np
np0 = np.array([[0,1j],[-1j,0]])
EVL,EVC = np.linalg.eigh(np0)
print(EVC)
print(EVC.T.conj() @ EVC)

Error message:

# wrong (accelerate, netlib): the first eigenvector has the correct direction but is not normalized; the second has the wrong direction
# [[-0.70710678-0.70710678j  0.70710678+1.20710678j]
#  [ 0.70710678-0.70710678j  0.5       -0.70710678j]]
# [[ 2.        +0.j  -0.5       -0.5j]
#  [-0.5       +0.5j  2.70710678+0.j ]]

# correct (openblas)
# [[-0.70710678+0.j          0.70710678+0.j        ]
#  [ 0.        -0.70710678j  0.        -0.70710678j]]
# [[1.00000000e+00+0.j 2.23711432e-17+0.j]
#  [2.23711432e-17+0.j 1.00000000e+00+0.j]]
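The claim in the comment above can be checked by hand from the printed numbers alone. The following is a minimal sketch in plain Python (no NumPy needed) that computes the residual of A v = λ v and the norm of each printed column. The eigenvalues -1 and +1 are an assumption here, since EVL is not printed in the report; they are the known eigenvalues of this matrix, in the ascending order eigh uses. The helper names are illustrative only.

```python
import math

# Matrix from the report: Hermitian, with eigenvalues -1 and +1.
A = [[0, 1j], [-1j, 0]]

def matvec(m, v):
    """Multiply a square matrix (nested lists) by a vector."""
    n = len(v)
    return [sum(m[i][k] * v[k] for k in range(n)) for i in range(n)]

def norm(v):
    """Euclidean norm of a complex vector."""
    return math.sqrt(sum(abs(x) ** 2 for x in v))

def residual(m, lam, v):
    """Norm of m @ v - lam * v; near zero when (lam, v) is an eigenpair."""
    mv = matvec(m, v)
    return norm([mv[i] - lam * v[i] for i in range(len(v))])

s = 0.70710678
# Columns of EVC as printed with accelerate/netlib.
bad = [[-s - s * 1j, s - s * 1j], [s + 1.20710678j, 0.5 - s * 1j]]
# Columns of EVC as printed with openblas.
good = [[-s + 0j, -s * 1j], [s + 0j, -s * 1j]]
# Assumed eigenvalues (EVL is not shown in the report).
eigenvalues = [-1.0, 1.0]

for label, cols in (("accelerate/netlib", bad), ("openblas", good)):
    for lam, v in zip(eigenvalues, cols):
        print(f"{label}: lambda={lam:+.0f} "
              f"residual={residual(A, lam, v):.3f} norm={norm(v):.3f}")
```

A residual near zero with a norm other than 1 means the right direction but a missing normalization; a large residual means the vector is not an eigenvector for that eigenvalue at all.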

Runtime information:

import sys, numpy
print(numpy.__version__); print(sys.version)
## same for all: env00 env01 env02
# 1.25.2
# 3.11.5 | packaged by conda-forge | (main, Aug 27 2023, 03:33:12) [Clang 15.0.7 ]

Context for the issue:

No response

@husisy
Author

husisy commented Sep 7, 2023

Solved in 1.26.

conda create -y -n env00 -c conda-forge "libblas=*=*accelerate" pip
conda activate env00
pip install numpy

conda create -y -n env01 -c conda-forge "libblas=*=*netlib" pip
conda activate env01
pip install numpy

Update (2023-10-21): changed pip install numpy=1.26rc1 to pip install numpy (pip installs the latest numpy by default, so the version does not need to be specified).

@husisy husisy closed this as completed Sep 7, 2023
@cosmic-latte

I reproduced this bug on an M2 Mac with numpy 1.26.0, so it seems it has not been resolved. Or is there something wrong with my installation method?

conda env create -f environment.yml

environment.yml content:

name: foundation_accel
channels:
  - conda-forge
dependencies:
  - python=3.11
  - libblas=*=*accelerate
  - nc-time-axis
  - xarray
  - matplotlib
  - bottleneck
  - dask
  - seaborn
  - netcdf4
  - scipy
  - jupyterlab
  - ipympl
  - mpl-probscale
  - pytables
  - scikit-image
  - scikit-learn
  - plotly
  - vega_datasets
  - altair
  - tqdm
  - palettable
  - scienceplots
  - joypy
  - openpyxl
prefix: /Users/fff8e7/anaconda3/envs/foundation_accel

numpy version: 1.26.0 with accelerate blas

The example code runs without problems in another conda environment that has numpy 1.26.0 with openblas.

@husisy
Author

husisy commented Oct 21, 2023

Solved in 1.26.

conda create -y -n env00 -c conda-forge "libblas=*=*accelerate" pip
conda activate env00
pip install numpy

conda create -y -n env01 -c conda-forge "libblas=*=*netlib" pip
conda activate env01
pip install numpy

Update (2023-10-21): changed pip install numpy=1.26rc1 to pip install numpy (pip installs the latest numpy by default, so the version does not need to be specified).

Please try the commands above to create the environment (env00 or env01). I guess you installed numpy from the conda-forge channel, as in env02 below:

conda create -y -n env02 -c conda-forge "libblas=*=*accelerate" pip numpy

In that case I can reproduce the bug too. I think this is a bug on the conda-forge side.

@cosmic-latte

After I installed numpy using pip, there was no problem. Now it seems that there is a problem with numpy provided by conda-forge. Thank you for your reply!

@cosmic-latte

There again,

Even though the environment is created with conda create -y -n env00 -c conda-forge "libblas=*=*accelerate" pip, the numpy that pip installs (pip install numpy) is still built on openblas.

Running np.show_config() in the env00 environment shows that numpy is actually built against openblas.

The relevant part of the output:

"Build Dependencies": {
    "blas": {
      "name": "openblas64",
      "found": true,
      "version": "0.3.23.dev",
      "detection method": "pkgconfig",
      "include directory": "/opt/arm64-builds/include",
      "lib directory": "/opt/arm64-builds/lib",
      "openblas configuration": "USE_64BITINT=1 DYNAMIC_ARCH=1 DYNAMIC_OLDER= NO_CBLAS= NO_LAPACK= NO_LAPACKE= NO_AFFINITY=1 USE_OPENMP= SANDYBRIDGE MAX_THREADS=3",
      "pc file directory": "/usr/local/lib/pkgconfig"
    },
    "lapack": {
      "name": "dep4347366432",
      "found": true,
      "version": "1.26.1",
      "detection method": "internal",
      "include directory": "unknown",
      "lib directory": "unknown",
      "openblas configuration": "unknown",
      "pc file directory": "unknown"
    }
  },

So I'm a little confused as to what the problem is.
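To make the check above repeatable across environments, here is a minimal sketch that pulls the BLAS and LAPACK names out of a configuration dictionary shaped like the output quoted above. The helper name linked_blas_lapack is hypothetical. The guarded call to np.show_config(mode="dicts") assumes a NumPy version that accepts that argument, and is skipped otherwise. Note that this reports what NumPy was built against, which may not tell the whole story if the BLAS library is swapped at runtime.

```python
def linked_blas_lapack(config):
    """Return the BLAS/LAPACK names recorded under "Build Dependencies"."""
    deps = config.get("Build Dependencies", {})
    return {key: deps.get(key, {}).get("name", "unknown")
            for key in ("blas", "lapack")}

# Trimmed copy of the output quoted in this comment.
sample = {
    "Build Dependencies": {
        "blas": {"name": "openblas64", "found": True},
        "lapack": {"name": "dep4347366432", "found": True},
    }
}
print(linked_blas_lapack(sample))

try:
    import numpy as np
    # Assumption: this NumPy accepts mode="dicts" and returns a dict.
    print(linked_blas_lapack(np.show_config(mode="dicts")))
except (ImportError, TypeError):
    # NumPy is not installed, or is too old to accept mode="dicts".
    pass
```

Running this in each environment (env00, env01, env02) gives a one-line answer to which BLAS the installed numpy was built with.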

@husisy
Author

husisy commented Oct 22, 2023

I apologize for the oversight in my previous response. I've since learned that pip install numpy utilizes openblas. I'm uncertain about the root of the issue. I'll re-open this for more experienced contributors to investigate.

@husisy husisy reopened this Oct 22, 2023
@cosmic-latte
Copy link

Since the problem does not occur in builds against OpenBLAS or Intel MKL, I suspect that the bug comes from the upstream libraries: Apple Accelerate and netlib BLAS.

From what I have read, these BLAS libraries can indeed give incorrect results. I am posting this link here as an example. Note that the bug described in the external link may NOT be related to the bug in this issue; that needs further investigation. If the bug is determined to come from the upstream libraries, it should be reported to Apple Inc. and to netlib BLAS.

@cosmic-latte

#25007
