Add cuQuantum-based TFQ native ops and layers. #770

Closed
wants to merge 123 commits into from

Conversation

jaeyoo
Member

@jaeyoo jaeyoo commented May 15, 2023

This is a large change: it resolves some dependency errors, rewrites documentation, fixes internal bugs in the random ops, and so forth.

Please look at the PR descriptions in my repo to understand the procedure: https://github.com/jaeyoo/quantum/pulls?q=is%3Apr+is%3Aclosed

Sinestro38 and others added 30 commits March 29, 2023 06:18
Upgrade bazel version 5.3.0 and fix some typo in tf version (tensorflow#755)
@review-notebook-app

Check out this pull request on  ReviewNB

See visual diffs & provide feedback on Jupyter Notebooks.


Powered by ReviewNB

@jaeyoo jaeyoo requested a review from MichaelBroughton May 15, 2023 18:05
@@ -11,7 +11,7 @@
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================
Collaborator

Would you be able to revert the files where little to no change was made to the actual code, so that the diff view is easier to read?

Collaborator

Sure, let me update the PR soon :) thank you Michael.

@Sinestro38
Contributor

Sinestro38 commented May 20, 2023

Breaking Changes

Major Features and Improvements

  • Significant performance improvements by introducing cuQuantum support for circuit execution on Nvidia GPUs:
    • TensorFlow Quantum Keras layers can now be executed on GPU by setting the optional argument use_cuquantum=True at layer instantiation. Examples:
      • tfq.layers.Expectation(use_cuquantum=True)
      • tfq.layers.SampledExpectation(use_cuquantum=True) (note that the cuQuantum runtime is unsupported for any noisy circuit operations)
      • tfq.layers.State(use_cuquantum=True)
      • tfq.layers.Sample(use_cuquantum=True)
      • tfq.layers.PQC(model_circuit, operators, use_cuquantum=True)
      • tfq.layers.ControlledPQC(model_circuit, operators, use_cuquantum=True)
    • Important notes:
      • cuQuantum execution is currently supported only for source distributions, meaning that the user must build TensorFlow Quantum and tensorflow-cpu from source following the instructions in install.md.
        • Ensure that the first entry is "N" in the configure.sh script at this step of building. This ensures that you build upon tensorflow-cpu, as tensorflow-gpu is unnecessary for cuQuantum support in TensorFlow Quantum.
      • The cuQuantum SDK must be installed locally. See installation instructions for details. As part of the installation process, ensure that the CUQUANTUM_ROOT environment variable is set (referred to in the installation instructions). If not set, bazel will attempt to automatically locate the folder containing the cuQuantum installation upon running configure.sh at this step.
      • Quantum concurrency (global context option) should be turned off when use_cuquantum=True. This can be done by running: tfq.python.quantum_context.set_quantum_concurrent_op_mode(False)

Source: https://github.com/Sinestro38/quantum/blob/master/tensorflow_quantum/release.md
cc: @MichaelBroughton @QuantumJaeYoo
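
For readers who want to try the notes above, here is a minimal sketch of how the new flag might be used. It assumes a source build of TensorFlow Quantum from this PR's branch with cuQuantum available; the helper name expectation_z and the one-qubit circuit are illustrative only and are not part of the PR. The use_cuquantum argument and the quantum_context call are taken from the release notes above.

```python
def expectation_z(theta, use_cuquantum=False):
    """Return <Z> for the circuit X**theta on one qubit.

    Hypothetical helper for illustration. Imports are local so that this
    file can be imported on machines without TensorFlow Quantum installed.
    """
    import cirq
    import sympy
    import tensorflow_quantum as tfq

    if use_cuquantum:
        # The release notes say quantum concurrency should be turned off
        # when use_cuquantum=True.
        tfq.python.quantum_context.set_quantum_concurrent_op_mode(False)

    qubit = cirq.GridQubit(0, 0)
    alpha = sympy.Symbol("alpha")
    circuit = cirq.Circuit(cirq.X(qubit) ** alpha)

    # use_cuquantum is the optional argument described in the release
    # notes; it is assumed to exist only in builds made from this branch.
    layer = tfq.layers.Expectation(use_cuquantum=use_cuquantum)
    return layer(
        circuit,
        symbol_names=[alpha],
        symbol_values=[[theta]],
        operators=cirq.Z(qubit),
    )
```

Calling expectation_z(0.5) and expectation_z(0.5, use_cuquantum=True) and comparing the two results is one simple way to check that the CPU and GPU paths agree on a given build.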

@jccalvojackson

jccalvojackson commented May 25, 2023

Hi, I tried all of the above, and when I ran the tests, these failed:

//tensorflow_quantum/core/ops:tfq_adj_grad_op_cuquantum_test             FAILED in 8.9s
  /home/ubuntu/.cache/bazel/_bazel_ubuntu/3a577e6722a9311ddc77692e6a730328/execroot/__main__/bazel-out/k8-opt/testlogs/tensorflow_quantum/core/ops/tfq_adj_grad_op_cuquantum_test/test.log
//tensorflow_quantum/core/ops:tfq_simulate_ops_cuquantum_test            FAILED in 7.6s
  /home/ubuntu/.cache/bazel/_bazel_ubuntu/3a577e6722a9311ddc77692e6a730328/execroot/__main__/bazel-out/k8-opt/testlogs/tensorflow_quantum/core/ops/tfq_simulate_ops_cuquantum_test/test.log
//tensorflow_quantum/python/layers/circuit_executors:sampled_expectation_test FAILED in 59.1s
  /home/ubuntu/.cache/bazel/_bazel_ubuntu/3a577e6722a9311ddc77692e6a730328/execroot/__main__/bazel-out/k8-opt/testlogs/tensorflow_quantum/python/layers/circuit_executors/sampled_expectation_test/test.log
//tensorflow_quantum/python/optimizers:spsa_minimizer_test               FAILED in 24.6s
  /home/ubuntu/.cache/bazel/_bazel_ubuntu/3a577e6722a9311ddc77692e6a730328/execroot/__main__/bazel-out/k8-opt/testlogs/tensorflow_quantum/python/optimizers/spsa_minimizer_test/test.log

The output of the second test log is:

exec ${PAGER:-/usr/bin/less} "$0" || exit 1
Executing tests from //tensorflow_quantum/core/ops:tfq_simulate_ops_cuquantum_test
-----------------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/ubuntu/.cache/bazel/_bazel_ubuntu/3a577e6722a9311ddc77692e6a730328/execroot/__main__/bazel-out/k8-opt/bin/tensorflow_quantum/core/ops/tfq_simulate_ops_cuquantum_test.runfiles/__main__/tensorflow_quantum/core/ops/load_module.py", line 42, in load_module
    return load_library.load_op_library(path)
  File "/home/ubuntu/quantum_nvidia/quantum/quantum_env/lib/python3.8/site-packages/tensorflow/python/framework/load_library.py", line 54, in load_op_library
    lib_handle = py_tf.TF_LoadLibrary(library_filename)
tensorflow.python.framework.errors_impl.NotFoundError: libcublas.so.12: cannot open shared object file: No such file or directory

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/ubuntu/.cache/bazel/_bazel_ubuntu/3a577e6722a9311ddc77692e6a730328/execroot/__main__/bazel-out/k8-opt/bin/tensorflow_quantum/core/ops/tfq_simulate_ops_cuquantum_test.runfiles/__main__/tensorflow_quantum/core/ops/tfq_simulate_ops_cuquantum_test.py", line 23, in <module>
    from tensorflow_quantum.core.ops import tfq_simulate_ops_cuquantum
  File "/home/ubuntu/.cache/bazel/_bazel_ubuntu/3a577e6722a9311ddc77692e6a730328/execroot/__main__/bazel-out/k8-opt/bin/tensorflow_quantum/core/ops/tfq_simulate_ops_cuquantum_test.runfiles/__main__/tensorflow_quantum/core/ops/tfq_simulate_ops_cuquantum.py", line 19, in <module>
    SIM_OP_MODULE = load_module("_tfq_simulate_ops_cuquantum.so")
  File "/home/ubuntu/.cache/bazel/_bazel_ubuntu/3a577e6722a9311ddc77692e6a730328/execroot/__main__/bazel-out/k8-opt/bin/tensorflow_quantum/core/ops/tfq_simulate_ops_cuquantum_test.runfiles/__main__/tensorflow_quantum/core/ops/load_module.py", line 46, in load_module
    return load_library.load_op_library(path)
  File "/home/ubuntu/quantum_nvidia/quantum/quantum_env/lib/python3.8/site-packages/tensorflow/python/framework/load_library.py", line 54, in load_op_library
    lib_handle = py_tf.TF_LoadLibrary(library_filename)
tensorflow.python.framework.errors_impl.NotFoundError: /home/ubuntu/quantum_nvidia/quantum/quantum_env/lib/python3.8/site-packages/tensorflow_quantum/core/ops/_tfq_simulate_ops_cuquantum.so: cannot open shared object file: No such file or directory

@Sinestro38
Contributor

@jccalvojackson

  • Have you added the lib directory of your cuQuantum AND cuBLAS installations to your LD_LIBRARY_PATH environment variable? For example, after setting CUQUANTUM_ROOT env var, I would run:
export LD_LIBRARY_PATH=${CUQUANTUM_ROOT}/lib:${LD_LIBRARY_PATH}
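
A quick way to check the suggestion above is to scan LD_LIBRARY_PATH for the library named in the error. The sketch below uses only the Python standard library; the helper name find_shared_lib is hypothetical, and the file name libcublas.so.12 is taken from the traceback earlier in this thread.

```python
import os
from pathlib import Path


def find_shared_lib(soname, env=None):
    """Return the LD_LIBRARY_PATH entries that contain a file named `soname`."""
    env = os.environ if env is None else env
    hits = []
    for entry in env.get("LD_LIBRARY_PATH", "").split(os.pathsep):
        # Skip empty entries produced by leading or trailing separators.
        if entry and (Path(entry) / soname).exists():
            hits.append(entry)
    return hits


if __name__ == "__main__":
    name = "libcublas.so.12"  # library named in the NotFoundError above
    dirs = find_shared_lib(name)
    print(f"{name}: {dirs if dirs else 'not found on LD_LIBRARY_PATH'}")
```

An empty result does not prove that loading will fail, since the dynamic loader also consults the ldconfig cache and any run paths embedded in the binary. It only shows that LD_LIBRARY_PATH is not what supplies the library.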

@jccalvojackson

jccalvojackson commented May 25, 2023

I followed those, and I didn't get any errors during the build.

when running configure I do get the message

cuQuantum library is detected here: CUQUANTUM_ROOT=<my path>

I tried again and see this error in one of the tests

Traceback (most recent call last):
  File "/home/ubuntu/.cache/bazel/_bazel_ubuntu/3a577e6722a9311ddc77692e6a730328/execroot/__main__/bazel-out/k8-opt/bin/tensorflow_quantum/core/ops/tfq_simulate_ops_cuquantum_test.runfiles/__main__/tensorflow_quantum/core/ops/load_module.py", line 42, in load_module
    return load_library.load_op_library(path)
  File "/home/ubuntu/quantum_nvidia/quantum/quantum_env/lib/python3.8/site-packages/tensorflow/python/framework/load_library.py", line 54, in load_op_library
    lib_handle = py_tf.TF_LoadLibrary(library_filename)
tensorflow.python.framework.errors_impl.NotFoundError: libcublas.so.12: cannot open shared object file: No such file or directory

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/ubuntu/.cache/bazel/_bazel_ubuntu/3a577e6722a9311ddc77692e6a730328/execroot/__main__/bazel-out/k8-opt/bin/tensorflow_quantum/core/ops/tfq_simulate_ops_cuquantum_test.runfiles/__main__/tensorflow_quantum/core/ops/tfq_simulate_ops_cuquantum_test.py", line 23, in <module>
    from tensorflow_quantum.core.ops import tfq_simulate_ops_cuquantum
  File "/home/ubuntu/.cache/bazel/_bazel_ubuntu/3a577e6722a9311ddc77692e6a730328/execroot/__main__/bazel-out/k8-opt/bin/tensorflow_quantum/core/ops/tfq_simulate_ops_cuquantum_test.runfiles/__main__/tensorflow_quantum/core/ops/tfq_simulate_ops_cuquantum.py", line 19, in <module>
    SIM_OP_MODULE = load_module("_tfq_simulate_ops_cuquantum.so")
  File "/home/ubuntu/.cache/bazel/_bazel_ubuntu/3a577e6722a9311ddc77692e6a730328/execroot/__main__/bazel-out/k8-opt/bin/tensorflow_quantum/core/ops/tfq_simulate_ops_cuquantum_test.runfiles/__main__/tensorflow_quantum/core/ops/load_module.py", line 46, in load_module
    return load_library.load_op_library(path)
  File "/home/ubuntu/quantum_nvidia/quantum/quantum_env/lib/python3.8/site-packages/tensorflow/python/framework/load_library.py", line 54, in load_op_library
    lib_handle = py_tf.TF_LoadLibrary(library_filename)
tensorflow.python.framework.errors_impl.NotFoundError: libcublas.so.12: cannot open shared object file: No such file or directory

I see that I have /usr/local/cuda/lib64/libcublas.so.11 but not libcublas.so.12. How do I make it use 11 instead of 12?

when running configure it says

Please specify the CUDA SDK major version you want to use. [Leave empty to default to CUDA 11]:

which I do
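
The mismatch described here (the loader asks for major version 12 while only 11 is installed) can be confirmed mechanically. The following is a small diagnostic sketch using only the standard library; the function names are hypothetical, and it does not change which version gets linked.

```python
import re
from pathlib import Path

# Matches loader errors such as "libcublas.so.12: cannot open shared object file".
_MISSING = re.compile(r"(lib[\w+-]+)\.so\.(\d+): cannot open shared object file")


def missing_soname(message):
    """Extract (library, major_version) from a loader error, or None."""
    match = _MISSING.search(message)
    return (match.group(1), int(match.group(2))) if match else None


def installed_majors(lib, directory):
    """List the major versions of `lib` found as files in `directory`."""
    pattern = re.compile(rf"{re.escape(lib)}\.so\.(\d+)(\.\d+)*")
    majors = set()
    for path in Path(directory).glob(f"{lib}.so.*"):
        match = pattern.fullmatch(path.name)
        if match:
            majors.add(int(match.group(1)))
    return sorted(majors)


if __name__ == "__main__":
    error = ("tensorflow.python.framework.errors_impl.NotFoundError: "
             "libcublas.so.12: cannot open shared object file: "
             "No such file or directory")
    wanted = missing_soname(error)
    print("loader wants:", wanted)
    if wanted:
        print("installed majors:", installed_majors(wanted[0], "/usr/local/cuda/lib64"))
```

If the wanted major version is absent from the installed list, one plausible reading (not confirmed anywhere in this thread) is that some library in the dependency chain was built against a different CUDA major version than the one on the machine.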

@jccalvojackson

Managed to make it work (though it is not using the GPUs significantly), but that's for later. Thank you!

5 participants