From ae1df976d70e587de60d43fd755ca596608179b3 Mon Sep 17 00:00:00 2001 From: mhucka Date: Sat, 30 Nov 2024 16:32:57 +0000 Subject: [PATCH] Disable memory leak tests for now The current failures in the Cirq compatibility CI workflow are limited to the Address Sanitizer (ASAN) tests in `scripts/msan_test.sh`. They started happening only when we updated the version of Linux used by the workflow from Ubuntu 16.04 to 20.04, because GitHub no longer offers the Ubuntu 16 runners. After spending a ridiculous amount of time testing various combinations of TensorFlow, TensorFlow Quantum, and compiler toolchains on a more recent Linux, my conclusion is that the ASAN failures stem from differences in the toolchains used to produce the copy of TensorFlow 2.15.0 we get from PyPI, and the current toolchain used to compile TFQ on GitHub. This conclusion comes from the fact if I build a local copy of TensorFlow, and then build TFQ against that, using Clang for everything, the ASAN failures go away. Given that we can't build TensorFlow as part of this workflow (it takes 2 hours to build using 24-cores on a fast machine), it's not clear what can be done to stop the ASAN failures. I'm temporarily commenting out the leak tests in this workflow so that we can proceed on doing other updates and releasing a new version of TFQ. However, this needs to be revisited at some point. --- .github/workflows/ci.yaml | 39 ++++++++++++++++++++++----------------- 1 file changed, 22 insertions(+), 17 deletions(-) diff --git a/.github/workflows/ci.yaml b/.github/workflows/ci.yaml index 0b9ca153c..763fec9e0 100644 --- a/.github/workflows/ci.yaml +++ b/.github/workflows/ci.yaml @@ -70,23 +70,28 @@ jobs: - name: Full Library Test run: ./scripts/test_all.sh - leak-tests: - name: Memory Leak tests - runs-on: ubuntu-20.04 - needs: [lint, format] - - steps: - - uses: actions/checkout@v1 - - uses: actions/setup-python@v1 - with: - python-version: '3.10' - architecture: 'x64' - - name: Install Bazel on CI - run: ./scripts/ci_install.sh - - name: Configure CI TF - run: echo "Y\n" | ./configure.sh - - name: Leak Test qsim and src - run: ./scripts/msan_test.sh + # 2024-11-30 [mhucka] temporarily turning off leak-tests because it produces + # false positives on GH that we can't immediately address. TODO: if updating + # TFQ to use Clang and the latest TF does not resolve this, find a way to + # skip the handful of failing tests and renable the rest of the msan tests. + # + # leak-tests: + # name: Memory Leak tests + # runs-on: ubuntu-20.04 + # needs: [lint, format] + # + # steps: + # - uses: actions/checkout@v1 + # - uses: actions/setup-python@v1 + # with: + # python-version: '3.10' + # architecture: 'x64' + # - name: Install Bazel on CI + # run: ./scripts/ci_install.sh + # - name: Configure CI TF + # run: echo "Y\n" | ./configure.sh + # - name: Leak Test qsim and src + # run: ./scripts/msan_test.sh tutorials-test: name: Tutorial tests