
Use of xfail in conftest.py hiding passing conditions #1729

Closed
1 task done
matthewfeickert opened this issue Dec 8, 2021 · 2 comments · Fixed by #1730
Labels: bug (Something isn't working), tests, pytest

Comments


matthewfeickert commented Dec 8, 2021

Summary

In PR #1702 we adopted xfail_strict = true so that pytest flags tests that are expected to fail but actually pass. With the release of jax v0.2.26 (which includes a fix for jax-ml/jax#8513, c.f. PR #817), the following tests

pyhf/tests/test_tensor.py

Lines 379 to 396 in 43c1567

@pytest.mark.fail_jax
def test_percentile(backend):
    tb = pyhf.tensorlib
    a = tb.astensor([[10, 7, 4], [3, 2, 1]])
    assert tb.tolist(tb.percentile(a, 0)) == 1
    assert tb.tolist(tb.percentile(a, 50)) == 3.5
    assert tb.tolist(tb.percentile(a, 100)) == 10
    assert tb.tolist(tb.percentile(a, 50, axis=1)) == [7.0, 2.0]


# FIXME: PyTorch doesn't yet support interpolation schemes other than "linear"
# c.f. https://github.com/pytorch/pytorch/pull/59397
# c.f. https://github.com/scikit-hep/pyhf/issues/1693
@pytest.mark.fail_pytorch
@pytest.mark.fail_pytorch64
@pytest.mark.fail_jax
def test_percentile_interpolation(backend):

should now fail under xfail_strict, since with jax v0.2.26 the test bodies actually pass. However, they don't fail.

This is because in conftest.py we use pytest.xfail

pyhf/tests/conftest.py

Lines 101 to 102 in 43c1567

    if fail_backend:
        pytest.xfail(f"expect {func_name} to fail as specified")

and according to the xfail tutorial

Note that no other code is executed after the pytest.xfail() call, differently from the marker

so the use of pytest.xfail in conftest.py is actually skipping tests instead of running them and checking for failure (which is a surprise to me!). The pytest.xfail docs note that

It is better to use the pytest.mark.xfail marker when possible to declare a test to be xfailed under certain conditions like known bugs or missing features.

Note also that

pyhf/pyproject.toml

Lines 41 to 42 in 43c1567

[tool.pytest]
xfail_strict = true

should actually be under [tool.pytest.ini_options] as noted in the pytest docs (this was done wrong in PR #1702).

OS / Environment

All

Steps to Reproduce

$ git diff
diff --git a/pyproject.toml b/pyproject.toml
index 9c81dc60..6616d586 100644
--- a/pyproject.toml
+++ b/pyproject.toml
@@ -38,11 +38,9 @@ ignore = [
     'AUTHORS',
 ]
 
-[tool.pytest]
-xfail_strict = true
-
 [tool.pytest.ini_options]
 minversion = "6.0"
+xfail_strict = true
 addopts = [
     "--ignore=setup.py",
     "--ignore=validation/",
diff --git a/tests/conftest.py b/tests/conftest.py
index 0e2537b7..e1c3f87e 100644
--- a/tests/conftest.py
+++ b/tests/conftest.py
@@ -99,7 +99,9 @@ def backend(request):
         )
 
     if fail_backend:
+        print(f"expect {func_name} to fail as specified")
         pytest.xfail(f"expect {func_name} to fail as specified")
+        print("this won't ever run")
 
     # actual execution here, after all checks is done
     pyhf.set_backend(*request.param)
diff --git a/tests/test_tensor.py b/tests/test_tensor.py
index 616b6ca5..3b8768e2 100644
--- a/tests/test_tensor.py
+++ b/tests/test_tensor.py
@@ -377,6 +377,7 @@ def test_boolean_mask(backend):
 
 
 @pytest.mark.fail_jax
+@pytest.mark.fail_numpy
 def test_percentile(backend):
     tb = pyhf.tensorlib
     a = tb.astensor([[10, 7, 4], [3, 2, 1]])

test_percentile[numpy] should pass, but use of pytest.xfail skips it:

$ pytest -sx tests/test_tensor.py -k test_percentile[numpy]
=========================================================================================== test session starts ===========================================================================================
platform linux -- Python 3.9.6, pytest-6.2.5, py-1.10.0, pluggy-1.0.0
Matplotlib: 3.5.0
Freetype: 2.6.1
benchmark: 3.4.1 (defaults: timer=time.perf_counter disable_gc=False min_rounds=5 min_time=0.000005 max_time=1.0 calibration_precision=10 warmup=False warmup_iterations=100000)
rootdir: /home/feickert/Code/GitHub/pyhf, configfile: pyproject.toml
plugins: mock-3.6.1, console-scripts-1.2.1, mpl-0.13, requests-mock-1.9.3, benchmark-3.4.1, cov-3.0.0, anyio-3.3.3
collected 222 items / 221 deselected / 1 selected                                                                                                                                                         

tests/test_tensor.py expect test_percentile[numpy] to fail as specified
x

Expected Results

For tests marked with xfail that unexpectedly pass to be reported as failures under xfail_strict:

$ git diff
diff --git a/pyproject.toml b/pyproject.toml
index 9c81dc60..6616d586 100644
--- a/pyproject.toml
+++ b/pyproject.toml
@@ -38,11 +38,9 @@ ignore = [
     'AUTHORS',
 ]
 
-[tool.pytest]
-xfail_strict = true
-
 [tool.pytest.ini_options]
 minversion = "6.0"
+xfail_strict = true
 addopts = [
     "--ignore=setup.py",
     "--ignore=validation/",
diff --git a/tests/test_tensor.py b/tests/test_tensor.py
index 616b6ca5..421015f3 100644
--- a/tests/test_tensor.py
+++ b/tests/test_tensor.py
@@ -377,6 +377,7 @@ def test_boolean_mask(backend):
 
 
 @pytest.mark.fail_jax
+@pytest.mark.xfail()
 def test_percentile(backend):
     tb = pyhf.tensorlib
     a = tb.astensor([[10, 7, 4], [3, 2, 1]])
$ pytest -sx tests/test_tensor.py -k test_percentile[numpy]
=========================================================================================== test session starts ===========================================================================================
platform linux -- Python 3.9.6, pytest-6.2.5, py-1.10.0, pluggy-1.0.0
Matplotlib: 3.5.0
Freetype: 2.6.1
benchmark: 3.4.1 (defaults: timer=time.perf_counter disable_gc=False min_rounds=5 min_time=0.000005 max_time=1.0 calibration_precision=10 warmup=False warmup_iterations=100000)
rootdir: /home/feickert/Code/GitHub/pyhf, configfile: pyproject.toml
plugins: mock-3.6.1, console-scripts-1.2.1, mpl-0.13, requests-mock-1.9.3, benchmark-3.4.1, cov-3.0.0, anyio-3.3.3
collected 222 items / 221 deselected / 1 selected                                                                                                                                                         

tests/test_tensor.py F

================================================================================================ FAILURES =================================================================================================
_________________________________________________________________________________________ test_percentile[numpy] __________________________________________________________________________________________
[XPASS(strict)] 

Actual Results

$ pytest -sx tests/test_tensor.py -k test_percentile[numpy]
=========================================================================================== test session starts ===========================================================================================
platform linux -- Python 3.9.6, pytest-6.2.5, py-1.10.0, pluggy-1.0.0
Matplotlib: 3.5.0
Freetype: 2.6.1
benchmark: 3.4.1 (defaults: timer=time.perf_counter disable_gc=False min_rounds=5 min_time=0.000005 max_time=1.0 calibration_precision=10 warmup=False warmup_iterations=100000)
rootdir: /home/feickert/Code/GitHub/pyhf, configfile: pyproject.toml
plugins: mock-3.6.1, console-scripts-1.2.1, mpl-0.13, requests-mock-1.9.3, benchmark-3.4.1, cov-3.0.0, anyio-3.3.3
collected 222 items / 221 deselected / 1 selected                                                                                                                                                         

tests/test_tensor.py expect test_percentile[numpy] to fail as specified
x

pyhf Version

master at commit 43c1567

Code of Conduct

  • I agree to follow the Code of Conduct
matthewfeickert added the bug, tests, pytest, and needs-triage labels Dec 8, 2021
@matthewfeickert

@kratsg If you can take a look at this with me it would be helpful for making sure the tests are working as expected, especially for the lower bounds constraints tests.

matthewfeickert removed the needs-triage label Dec 8, 2021
@matthewfeickert

Okay, this might actually be pretty straightforward to fix. The following

$ git diff
diff --git a/pyproject.toml b/pyproject.toml
index 9c81dc60..6616d586 100644
--- a/pyproject.toml
+++ b/pyproject.toml
@@ -38,11 +38,9 @@ ignore = [
     'AUTHORS',
 ]
 
-[tool.pytest]
-xfail_strict = true
-
 [tool.pytest.ini_options]
 minversion = "6.0"
+xfail_strict = true
 addopts = [
     "--ignore=setup.py",
     "--ignore=validation/",
diff --git a/tests/conftest.py b/tests/conftest.py
index 0e2537b7..bed0f9ae 100644
--- a/tests/conftest.py
+++ b/tests/conftest.py
@@ -99,7 +99,10 @@ def backend(request):
         )
 
     if fail_backend:
-        pytest.xfail(f"expect {func_name} to fail as specified")
+        request.node.add_marker(
+            pytest.mark.xfail(reason=f"expect {func_name} to fail as specified")
+        )
+        # print(request.node.own_markers)  # This is for debug purposes only
 
     # actual execution here, after all checks is done
     pyhf.set_backend(*request.param)

works as expected for the following scenarios:

  • First install versions of jax and jaxlib that will fail the test
$ cat /tmp/old-jax-requirements.txt 
jax==0.2.25
jaxlib==0.1.74
$ python -m pip install -r /tmp/old-jax-requirements.txt
  • Now run pytest and verify that the tests fail so that they are properly caught by the xfail we added and get reported as x in the summary. This works
$ pytest -sx tests/test_tensor.py -k test_percentile  # This should pass as jax v0.2.25 should fail
$ pytest -sx tests/test_tensor.py -k test_percentile
=========================================================================================== test session starts ===========================================================================================
platform linux -- Python 3.9.6, pytest-6.2.5, py-1.10.0, pluggy-1.0.0
Matplotlib: 3.5.0
Freetype: 2.6.1
benchmark: 3.4.1 (defaults: timer=time.perf_counter disable_gc=False min_rounds=5 min_time=0.000005 max_time=1.0 calibration_precision=10 warmup=False warmup_iterations=100000)
rootdir: /home/feickert/Code/GitHub/pyhf, configfile: pyproject.toml
plugins: mock-3.6.1, console-scripts-1.2.1, mpl-0.13, requests-mock-1.9.3, benchmark-3.4.1, cov-3.0.0, anyio-3.3.3
collected 222 items / 198 deselected / 24 selected                                                                                                                                                        

tests/test_tensor.py ....x..xx.x.ssss.sssss.s
  • Now install versions of jax and jaxlib that will pass the test
$ cat /tmp/new-jax-requirements.txt 
jax==0.2.26
jaxlib==0.1.75
$ python -m pip install -r /tmp/new-jax-requirements.txt
  • Now run pytest. As jax v0.2.26 is now able to pass the test, the xfail marker that we added catches that the test didn't fail and raises an error, as we have xfail_strict. Everything works as intended now. 👍
$ pytest -sx tests/test_tensor.py -k test_percentile
=========================================================================================== test session starts ===========================================================================================
platform linux -- Python 3.9.6, pytest-6.2.5, py-1.10.0, pluggy-1.0.0
Matplotlib: 3.5.0
Freetype: 2.6.1
benchmark: 3.4.1 (defaults: timer=time.perf_counter disable_gc=False min_rounds=5 min_time=0.000005 max_time=1.0 calibration_precision=10 warmup=False warmup_iterations=100000)
rootdir: /home/feickert/Code/GitHub/pyhf, configfile: pyproject.toml
plugins: mock-3.6.1, console-scripts-1.2.1, mpl-0.13, requests-mock-1.9.3, benchmark-3.4.1, cov-3.0.0, anyio-3.3.3
collected 222 items / 198 deselected / 24 selected                                                                                                                                                        

tests/test_tensor.py ....F

================================================================================================ FAILURES =================================================================================================
__________________________________________________________________________________________ test_percentile[jax] ___________________________________________________________________________________________
[XPASS(strict)] expect test_percentile[jax] to fail as specified
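The mechanism behind the fix can be sketched generically (the fixture and the fail_demo marker name below are illustrative, not pyhf's actual ones): instead of calling pytest.xfail() imperatively, the fixture attaches an xfail marker to the requesting test node, so the test body still runs and xfail_strict can catch an unexpected pass.

```python
import pytest


@pytest.fixture
def backend(request):
    # If the requesting test carries a (hypothetical) "fail_demo" marker,
    # attach an xfail marker dynamically instead of calling pytest.xfail().
    # The test body then still executes, and under xfail_strict an
    # unexpected pass is reported as XPASS(strict).
    if request.node.get_closest_marker("fail_demo"):
        request.node.add_marker(
            pytest.mark.xfail(reason="expected to fail as specified")
        )
    yield
```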
