
Use of xfail in conftest.py hiding passing conditions #1729

Closed
1 task done
matthewfeickert opened this issue Dec 8, 2021 · 2 comments · Fixed by #1730
Labels: bug (Something isn't working), tests, pytest

Comments


matthewfeickert commented Dec 8, 2021

Summary

In PR #1702 we adopted xfail_strict = true so that pytest flags tests that are expected to fail but actually pass. With the release of jax v0.2.26 (which includes a fix for jax-ml/jax#8513, c.f. PR #817), the following tests

pyhf/tests/test_tensor.py

Lines 379 to 396 in 43c1567

@pytest.mark.fail_jax
def test_percentile(backend):
    tb = pyhf.tensorlib
    a = tb.astensor([[10, 7, 4], [3, 2, 1]])
    assert tb.tolist(tb.percentile(a, 0)) == 1
    assert tb.tolist(tb.percentile(a, 50)) == 3.5
    assert tb.tolist(tb.percentile(a, 100)) == 10
    assert tb.tolist(tb.percentile(a, 50, axis=1)) == [7.0, 2.0]


# FIXME: PyTorch doesn't yet support interpolation schemes other than "linear"
# c.f. https://github.com/pytorch/pytorch/pull/59397
# c.f. https://github.com/scikit-hep/pyhf/issues/1693
@pytest.mark.fail_pytorch
@pytest.mark.fail_pytorch64
@pytest.mark.fail_jax
def test_percentile_interpolation(backend):

should now fail under xfail_strict, since with jax v0.2.26 the test bodies actually pass. However, they don't fail.

This is because in conftest.py we use pytest.xfail

pyhf/tests/conftest.py

Lines 101 to 102 in 43c1567

    if fail_backend:
        pytest.xfail(f"expect {func_name} to fail as specified")

and according to the xfail tutorial

Note that no other code is executed after the pytest.xfail() call, differently from the marker

so the use of pytest.xfail in conftest.py is actually skipping tests instead of running them and checking for failure (which is a surprise to me!). The pytest.xfail docs note that

It is better to use the pytest.mark.xfail marker when possible to declare a test to be xfailed under certain conditions like known bugs or missing features.

Note also that

pyhf/pyproject.toml

Lines 41 to 42 in 43c1567

[tool.pytest]
xfail_strict = true

should actually be under [tool.pytest.ini_options] as noted in the pytest docs (this was done wrong in PR #1702).

OS / Environment

All

Steps to Reproduce

$ git diff
diff --git a/pyproject.toml b/pyproject.toml
index 9c81dc60..6616d586 100644
--- a/pyproject.toml
+++ b/pyproject.toml
@@ -38,11 +38,9 @@ ignore = [
     'AUTHORS',
 ]
 
-[tool.pytest]
-xfail_strict = true
-
 [tool.pytest.ini_options]
 minversion = "6.0"
+xfail_strict = true
 addopts = [
     "--ignore=setup.py",
     "--ignore=validation/",
diff --git a/tests/conftest.py b/tests/conftest.py
index 0e2537b7..e1c3f87e 100644
--- a/tests/conftest.py
+++ b/tests/conftest.py
@@ -99,7 +99,9 @@ def backend(request):
         )
 
     if fail_backend:
+        print(f"expect {func_name} to fail as specified")
         pytest.xfail(f"expect {func_name} to fail as specified")
+        print("this won't ever run")
 
     # actual execution here, after all checks is done
     pyhf.set_backend(*request.param)
diff --git a/tests/test_tensor.py b/tests/test_tensor.py
index 616b6ca5..3b8768e2 100644
--- a/tests/test_tensor.py
+++ b/tests/test_tensor.py
@@ -377,6 +377,7 @@ def test_boolean_mask(backend):
 
 
 @pytest.mark.fail_jax
+@pytest.mark.fail_numpy
 def test_percentile(backend):
     tb = pyhf.tensorlib
     a = tb.astensor([[10, 7, 4], [3, 2, 1]])

test_percentile[numpy] should pass, but use of pytest.xfail skips it:

$ pytest -sx tests/test_tensor.py -k test_percentile[numpy]
=========================================================================================== test session starts ===========================================================================================
platform linux -- Python 3.9.6, pytest-6.2.5, py-1.10.0, pluggy-1.0.0
Matplotlib: 3.5.0
Freetype: 2.6.1
benchmark: 3.4.1 (defaults: timer=time.perf_counter disable_gc=False min_rounds=5 min_time=0.000005 max_time=1.0 calibration_precision=10 warmup=False warmup_iterations=100000)
rootdir: /home/feickert/Code/GitHub/pyhf, configfile: pyproject.toml
plugins: mock-3.6.1, console-scripts-1.2.1, mpl-0.13, requests-mock-1.9.3, benchmark-3.4.1, cov-3.0.0, anyio-3.3.3
collected 222 items / 221 deselected / 1 selected                                                                                                                                                         

tests/test_tensor.py expect test_percentile[numpy] to fail as specified
x

Expected Results

For tests marked with xfail that unexpectedly pass to be reported as failures under xfail_strict:

$ git diff
diff --git a/pyproject.toml b/pyproject.toml
index 9c81dc60..6616d586 100644
--- a/pyproject.toml
+++ b/pyproject.toml
@@ -38,11 +38,9 @@ ignore = [
     'AUTHORS',
 ]
 
-[tool.pytest]
-xfail_strict = true
-
 [tool.pytest.ini_options]
 minversion = "6.0"
+xfail_strict = true
 addopts = [
     "--ignore=setup.py",
     "--ignore=validation/",
diff --git a/tests/test_tensor.py b/tests/test_tensor.py
index 616b6ca5..421015f3 100644
--- a/tests/test_tensor.py
+++ b/tests/test_tensor.py
@@ -377,6 +377,7 @@ def test_boolean_mask(backend):
 
 
 @pytest.mark.fail_jax
+@pytest.mark.xfail()
 def test_percentile(backend):
     tb = pyhf.tensorlib
     a = tb.astensor([[10, 7, 4], [3, 2, 1]])
$ pytest -sx tests/test_tensor.py -k test_percentile[numpy]
=========================================================================================== test session starts ===========================================================================================
platform linux -- Python 3.9.6, pytest-6.2.5, py-1.10.0, pluggy-1.0.0
Matplotlib: 3.5.0
Freetype: 2.6.1
benchmark: 3.4.1 (defaults: timer=time.perf_counter disable_gc=False min_rounds=5 min_time=0.000005 max_time=1.0 calibration_precision=10 warmup=False warmup_iterations=100000)
rootdir: /home/feickert/Code/GitHub/pyhf, configfile: pyproject.toml
plugins: mock-3.6.1, console-scripts-1.2.1, mpl-0.13, requests-mock-1.9.3, benchmark-3.4.1, cov-3.0.0, anyio-3.3.3
collected 222 items / 221 deselected / 1 selected                                                                                                                                                         

tests/test_tensor.py F

================================================================================================ FAILURES =================================================================================================
_________________________________________________________________________________________ test_percentile[numpy] __________________________________________________________________________________________
[XPASS(strict)] 

Actual Results

$ pytest -sx tests/test_tensor.py -k test_percentile[numpy]
=========================================================================================== test session starts ===========================================================================================
platform linux -- Python 3.9.6, pytest-6.2.5, py-1.10.0, pluggy-1.0.0
Matplotlib: 3.5.0
Freetype: 2.6.1
benchmark: 3.4.1 (defaults: timer=time.perf_counter disable_gc=False min_rounds=5 min_time=0.000005 max_time=1.0 calibration_precision=10 warmup=False warmup_iterations=100000)
rootdir: /home/feickert/Code/GitHub/pyhf, configfile: pyproject.toml
plugins: mock-3.6.1, console-scripts-1.2.1, mpl-0.13, requests-mock-1.9.3, benchmark-3.4.1, cov-3.0.0, anyio-3.3.3
collected 222 items / 221 deselected / 1 selected                                                                                                                                                         

tests/test_tensor.py expect test_percentile[numpy] to fail as specified
x

pyhf Version

master at commit 43c1567

Code of Conduct

  • I agree to follow the Code of Conduct
matthewfeickert added the bug, tests, pytest, and needs-triage labels Dec 8, 2021
@matthewfeickert

@kratsg If you can take a look at this with me it would be helpful for making sure the tests are working as expected, especially for the lower bounds constraints tests.

matthewfeickert removed the needs-triage label Dec 8, 2021
@matthewfeickert

Okay, this might actually be pretty straightforward to fix. The following

$ git diff
diff --git a/pyproject.toml b/pyproject.toml
index 9c81dc60..6616d586 100644
--- a/pyproject.toml
+++ b/pyproject.toml
@@ -38,11 +38,9 @@ ignore = [
     'AUTHORS',
 ]
 
-[tool.pytest]
-xfail_strict = true
-
 [tool.pytest.ini_options]
 minversion = "6.0"
+xfail_strict = true
 addopts = [
     "--ignore=setup.py",
     "--ignore=validation/",
diff --git a/tests/conftest.py b/tests/conftest.py
index 0e2537b7..bed0f9ae 100644
--- a/tests/conftest.py
+++ b/tests/conftest.py
@@ -99,7 +99,10 @@ def backend(request):
         )
 
     if fail_backend:
-        pytest.xfail(f"expect {func_name} to fail as specified")
+        request.node.add_marker(
+            pytest.mark.xfail(reason=f"expect {func_name} to fail as specified")
+        )
+        # print(request.node.own_markers)  # This is for debug purposes only
 
     # actual execution here, after all checks is done
     pyhf.set_backend(*request.param)

works as expected for the following scenarios:

  • First install versions of jax and jaxlib that will fail the test
$ cat /tmp/old-jax-requirements.txt 
jax==0.2.25
jaxlib==0.1.74
$ python -m pip install -r /tmp/old-jax-requirements.txt
  • Now run pytest and verify that the tests fail so that they are properly caught by the xfail we added and get reported as x in the summary. This works
$ pytest -sx tests/test_tensor.py -k test_percentile  # This should pass as jax v0.2.25 should fail
$ pytest -sx tests/test_tensor.py -k test_percentile
=========================================================================================== test session starts ===========================================================================================
platform linux -- Python 3.9.6, pytest-6.2.5, py-1.10.0, pluggy-1.0.0
Matplotlib: 3.5.0
Freetype: 2.6.1
benchmark: 3.4.1 (defaults: timer=time.perf_counter disable_gc=False min_rounds=5 min_time=0.000005 max_time=1.0 calibration_precision=10 warmup=False warmup_iterations=100000)
rootdir: /home/feickert/Code/GitHub/pyhf, configfile: pyproject.toml
plugins: mock-3.6.1, console-scripts-1.2.1, mpl-0.13, requests-mock-1.9.3, benchmark-3.4.1, cov-3.0.0, anyio-3.3.3
collected 222 items / 198 deselected / 24 selected                                                                                                                                                        

tests/test_tensor.py ....x..xx.x.ssss.sssss.s
  • Now install versions of jax and jaxlib that will pass the test
$ cat /tmp/new-jax-requirements.txt 
jax==0.2.26
jaxlib==0.1.75
$ python -m pip install -r /tmp/new-jax-requirements.txt
  • Now run pytest. As jax v0.2.26 is now able to pass the test, the xfail marker that we added catches that the test didn't fail and raises an error, as we have xfail_strict. Everything works as intended now. 👍
$ pytest -sx tests/test_tensor.py -k test_percentile
=========================================================================================== test session starts ===========================================================================================
platform linux -- Python 3.9.6, pytest-6.2.5, py-1.10.0, pluggy-1.0.0
Matplotlib: 3.5.0
Freetype: 2.6.1
benchmark: 3.4.1 (defaults: timer=time.perf_counter disable_gc=False min_rounds=5 min_time=0.000005 max_time=1.0 calibration_precision=10 warmup=False warmup_iterations=100000)
rootdir: /home/feickert/Code/GitHub/pyhf, configfile: pyproject.toml
plugins: mock-3.6.1, console-scripts-1.2.1, mpl-0.13, requests-mock-1.9.3, benchmark-3.4.1, cov-3.0.0, anyio-3.3.3
collected 222 items / 198 deselected / 24 selected                                                                                                                                                        

tests/test_tensor.py ....F

================================================================================================ FAILURES =================================================================================================
__________________________________________________________________________________________ test_percentile[jax] ___________________________________________________________________________________________
[XPASS(strict)] expect test_percentile[jax] to fail as specified
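The mechanism behind the fix can be sketched generically (the fixture and the fail_demo marker name below are illustrative, not pyhf's actual ones): instead of calling pytest.xfail() imperatively, the fixture attaches an xfail marker to the requesting test node, so the test body still runs and xfail_strict can catch an unexpected pass.

```python
import pytest


@pytest.fixture
def backend(request):
    # If the requesting test carries a (hypothetical) "fail_demo" marker,
    # attach an xfail marker dynamically instead of calling pytest.xfail().
    # The test body then still executes, and under xfail_strict an
    # unexpected pass is reported as XPASS(strict).
    if request.node.get_closest_marker("fail_demo"):
        request.node.add_marker(
            pytest.mark.xfail(reason="expected to fail as specified")
        )
    yield
```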
