Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[UnitTests] Automatic parametrization over targets, with explicit opt-out #8010

Merged
merged 16 commits into from
Jun 24, 2021

Conversation

Lunderberg
Copy link
Contributor

@Lunderberg Lunderberg commented May 10, 2021

Implemented features for the python tests to automatically parametrize over enabled targets, and to explicitly list the targets that were skipped. PR includes testing framework changes, along with changes to a single test file (test_topi_relu.py) as a proof of concept.

Link to RFC Link to RFC on tvm-rfcs, documenting differences in the testing style, advantages of the proposed style, and changes needed to use the new style.

@Lunderberg Lunderberg force-pushed the unittests_explicit_skip branch from 57b495d to 89a3aaf Compare May 11, 2021 19:19
@Lunderberg Lunderberg force-pushed the unittests_explicit_skip branch 3 times, most recently from ab4261e to 3e7fcb6 Compare May 25, 2021 22:02
Copy link
Contributor

@tkonolige tkonolige left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @Lunderberg for this great PR! I think one major thing that would really improve the error handling situation would be a document that describes how to use the testing infrastructure. Something that we could point developers to when we are receiving PRs.

logging.warning(
"None of the following targets are supported by this build of TVM: %s."
" Try setting TVM_TEST_TARGETS to a supported target. Defaulting to llvm.",
target_str,
)
return {"llvm"}
return _get_targets("llvm")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this loop forever if llvm is not enabled?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch, it would. Updating to check tvm.runtime.enabled('llvm'). If enabled, maintain current behavior. Otherwise, raise an exception.

xfail_targets = set()

target_marks = []
for t in _get_targets():
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doesn't _get_targets filter out all non-unable targets? So we are not including unrunable targets here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The updated implementation of _get_targets returns all targets without filtering, but marks them as running or un-runnable. This allows enabled_targets() to maintain its current behavior of filtering out un-runnable targets, while the _pytest_target_params can return all targets, but marked with pytest.skipif to indicate which ones cannot run on the current platform.

Use this decorator when you want your test to be run over a
variety of targets and devices (including cpu and gpu devices).

Alternatively, a test that accepts the "target" and "dev" will
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe specify that you want to use parameterize_targets when you have a specific set of targets you want to run over. Otherwise users should not use the decorator. Also mention that exclude_targets may be a better option.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point, edited documentation to reflect new intended usage, and to recommend that exclude_targets or known_failing_targets should typically be used instead.

metafunc.parametrize(names, value_sets, indirect=True)


def fixture(func=None, *, cache_return_value=False):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you allowed to have an optional parameter before regular arguments? I think lint will not be happy with this one.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In python2 it would be an error, but in python3 it is allowed, and passes the linter both locally and on the CI. I did this intentionally so that cache_return_value would be a keyword-only argument. My goal is to make it as obvious as possible at the fixture-definition site whether a fixture is going to be cached or not. Mandating fixture(cache_return_value=True) makes that obvious, where fixture(True) may not be.

Comment on lines 727 to 831
>>> @tvm.testing.parametrize("llvm", "cuda")
>>> @tvm.testing.parametrize_targets
>>> def test_mytest(target, dev):
>>> ... # do something
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd just remove this example as we want people to only use the decorator with arguments.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds reasonable, and removed.

@Lunderberg Lunderberg force-pushed the unittests_explicit_skip branch from 19eb0e6 to 692c50b Compare June 1, 2021 15:20
@Lunderberg
Copy link
Contributor Author

Rebased on main to start CI again, now that the CI fix #8160 is in.

Copy link
Contributor

@tkonolige tkonolige left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for all the hard work @Lunderberg!

@Lunderberg
Copy link
Contributor Author

Added one more bugfix. First implementation of removing fixture functions from module scope was a bit overzealous, also removed any objects that implement __getattr__, such as caffe.layers.

Copy link
Contributor

@jwfromm jwfromm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks for these changes @Lunderberg. My only concern is that you've introduced a lot of new features but not all of them are used yet in the codebase. Do you think its worth adding a meta-test file to make sure things like known_failing_targets work as expected?

@Lunderberg
Copy link
Contributor Author

@jwfromm That's a good point. I had initially thought that there were few enough features that they could be implicitly tested by their use in other tests, but with the additional features that I added following discussion, it would be good to have dedicated tests for the testing features. I will add them.

Copy link
Contributor

@tmoreau89 tmoreau89 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great PR - LGTM! One note is that the error message link won't work until apache/tvm-rfcs#7 gets merged, but that should happen fairly soonish, so I don't see this as blocking. Given that only the relu tests is being ported to the parameterized tests, this is a low-risk merge.

@Lunderberg
Copy link
Contributor Author

@jwfromm And added meta-tests for all the new functionality.

@tmoreau89 Good point, that was an intentional choice to point to the main branch of tvm-rfcs. I figured that since the main discussion was on the intended behavior, it would be likely that the two would be accepted or rejected together.

Thank you both for the reviews, and I think the only thing remaining is the CI.

# Optional cls parameter in case a parameter is defined inside a
# class scope.
@pytest.fixture(params=values, ids=ids)
def as_fixture(*cls, request):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Lunderberg looks like the linter was not too happy about the unused argument here, that's the only thing blocking CI

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Whoops, modified to _cls which passes the linter when running locally.


# Optional cls parameter in case a parameter is defined inside a
# class scope.
def fixture_func(*cls, request):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same here

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And same change made here.

@Lunderberg Lunderberg force-pushed the unittests_explicit_skip branch from acfef75 to 68947c8 Compare June 23, 2021 03:45
…TS but were skipped

Previously, these were removed by a filter in
tvm.testing._get_targets(), and weren't listed at all.  With this
change, they are instead removed by pytest.skipif, and show up as
explicitly skipped tests in pytest's summary when using
tvm.testing.parametrize_targets.
…dev)

Should make it easier to convert tests from using
tvm.testing.enabled_targets to use pytest's parametrized tests
instead.
…cular test

Uses tvm_exclude_targets variable, which can be set (1) in the
conftest.py to apply to a test directory, (2) in a test script to
apply to that module, or (3) on an individual test function to apply
to it.  The @tvm.testing.exclude_targets decorator is provided for
readability in case apache#3.
Intended to mark tests that fail for a particular target, and are
intended to be fixed in the future.  Typically, these would result
either from implementing a new test, or from an in-progress
implementation of a new target.
These were implemented to exclude or mark as failing an entire file or
directory of tests.  In
https://discuss.tvm.apache.org/t/rfc-parametrized-unit-tests/9946/4,
it was pointed out that the global variables would be vulnerable to
typos in the names, resulting in the option being silently ignored.
The decorators `@tvm.testing.exclude_targets` and
`@tvm.testing.known_failing_targets` do not have this failure mode,
and are the preferred version.
- tvm.testing.parameter() defines a parameter that can be passed to
  tests.  Tests that accept more than one parameter are run for all
  combinations of parameter values.

- tvm.testing.parameters() defines multiple sets of parameter values.
  Tests that accept more than one parameter are run once for each set
  of parameter values.

- tvm.testing.fixture() is a decorator that defines setup code.  The
  `cache=True` argument can be passed to avoid repeating expensive
  setup across multiple tests.
Previously, if the @parametrize_targets were present, but had other
@pytest.mark.parametrize after it, "target" would get parametrized a
second time.  Now, it checks more than just the closest "parametrize"
marker.
As recommended by @tkonolige:

- Avoid infinite loop if LLVM target isn't enabled

- Update documentation for preferred use cases of
  tvm.testing.parametrize_targets, and recommended alternatives.
- Documentation, removed previous example usage of tvm.testing.parametrize_targets
- Previously, a fixture function defined in a module was accessible
  through the global scope, and the function definition is accessible
  if a test function uses that name but fails to declare the fixture
  as a parameter.  Now, it will result in a NameError instead.
…bal scope.

- Initial implementation only checked hasattr(obj, "_pytestfixturefunction")
  before removing obj, which gave false positives for objects that implement
  __getattr__, such as caffe.layers.  Now, also check that the value
  contained is a FixtureFunctionMarker.
…eturn_value=True)

To avoid unit tests being able to influence each other through a
shared cache, all cached fixtures are passed through copy.deepcopy
prior to use.
@Lunderberg Lunderberg force-pushed the unittests_explicit_skip branch from 68947c8 to 0f17d65 Compare June 23, 2021 16:33
Copy link
Contributor

@jwfromm jwfromm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for adding the tests, this is an excellent PR.

@tmoreau89 tmoreau89 merged commit 07701f2 into apache:main Jun 24, 2021
@tmoreau89
Copy link
Contributor

Thank you @tkonolige @jwfromm @Lunderberg the PR is now merged!

@Lunderberg Lunderberg deleted the unittests_explicit_skip branch June 24, 2021 20:09
try:
cached_value = cache[cache_key]
except KeyError:
cached_value = cache[cache_key] = func(*args, **kwargs)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is the exception case tested here?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

e.g. what happens if func itself raises another exception?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the fixture definition func raises an exception, then the exception gets passed on to pytest, and it gets treated as a failure to generate the fixture. These still result in the test failing, but are recorded as a failed setup. The test itself is never run in that case. This behavior is pytest's default, and is the same in both the cached and uncached versions of tvm.testing.fixture.

I don't have a unit test yet to verify this behavior, but I'll add one.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unit test added in #8343

# numpy array as input, then calculates uses a slow method to
# compute a known correct output for that input. Therefore,
# including a fallback for serializable types.
def get_cache_key(*args, **kwargs):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this guaranteed to be deterministic? pickle.dumps and maybe hash with stuff like dicts might not be...though maybe the dict thing is fixed now?

in any case, tbh i think this is pretty complicated for a cache key function. since we are trying to use this with parameterizable test cases, can't we just whitelist types that have an obvious, stable conversion to a cache key, and then error on the rest? i am not going to ever run python tests/python/unittest/test_bar.py --param=<pickled data>.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For hash, it is guaranteed to be deterministic, but pickle.dumps is not. For numpy arrays, pickle.dumps is, but that isn't guaranteed across all types.

The difficult part here is that the caching should work for fixtures that are based on other fixtures. For example, consider the following case. If we want to cache correct_output, then the cache needs to be based on the input_data argument. I agree that I don't think anybody will ever input pickled data from the command line, but this pattern of comparing to the correct output feels like it would be pretty common.

arr_size = tvm.testing.parameter(1, 16, 256)

@tvm.testing.fixture
def input_data(arr_size):
    return np.random.uniform(size=arr_size)

@tvm.testing.fixture
def correct_output(input_data):
    run_very_slow_method(input_data)

def test_func(target, dev, input_data, correct_output):
    output = func(target, dev, input_data)
    tvm.testing.assert_allclose(target, dev, correct_output)

The other scheme I considered was to look up which parameters were indirectly involved in computing a particular fixture and caching based on that parameter or parameters. In this case, correct_output is indirectly based on arr_size. However, that would have introduced a potential failure mode if correct_output is cached but input_data is not. In that case, the second target to use arr_size==1 would look up the cached version correct_output associated with arr_size==1, but would generate a new random value for input_data. This felt like a worse failure mode than the current one of repeating the fixture setup, which is why I used pickle.dumps as the fallback.

ylc pushed a commit to ylc/tvm that referenced this pull request Sep 29, 2021
…-out (apache#8010)

* [UnitTests] Explicitly list tests that were enabled by TVM_TEST_TARGETS but were skipped

Previously, these were removed by a filter in
tvm.testing._get_targets(), and weren't listed at all.  With this
change, they are instead removed by pytest.skipif, and show up as
explicitly skipped tests in pytest's summary when using
tvm.testing.parametrize_targets.

* [UnitTests] Automatic parametrize_targets for tests that use (target,dev)

Should make it easier to convert tests from using
tvm.testing.enabled_targets to use pytest's parametrized tests
instead.

* [UnitTests] Added ability to explicitly exclude a target from a particular test

Uses tvm_exclude_targets variable, which can be set (1) in the
conftest.py to apply to a test directory, (2) in a test script to
apply to that module, or (3) on an individual test function to apply
to it.  The @tvm.testing.exclude_targets decorator is provided for
readability in case apache#3.

* [UnitTests] Refactored test_topi_relu.py to use pytest.mark.parametrize

* [UnitTests] Added tvm_known_failing_targets option for the unittests.

Intended to mark tests that fail for a particular target, and are
intended to be fixed in the future.  Typically, these would result
either from implementing a new test, or from an in-progress
implementation of a new target.

* [UnitTests] Known failing targets now marked with xfail instead of skipif

* [UnitTests] Removed tvm_excluded_targets and tvm_known_failing_targets

These were implemented to exclude or mark as failing an entire file or
directory of tests.  In
https://discuss.tvm.apache.org/t/rfc-parametrized-unit-tests/9946/4,
it was pointed out that the global variables would be vulnerable to
typos in the names, resulting in the option being silently ignored.
The decorators `@tvm.testing.exclude_targets` and
`@tvm.testing.known_failing_targets` do not have this failure mode,
and are the preferred version.

* [UnitTests] Added helper functions to tvm.testing.

- tvm.testing.parameter() defines a parameter that can be passed to
  tests.  Tests that accept more than one parameter are run for all
  combinations of parameter values.

- tvm.testing.parameters() defines multiple sets of parameter values.
  Tests that accept more than one parameter are run once for each set
  of parameter values.

- tvm.testing.fixture() is a decorator that defines setup code.  The
  `cache=True` argument can be passed to avoid repeating expensive
  setup across multiple tests.

* [UnitTests] Bugfix for auto parametrizing of "target"

Previously, if the @parametrize_targets were present, but had other
@pytest.mark.parametrize after it, "target" would get parametrized a
second time.  Now, it checks more than just the closest "parametrize"
marker.

* [UnitTests] Renamed "cache" argument of tvm.testing.fixture to "cache_return_value"

* [UnitTests] Minor updates to parametrized test implementation.

As recommended by @tkonolige:

- Avoid infinite loop if LLVM target isn't enabled

- Update documentation for preferred use cases of
  tvm.testing.parametrize_targets, and recommended alternatives.

* [UnitTests] Minor updates to parametrized test implementation

- Documentation, removed previous example usage of tvm.testing.parametrize_targets

* [UnitTests] Changed accidental use of pytest fixtures to a NameError.

- Previously, a fixture function defined in a module was accessible
  through the global scope, and the function definition is accessible
  if a test function uses that name but fails to declare the fixture
  as a parameter.  Now, it will result in a NameError instead.

* [UnitTests] More careful removal of fixture functions from module global scope.

- Initial implementation only checked hasattr(obj, "_pytestfixturefunction")
  before removing obj, which gave false positives for objects that implement
  __getattr__, such as caffe.layers.  Now, also check that the value
  contained is a FixtureFunctionMarker.

* [UnitTests] Copy cached values when using tvm.testing.fixture(cache_return_value=True)

To avoid unit tests being able to influence each other through a
shared cache, all cached fixtures are passed through copy.deepcopy
prior to use.

* [UnitTests] Added meta-tests for tvm.testing functionality

Co-authored-by: Eric Lunderberg <[email protected]>
zxy844288792 pushed a commit to zxy844288792/tvm that referenced this pull request Mar 4, 2022
…-out (apache#8010)

* [UnitTests] Explicitly list tests that were enabled by TVM_TEST_TARGETS but were skipped

Previously, these were removed by a filter in
tvm.testing._get_targets(), and weren't listed at all.  With this
change, they are instead removed by pytest.skipif, and show up as
explicitly skipped tests in pytest's summary when using
tvm.testing.parametrize_targets.

* [UnitTests] Automatic parametrize_targets for tests that use (target,dev)

Should make it easier to convert tests from using
tvm.testing.enabled_targets to use pytest's parametrized tests
instead.

* [UnitTests] Added ability to explicitly exclude a target from a particular test

Uses tvm_exclude_targets variable, which can be set (1) in the
conftest.py to apply to a test directory, (2) in a test script to
apply to that module, or (3) on an individual test function to apply
to it.  The @tvm.testing.exclude_targets decorator is provided for
readability in case apache#3.

* [UnitTests] Refactored test_topi_relu.py to use pytest.mark.parametrize

* [UnitTests] Added tvm_known_failing_targets option for the unittests.

Intended to mark tests that fail for a particular target, and are
intended to be fixed in the future.  Typically, these would result
either from implementing a new test, or from an in-progress
implementation of a new target.

* [UnitTests] Known failing targets now marked with xfail instead of skipif

* [UnitTests] Removed tvm_excluded_targets and tvm_known_failing_targets

These were implemented to exclude or mark as failing an entire file or
directory of tests.  In
https://discuss.tvm.apache.org/t/rfc-parametrized-unit-tests/9946/4,
it was pointed out that the global variables would be vulnerable to
typos in the names, resulting in the option being silently ignored.
The decorators `@tvm.testing.exclude_targets` and
`@tvm.testing.known_failing_targets` do not have this failure mode,
and are the preferred version.

* [UnitTests] Added helper functions to tvm.testing.

- tvm.testing.parameter() defines a parameter that can be passed to
  tests.  Tests that accept more than one parameter are run for all
  combinations of parameter values.

- tvm.testing.parameters() defines multiple sets of parameter values.
  Tests that accept more than one parameter are run once for each set
  of parameter values.

- tvm.testing.fixture() is a decorator that defines setup code.  The
  `cache=True` argument can be passed to avoid repeating expensive
  setup across multiple tests.

* [UnitTests] Bugfix for auto parametrizing of "target"

Previously, if the @parametrize_targets were present, but had other
@pytest.mark.parametrize after it, "target" would get parametrized a
second time.  Now, it checks more than just the closest "parametrize"
marker.

* [UnitTests] Renamed "cache" argument of tvm.testing.fixture to "cache_return_value"

* [UnitTests] Minor updates to parametrized test implementation.

As recommended by @tkonolige:

- Avoid infinite loop if LLVM target isn't enabled

- Update documentation for preferred use cases of
  tvm.testing.parametrize_targets, and recommended alternatives.

* [UnitTests] Minor updates to parametrized test implementation

- Documentation, removed previous example usage of tvm.testing.parametrize_targets

* [UnitTests] Changed accidental use of pytest fixtures to a NameError.

- Previously, a fixture function defined in a module was accessible
  through the global scope, and the function definition is accessible
  if a test function uses that name but fails to declare the fixture
  as a parameter.  Now, it will result in a NameError instead.

* [UnitTests] More careful removal of fixture functions from module global scope.

- Initial implementation only checked hasattr(obj, "_pytestfixturefunction")
  before removing obj, which gave false positives for objects that implement
  __getattr__, such as caffe.layers.  Now, also check that the value
  contained is a FixtureFunctionMarker.

* [UnitTests] Copy cached values when using tvm.testing.fixture(cache_return_value=True)

To avoid unit tests being able to influence each other through a
shared cache, all cached fixtures are passed through copy.deepcopy
prior to use.

* [UnitTests] Added meta-tests for tvm.testing functionality

Co-authored-by: Eric Lunderberg <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants