Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add bignum test case generation script #6093

Merged

Conversation

wernerlewis
Copy link
Contributor

@wernerlewis wernerlewis commented Jul 18, 2022

Description

Resolves #5921

Adds a test case generation framework for bignum tests, similar to generate_psa_tests.py. This PR adds a few bignum arithmetic test cases to show the usage of the test framework, and it will be expanded to cover more bignum/ECC functionality in subsequent PRs.

This PR should ideally be merged after #6070. The temporary radix argument commit can then be removed before merging this PR.

Status

READY

Requires Backporting

Yes 2.28 #6307

@wernerlewis wernerlewis added needs-preceding-pr Requires another PR to be merged first component-test Test framework and CI scripts labels Jul 18, 2022
@wernerlewis wernerlewis force-pushed the bignum_test_script branch 5 times, most recently from c899567 to 1c3affd Compare July 20, 2022 09:34
Adds python script for generation of bignum test cases, with initial
classes for mpi_cmp_mpi test cases. Build scripts are updated to
generate test data.

Signed-off-by: Werner Lewis <[email protected]>
Signed-off-by: Werner Lewis <[email protected]>
Signed-off-by: Werner Lewis <[email protected]>
@wernerlewis wernerlewis removed the needs-preceding-pr Requires another PR to be merged first label Aug 8, 2022
@wernerlewis wernerlewis added needs-review Every commit must be reviewed by at least two team members, needs-reviewer This PR needs someone to pick it up for review labels Aug 8, 2022
@gilles-peskine-arm gilles-peskine-arm self-requested a review August 11, 2022 18:43
@yanesca yanesca requested a review from minosgalanakis August 22, 2022 10:47
@yanesca yanesca added the priority-high High priority - will be reviewed soon label Aug 22, 2022
minosgalanakis
minosgalanakis previously approved these changes Aug 22, 2022
Copy link
Contributor

@minosgalanakis minosgalanakis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall looks good, and will do the job it is supposed to do. My main concerns are:

  • Complexity. The code may be hard to maintain, especially since we are combining iterator and class generation. It assumes the reviewer is quite apt with modern Python.
  • Error handling. A lot of edge cases rely on assumed correct input, especially in earlier stages of inheritance.

Would having a document or a comment block descibing how you can easily add new tests be benefical?

tests/scripts/generate_bignum_tests.py Outdated Show resolved Hide resolved
tests/scripts/generate_bignum_tests.py Outdated Show resolved Hide resolved

def write_test_data_file(self, basename: str,
test_cases: Iterable[test_case.TestCase]) -> None:
"""Write the test cases to a .data file.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assume the formatting is to comply with PEP257? We do not follow strict PEP8 guidelines but it is nice to see a standard approach.

tests/scripts/generate_bignum_tests.py Outdated Show resolved Hide resolved
tests/scripts/generate_bignum_tests.py Outdated Show resolved Hide resolved
tests/scripts/generate_bignum_tests.py Outdated Show resolved Hide resolved
tests/CMakeLists.txt Show resolved Hide resolved
tests/scripts/generate_bignum_tests.py Outdated Show resolved Hide resolved
tests/scripts/generate_bignum_tests.py Outdated Show resolved Hide resolved
tests/scripts/generate_bignum_tests.py Outdated Show resolved Hide resolved

@property
def description(self) -> str:
"""Create a numbered test description."""
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Numbered test case descriptions should be a last resort, when there are multiple test cases with pseudorandom data. We should write/generate meaningful test case descriptions as much as possible.

Looking at the output, I think you intended for the numbering to be additional to meaningful descriptions? That's well-meaning, but it does have a downside, which is that the numbering will change whenever we add test cases, or change the order in which they're generated. This would make it hard to compare test results across different commits. So please don't systematically add numbering to test case names.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that having numbered test is useful and is worth the cost of that 3-4 characters in the title. Comparing test results across different commits is only a problem if we added test cases or changed generation order between the two commits. Also, in the overwhelming majority of cases we are looking at the tests within a single commit, where having a number makes identifying the failing test cases much easier.

Could we please keep the test case numbering?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is value in having an easy-to-compare unique identifier on test cases. That unique identifier doesn't need to be meaningful, so it can be a number. Now the question is, what kind of stability do we expect on this number?

At one extreme, we can just use the line number. This changes pretty much whenever we add or remove test cases (or comments). Nice properties: lookup by number is extremely easy; the numbers are increasing (but not consecutive); the numbering can be done fully automatically by the test framework.

At another extreme, we can assign a number (or other short identifier) when we write a test case, and change to it. Advantage: the unique identifier is permanent. Downside: has to be managed manually, may cause conflicts if two concurrent tracks of work add the same unique identifier to different test cases.

With the intermediate solution we'd be introducing here, test case numbers would apply only to automatically generated test cases, so this isn't a general solution. We can easily tweak the scope of the numbering (per test function, per data file…) but any scope has its own downside (per test function: makes it hard to figure out what the unique part is; per test suite: makes the numbers unstable when we add unrelated test cases). I think this is not a good design for test case numbering. So I maintain my opposition.

I think the idea of test numbering is a good thing. I, too, have spent time copy-pasting descriptions around to search them, and sometimes looking at the wrong test case because I missed a difference in the description. So far it's been low on my tech debt list, but seeing I'm not the only one interested, then by all means let's add it to our tech debt. But let's first agree on a design, and introduce it as its own feature rather than part of some unrelated work.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am happy to have discussion about the design, but could we please not block this PR with that discussion?

This is a test framework, whatever we decide, won't affect the test cases we add while discussing the matter. I really would like to start using this in the Bignum refactoring work as soon as possible. We could tackle the global numbering problem in a different PR.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could we please not block this PR with that discussion?

Agreed, absolutely. Test case numbers are going to cause merge conflicts if we work on this script in parallel, which is likely in the near future. So let's not put them in now, and add them later if we decide on that strategy.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great, thanks, sounds good!

Regarding the temporary presence of test numbering in automated tests:

The script output is generated files, they won't be committed and won't cause merge conflicts. If we work in parallel on the script, we won't change the part that adds the numbering and the other PR will likely remove it. Again, it seems as a clean merge, I can't see the conflict.

On the Bignum refactoring side, while we don't have the global numbering solution, having this local numbering would be really helpful. Some tests have hundreds of test cases, it is very difficult to navigate them without any numbering.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The script output is generated files, they won't be committed and won't cause merge conflicts

Well, I was thinking of 2.2x, where they are committed. But it's true that on 2.2x, we typically have quick cycle from making the backport to merging it, so the risk of conflicts is small.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this going to be backported? I know it says "Yes" in the description, but this is clearly a new feature. And many of the newly-generated tests won't be applicable to bignum-2.28. Of course, having them the same has advantages, but I thought we'd agreed that bignum-development and bignum-2.28 would diverge?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a test framework, which is helpful to have on both and the test framework doesn't have to (or shouldn't) diverge. Also, this would make backporting tests easier. (We are adding more tests to other areas as well, not just the new code.)

tests/scripts/generate_bignum_tests.py Outdated Show resolved Hide resolved
tests/scripts/generate_bignum_tests.py Outdated Show resolved Hide resolved
tests/scripts/generate_bignum_tests.py Outdated Show resolved Hide resolved
tests/scripts/generate_bignum_tests.py Outdated Show resolved Hide resolved
tests/scripts/generate_bignum_tests.py Outdated Show resolved Hide resolved
Copy link
Contributor

@minosgalanakis minosgalanakis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Few nitpicking items and design discussions. They do not need refactor, but would be good to have it other reviewers request changes.

tests/scripts/generate_bignum_tests.py Show resolved Hide resolved
super().__init__(val_a.strip("-"), val_b.strip("-"))


class BignumAdd(BignumOperation):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not a blocker.

To my understanding tests will be implemented by implementing BignumOperation and overriding the result(). Depending on how much that functionality is about to grow, it may be of value to move those classes onto a separate files.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's a strong unifying theme here, so I don't see any value in splitting. We may want to reorganize some things when we start generating ECC arithmetic tests, by creating a common module used by both bignum and ecc test generators, but I think it's too early to tell what code we'll want to share.

tests/scripts/generate_bignum_tests.py Outdated Show resolved Hide resolved
scripts/mbedtls_dev/test_generation.py Outdated Show resolved Hide resolved
targets = {} # type: Dict[str, Callable[..., Iterable[test_case.TestCase]]]

def generate_target(self, name: str, *target_args) -> None:
"""Generate cases and write to data file for a target.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Inconsistent whitepace after function describing docstring

test_name = ""

def __new__(cls, *args, **kwargs):
# pylint: disable=unused-argument
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since we are moving to metaclasses, we could consider doing something more fancy than incrementing a counter.

We have two classes BaseTarget and TestGenerator which are mostly consumed by different modules, but in effect are doing the same thing. The could be merged into one class that works for everything, and optimised in consecutive pr's without having to change the consuming interfaces.

Methods like:

  • TestGenerator
  • write_test_data_file

are generic IO methods which can be common for both and BaseTarget.generate_target() TestGenerator.generate_tests() can be unified, or simply renamed.

Or for a more lazy approach we could even have a self.subtype == "PSA_Test_Target/ BigNum_Test_Target"

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand why we'd want anything fancier here.

Also please keep in mind that the average maintainer of this file is primarily a C programmer, and not a Python expert.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was more of a proposal as a stepping stone if we are aiming to intergrate those two into the future. Not a strong ask by any means ;)

tests/scripts/generate_bignum_tests.py Show resolved Hide resolved
tests/scripts/generate_bignum_tests.py Show resolved Hide resolved
Copy link
Contributor

@gilles-peskine-arm gilles-peskine-arm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A couple of minor documentation issues, and an unnecessary cast. Other than that looks good to me.

tests/scripts/generate_bignum_tests.py Outdated Show resolved Hide resolved
tests/scripts/generate_bignum_tests.py Outdated Show resolved Hide resolved
tests/scripts/generate_bignum_tests.py Outdated Show resolved Hide resolved
scripts/mbedtls_dev/test_generation.py Outdated Show resolved Hide resolved
scripts/mbedtls_dev/test_generation.py Outdated Show resolved Hide resolved
scripts/mbedtls_dev/test_generation.py Outdated Show resolved Hide resolved
test_name = ""

def __new__(cls, *args, **kwargs):
# pylint: disable=unused-argument
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand why we'd want anything fancier here.

Also please keep in mind that the average maintainer of this file is primarily a C programmer, and not a Python expert.

tests/scripts/generate_bignum_tests.py Show resolved Hide resolved
tests/scripts/generate_bignum_tests.py Show resolved Hide resolved
super().__init__(val_a.strip("-"), val_b.strip("-"))


class BignumAdd(BignumOperation):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's a strong unifying theme here, so I don't see any value in splitting. We may want to reorganize some things when we start generating ECC arithmetic tests, by creating a common module used by both bignum and ecc test generators, but I think it's too early to tell what code we'll want to share.

This reverts commit f156c43. Adds a
comment to explain reasoning for current implementation.

Signed-off-by: Werner Lewis <[email protected]>
Wrapper function for itertools.combinations_with_replacement, with
explicit cast due to imprecise typing with older versions of mypy.

Signed-off-by: Werner Lewis <[email protected]>
@wernerlewis wernerlewis added needs-review Every commit must be reviewed by at least two team members, and removed needs-work labels Sep 15, 2022
Previous changes used the docstring of the test_generation module,
which does not inform a user about the script.

Signed-off-by: Werner Lewis <[email protected]>

if __name__ == '__main__':
# Use the section of the docstring relevant to the CLI as description
test_generation.main(sys.argv[1:], "\n".join(__doc__.splitlines()[:4]))
Copy link
Contributor

@minosgalanakis minosgalanakis Sep 16, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is precicely why using "doc" was established. The same can be achieved by

__short_doc __ = """Generate test data for bignum functions. ""
__file_help__ = """With no arguments, generate all test data. With non-option arguments,

__doc__ = __short_doc __ + __file_help__
generate only the specified files.

Class structure:

Child classes of test_generation.BaseTarget (file targets) represent an output
"""

.....
.....
.....

test_generation.main(sys.argv[1:], __file_help__ ))


if __name__ == "__main__":
        print(__doc__)

This shall not block this PR, this is just for the purposes of discussion

Copy link
Contributor

@minosgalanakis minosgalanakis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me.

Copy link
Contributor

@gilles-peskine-arm gilles-peskine-arm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approved. I have two comments but neither are blockers.

In the interest of making it possible to use this script in the ongoing bignum work as soon as possible, I intend to merge this as soon as CI passes. We still need to make a 2.28 backport (which should be an identical script, plus in 2.28 we commit the output into version control).

Comment on lines +79 to +80
The return value is cast, as older versions of mypy are unable to derive
the specific type returned by itertools.combinations_with_replacement.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Non-blocker: this information is relevant when reading the function's code, but not when using the function. So it should be a comment, not part of the documentation string.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

#6296 has fixes for this and the other non-blocker issues I raised in this review.

Comment on lines +191 to +194
# The `--directory` option is interpreted relative to the directory from
# which the script is invoked, but the default is relative to the root of
# the mbedtls tree. The default should not be set above, but instead after
# `build_tree.chdir_to_root()` is called.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, not quite. --directory is still interpreted relative to the root of the mbedtls tree. An abspath call is missing somewhere (or alternatively we could stop using chdir, but that would be annoying).

But since this is preexisting (I know it worked when I originally wrote it, but I might have broken it before I even committed), this is a non-blocker.

@@ -0,0 +1,219 @@
"""Common test generation classes and main function.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I only just realized, but we already have test code generation (tests/scripts/generate_test_code.py, turns .function files into .c), so “test generation” is ambiguous: we should specify that this is test data generation. I propose (in a follow-up) to rename this module to test_data_generation.

Comment on lines +29 to +30
from mbedtls_dev import build_tree
from mbedtls_dev import test_case
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These should be relative imports. No reason to assume that scripts is in the search path.

Again, this is a preexisting bug which doesn't matter in our normal usage, so not a blocker.

@gilles-peskine-arm gilles-peskine-arm merged commit 1716f06 into Mbed-TLS:development Sep 17, 2022
@wernerlewis wernerlewis added needs-backports Backports are missing or are pending review and approval. and removed needs-review Every commit must be reviewed by at least two team members, labels Sep 21, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
component-test Test framework and CI scripts needs-backports Backports are missing or are pending review and approval. priority-high High priority - will be reviewed soon
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Create a script for generating bignum tests
5 participants