-
Notifications
You must be signed in to change notification settings - Fork 2.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add bignum test case generation script #6093
Add bignum test case generation script #6093
Conversation
c899567
to
1c3affd
Compare
Adds python script for generation of bignum test cases, with initial classes for mpi_cmp_mpi test cases. Build scripts are updated to generate test data. Signed-off-by: Werner Lewis <[email protected]>
Signed-off-by: Werner Lewis <[email protected]>
Signed-off-by: Werner Lewis <[email protected]>
Signed-off-by: Werner Lewis <[email protected]>
Signed-off-by: Werner Lewis <[email protected]>
Signed-off-by: Werner Lewis <[email protected]>
Signed-off-by: Werner Lewis <[email protected]>
Signed-off-by: Werner Lewis <[email protected]>
Signed-off-by: Werner Lewis <[email protected]>
74b4634
to
75ef944
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall looks good, and will do the job it is supposed to do. My main concerns are:
- Complexity. The code may be hard to maintain, especially since we are combining iterator and class generation. It assumes the reviewer is quite apt with modern Python.
- Error handling. A lot of edge cases rely on assumed correct input, especially in earlier stages of inheritance.
Would having a document or a comment block descibing how you can easily add new tests be benefical?
|
||
def write_test_data_file(self, basename: str, | ||
test_cases: Iterable[test_case.TestCase]) -> None: | ||
"""Write the test cases to a .data file. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I assume the formatting is to comply with PEP257? We do not follow strict PEP8 guidelines but it is nice to see a standard approach.
|
||
@property | ||
def description(self) -> str: | ||
"""Create a numbered test description.""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Numbered test case descriptions should be a last resort, when there are multiple test cases with pseudorandom data. We should write/generate meaningful test case descriptions as much as possible.
Looking at the output, I think you intended for the numbering to be additional to meaningful descriptions? That's well-meaning, but it does have a downside, which is that the numbering will change whenever we add test cases, or change the order in which they're generated. This would make it hard to compare test results across different commits. So please don't systematically add numbering to test case names.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that having numbered test is useful and is worth the cost of that 3-4 characters in the title. Comparing test results across different commits is only a problem if we added test cases or changed generation order between the two commits. Also, in the overwhelming majority of cases we are looking at the tests within a single commit, where having a number makes identifying the failing test cases much easier.
Could we please keep the test case numbering?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is value in having an easy-to-compare unique identifier on test cases. That unique identifier doesn't need to be meaningful, so it can be a number. Now the question is, what kind of stability do we expect on this number?
At one extreme, we can just use the line number. This changes pretty much whenever we add or remove test cases (or comments). Nice properties: lookup by number is extremely easy; the numbers are increasing (but not consecutive); the numbering can be done fully automatically by the test framework.
At another extreme, we can assign a number (or other short identifier) when we write a test case, and change to it. Advantage: the unique identifier is permanent. Downside: has to be managed manually, may cause conflicts if two concurrent tracks of work add the same unique identifier to different test cases.
With the intermediate solution we'd be introducing here, test case numbers would apply only to automatically generated test cases, so this isn't a general solution. We can easily tweak the scope of the numbering (per test function, per data file…) but any scope has its own downside (per test function: makes it hard to figure out what the unique part is; per test suite: makes the numbers unstable when we add unrelated test cases). I think this is not a good design for test case numbering. So I maintain my opposition.
I think the idea of test numbering is a good thing. I, too, have spent time copy-pasting descriptions around to search them, and sometimes looking at the wrong test case because I missed a difference in the description. So far it's been low on my tech debt list, but seeing I'm not the only one interested, then by all means let's add it to our tech debt. But let's first agree on a design, and introduce it as its own feature rather than part of some unrelated work.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am happy to have discussion about the design, but could we please not block this PR with that discussion?
This is a test framework, whatever we decide, won't affect the test cases we add while discussing the matter. I really would like to start using this in the Bignum refactoring work as soon as possible. We could tackle the global numbering problem in a different PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
could we please not block this PR with that discussion?
Agreed, absolutely. Test case numbers are going to cause merge conflicts if we work on this script in parallel, which is likely in the near future. So let's not put them in now, and add them later if we decide on that strategy.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great, thanks, sounds good!
Regarding the temporary presence of test numbering in automated tests:
The script output is generated files, they won't be committed and won't cause merge conflicts. If we work in parallel on the script, we won't change the part that adds the numbering and the other PR will likely remove it. Again, it seems as a clean merge, I can't see the conflict.
On the Bignum refactoring side, while we don't have the global numbering solution, having this local numbering would be really helpful. Some tests have hundreds of test cases, it is very difficult to navigate them without any numbering.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The script output is generated files, they won't be committed and won't cause merge conflicts
Well, I was thinking of 2.2x, where they are committed. But it's true that on 2.2x, we typically have quick cycle from making the backport to merging it, so the risk of conflicts is small.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this going to be backported? I know it says "Yes" in the description, but this is clearly a new feature. And many of the newly-generated tests won't be applicable to bignum-2.28. Of course, having them the same has advantages, but I thought we'd agreed that bignum-development and bignum-2.28 would diverge?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a test framework, which is helpful to have on both and the test framework doesn't have to (or shouldn't) diverge. Also, this would make backporting tests easier. (We are adding more tests to other areas as well, not just the new code.)
Signed-off-by: Werner Lewis <[email protected]>
Signed-off-by: Werner Lewis <[email protected]>
Signed-off-by: Werner Lewis <[email protected]>
Signed-off-by: Werner Lewis <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Few nitpicking items and design discussions. They do not need refactor, but would be good to have it other reviewers request changes.
super().__init__(val_a.strip("-"), val_b.strip("-")) | ||
|
||
|
||
class BignumAdd(BignumOperation): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not a blocker.
To my understanding tests will be implemented by implementing BignumOperation
and overriding the result()
. Depending on how much that functionality is about to grow, it may be of value to move those classes onto a separate files.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's a strong unifying theme here, so I don't see any value in splitting. We may want to reorganize some things when we start generating ECC arithmetic tests, by creating a common module used by both bignum and ecc test generators, but I think it's too early to tell what code we'll want to share.
targets = {} # type: Dict[str, Callable[..., Iterable[test_case.TestCase]]] | ||
|
||
def generate_target(self, name: str, *target_args) -> None: | ||
"""Generate cases and write to data file for a target. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Inconsistent whitepace after function describing docstring
test_name = "" | ||
|
||
def __new__(cls, *args, **kwargs): | ||
# pylint: disable=unused-argument |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since we are moving to metaclasses, we could consider doing something more fancy than incrementing a counter.
We have two classes BaseTarget
and TestGenerator
which are mostly consumed by different modules, but in effect are doing the same thing. The could be merged into one class that works for everything, and optimised in consecutive pr's without having to change the consuming interfaces.
Methods like:
TestGenerator
write_test_data_file
are generic IO methods which can be common for both and BaseTarget.generate_target()
TestGenerator.generate_tests()
can be unified, or simply renamed.
Or for a more lazy approach we could even have a self.subtype == "PSA_Test_Target/ BigNum_Test_Target"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't understand why we'd want anything fancier here.
Also please keep in mind that the average maintainer of this file is primarily a C programmer, and not a Python expert.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was more of a proposal as a stepping stone if we are aiming to intergrate those two into the future. Not a strong ask by any means ;)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A couple of minor documentation issues, and an unnecessary cast. Other than that looks good to me.
test_name = "" | ||
|
||
def __new__(cls, *args, **kwargs): | ||
# pylint: disable=unused-argument |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't understand why we'd want anything fancier here.
Also please keep in mind that the average maintainer of this file is primarily a C programmer, and not a Python expert.
super().__init__(val_a.strip("-"), val_b.strip("-")) | ||
|
||
|
||
class BignumAdd(BignumOperation): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's a strong unifying theme here, so I don't see any value in splitting. We may want to reorganize some things when we start generating ECC arithmetic tests, by creating a common module used by both bignum and ecc test generators, but I think it's too early to tell what code we'll want to share.
Signed-off-by: Werner Lewis <[email protected]>
Signed-off-by: Werner Lewis <[email protected]>
This reverts commit f156c43. Adds a comment to explain reasoning for current implementation. Signed-off-by: Werner Lewis <[email protected]>
Signed-off-by: Werner Lewis <[email protected]>
Wrapper function for itertools.combinations_with_replacement, with explicit cast due to imprecise typing with older versions of mypy. Signed-off-by: Werner Lewis <[email protected]>
Signed-off-by: Werner Lewis <[email protected]>
Signed-off-by: Werner Lewis <[email protected]>
Previous changes used the docstring of the test_generation module, which does not inform a user about the script. Signed-off-by: Werner Lewis <[email protected]>
|
||
if __name__ == '__main__': | ||
# Use the section of the docstring relevant to the CLI as description | ||
test_generation.main(sys.argv[1:], "\n".join(__doc__.splitlines()[:4])) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is precicely why using "doc" was established. The same can be achieved by
__short_doc __ = """Generate test data for bignum functions. ""
__file_help__ = """With no arguments, generate all test data. With non-option arguments,
__doc__ = __short_doc __ + __file_help__
generate only the specified files.
Class structure:
Child classes of test_generation.BaseTarget (file targets) represent an output
"""
.....
.....
.....
test_generation.main(sys.argv[1:], __file_help__ ))
if __name__ == "__main__":
print(__doc__)
This shall not block this PR, this is just for the purposes of discussion
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Approved. I have two comments but neither are blockers.
In the interest of making it possible to use this script in the ongoing bignum work as soon as possible, I intend to merge this as soon as CI passes. We still need to make a 2.28 backport (which should be an identical script, plus in 2.28 we commit the output into version control).
The return value is cast, as older versions of mypy are unable to derive | ||
the specific type returned by itertools.combinations_with_replacement. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Non-blocker: this information is relevant when reading the function's code, but not when using the function. So it should be a comment, not part of the documentation string.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
#6296 has fixes for this and the other non-blocker issues I raised in this review.
# The `--directory` option is interpreted relative to the directory from | ||
# which the script is invoked, but the default is relative to the root of | ||
# the mbedtls tree. The default should not be set above, but instead after | ||
# `build_tree.chdir_to_root()` is called. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, not quite. --directory
is still interpreted relative to the root of the mbedtls tree. An abspath
call is missing somewhere (or alternatively we could stop using chdir
, but that would be annoying).
But since this is preexisting (I know it worked when I originally wrote it, but I might have broken it before I even committed), this is a non-blocker.
@@ -0,0 +1,219 @@ | |||
"""Common test generation classes and main function. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I only just realized, but we already have test code generation (tests/scripts/generate_test_code.py
, turns .function
files into .c
), so “test generation” is ambiguous: we should specify that this is test data generation. I propose (in a follow-up) to rename this module to test_data_generation
.
from mbedtls_dev import build_tree | ||
from mbedtls_dev import test_case |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These should be relative imports. No reason to assume that scripts
is in the search path.
Again, this is a preexisting bug which doesn't matter in our normal usage, so not a blocker.
Description
Resolves #5921
Adds a test case generation framework for bignum tests, similar to
generate_psa_tests.py
. This PR adds a few bignum arithmetic test cases to show the usage of the test framework, and it will be expanded to cover more bignum/ECC functionality in subsequent PRs.This PR should ideally be merged after #6070. The temporary radix argument commit can then be removed before merging this PR.
Status
READY
Requires Backporting
Yes 2.28 #6307