add support for minimization of amplified tests #54
Initial attempt in #154. |
I think it would make sense to remove all amplifications that have no impact on the mutation score. Simple instrumentation could be used to detect useless generated assertions. As for input amplification, I think we have to define a limit: if we apply general unit-test or even source-code minimisation, it might become harder for the developer to identify the original test, and developers can apply general-purpose minimisation on their own anyway. |
See also the idea of "delta debugging" to minimize.
|
Yes, that's the idea.
The major drawback of this approach is the time consumption: it requires many executions of PIT, and therefore a lot of time.
What do you suggest? In addition to this, we introduce comments in amplified tests, and I think they create a lot of noise. Maybe we could first remove them when we aim at presenting amplified tests to developers. Do you think this minimization should be done automatically and enabled by default, or should we provide it as an "external service tool" of DSpot? |
> Do you think this minimization should be done automatically and enabled by default?

Yes, I think so, in order to maximize the prettiness of the generated tests, so that people like them, also by their look'n'feel. (In DSpot, we generate tests for humans, not for machines.)
|
I was thinking of adding a call to a counter after each added assertion. The test would be executed on the newly detected mutants, and if an assertion never lowers the counter, that means it never fails and is therefore useless (see the sketch after this comment).
If comments were removed, we (DSpot or the developer) would have to rely on a
It would also be easier to interact with the main amplification process and to have a more powerful interface. |
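One way to realize that counter idea, here recording failures per assertion id rather than decrementing a literal counter: a minimal sketch assuming a hypothetical `AssertionCounter` helper injected into the amplified test (the class and its methods are illustrative, not part of DSpot):

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical helper injected into amplified tests: it records which
// generated assertions actually fail during the runs on detected mutants.
public final class AssertionCounter {

    private static final Map<String, Integer> failures = new ConcurrentHashMap<>();

    // Wraps a generated assertion; an assertion id that never shows up in
    // 'failures' after all mutant runs never failed, and is thus useless.
    public static void check(String assertionId, Runnable assertion) {
        try {
            assertion.run();
        } catch (AssertionError e) {
            failures.merge(assertionId, 1, Integer::sum);
            throw e; // preserve the test's failing behavior
        }
    }

    // Candidates for removal: instrumented assertions that never failed.
    public static Set<String> uselessAmong(Set<String> allAssertionIds) {
        Set<String> useless = new HashSet<>(allAssertionIds);
        useless.removeAll(failures.keySet());
        return useless;
    }
}
```

An amplified assertion such as `assertEquals(42, cut.compute())` would then be rewritten as `AssertionCounter.check("assert#3", () -> assertEquals(42, cut.compute()))` before the mutant runs.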
The problem is that we execute the mutation analysis through Maven goals. So it runs in a new JVM; we would need serialization to obtain information about the runs, and that is kind of tricky, right?
I think you can rely on Spoon's pretty-printer.
We need to minimize only tests that have been selected. On the one hand, if there is a selection, it means the minimization is tied to the selection, right? On the other hand, some minimization can be done regardless of any test criterion, such as the inlining of local variables. I set up some classes and a test about that: #338. I am going to implement at least this general minimization, using static analysis of the program; see the sketch below. WDYT? |
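A sketch of that criterion-independent minimization using Spoon's static model, assuming the amplified test is available as a `CtMethod`. This is not the actual code of #338, and it deliberately ignores the side effects of re-evaluating an initializer several times:

```java
import java.util.List;
import spoon.reflect.code.CtAssignment;
import spoon.reflect.code.CtLocalVariable;
import spoon.reflect.code.CtVariableRead;
import spoon.reflect.code.CtVariableWrite;
import spoon.reflect.declaration.CtMethod;
import spoon.reflect.visitor.filter.TypeFilter;

public final class LocalVariableInliner {

    // Inline local variables that are initialized once and never reassigned.
    public static void inlineConstantLocals(CtMethod<?> testMethod) {
        List<CtLocalVariable<?>> locals =
                testMethod.getElements(new TypeFilter<>(CtLocalVariable.class));
        for (CtLocalVariable<?> local : locals) {
            if (local.getDefaultExpression() == null) {
                continue; // declared without an initializer, nothing to inline
            }
            // Skip variables that are written again later in the test.
            boolean reassigned = testMethod
                    .getElements(new TypeFilter<>(CtAssignment.class)).stream()
                    .anyMatch(a -> a.getAssigned() instanceof CtVariableWrite
                            && ((CtVariableWrite<?>) a.getAssigned()).getVariable()
                                    .equals(local.getReference()));
            if (reassigned) {
                continue;
            }
            // Replace every read of the variable by a clone of its initializer,
            // then drop the now-unused declaration.
            testMethod.getElements(new TypeFilter<>(CtVariableRead.class)).stream()
                    .filter(read -> read.getVariable().equals(local.getReference()))
                    .forEach(read -> read.replace(local.getDefaultExpression().clone()));
            local.delete();
        }
    }
}
```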
What if each test wrote a report in a file?
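For instance, each instrumented test could append a line-based record to a known file that the main DSpot process parses once the Maven goal returns, avoiding object serialization across the forked JVM. The path, format, and class name below are illustrative assumptions, not an existing DSpot convention:

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;

// Hypothetical per-run report: one line per executed assertion.
public final class RunReport {

    private static final Path REPORT = Paths.get("target", "dspot-run-report.txt");

    public static void record(String testName, String assertionId, boolean failed) {
        String line = testName + ";" + assertionId + ";"
                + (failed ? "FAIL" : "PASS") + System.lineSeparator();
        try {
            Files.write(REPORT, line.getBytes(),
                    StandardOpenOption.CREATE, StandardOpenOption.APPEND);
        } catch (IOException e) {
            throw new UncheckedIOException(e); // fail loudly rather than lose data
        }
    }
}
```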
Yes, but what I don't really understand is that it will modify the original test. What if the author of the test thought it was clearer to use a variable? |
It will be the same as serialization / deserialization. I have some issues here during the mutation analysis.
In addition to this, we have another dimension: what do we do with the amplified test?
I'll think about it.
You made a point here. Maybe we should only minimize what DSpot added. We may rely on the naming convention of local variables; DSpot names them something like In any case, we won't be able to satisfy everybody, and we need to make choices. |
I agree. In the second case, would we still want new mutants to be located in the same method? |
Would INRIA/spoon#1874 be useful? |
Hi @sbihel, would you mind having a look at #354? I propose a minimizer for the selector discussed there. The goal is to have amplified tests that encode a change, e.g. a new feature or a regression bug. My idea is to perform a delta-diff on assertions, i.e. remove assertions one by one and see if the amplified test still fails; see the sketch below. WDYT? |
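A sketch of that delta-diff loop over assertions, assuming Spoon's model of the amplified test. `stillFails` stands in for compiling and running a candidate test on the new program version; it is not an actual DSpot API:

```java
import java.util.function.Predicate;
import spoon.reflect.code.CtStatement;
import spoon.reflect.declaration.CtMethod;

public final class AssertionDeltaDiff {

    // Remove assertions one by one; keep a removal only if the amplified
    // test still fails without the assertion, i.e. still encodes the change.
    public static CtMethod<?> minimize(CtMethod<?> amplifiedTest,
                                       Predicate<CtMethod<?>> stillFails) {
        CtMethod<?> current = amplifiedTest.clone();
        int i = 0;
        while (i < current.getBody().getStatements().size()) {
            CtStatement statement = current.getBody().getStatement(i);
            // Crude textual heuristic to spot assertions, for the sketch only.
            if (!statement.toString().contains("assert")) {
                i++;
                continue;
            }
            CtMethod<?> candidate = current.clone();
            candidate.getBody().getStatement(i).delete();
            if (stillFails.test(candidate)) {
                current = candidate; // assertion not needed to expose the change
            } else {
                i++;                 // required assertion: keep it, move on
            }
        }
        return current;
    }
}
```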
Hi @danglotb, Wouldn't we need a list of input programs to have all mutants detected by the test case? Thanks for your efforts 👍 |
As I said, some minimizations are related to the test criterion used. For instance, if I use the mutation score as a test criterion, the minimization must keep the mutation score obtained after the amplification. Here, I am talking about another test criterion: encoding a behavioral change. The point of this selector is that we obtain amplified tests that pass on a given version and fail on the other one. Such amplified tests encode the, desired or undesired, behavioral change. When I say desired, it means that maybe the developer wants the behavior of the program to change, i.e. it creates a new feature or fixes something. In both cases, we win, because we can enhance the test suite. Back to the minimization for such a test criterion: do you think we should only keep assertions that make the amplified test fail? If yes, should the failure be the same? |
If a behavioural change is detected, that means we keep both versions in the test suite, and thus we can apply general minimisation on the amplified version, using the improved criterion for the combined tests. I was thinking that a generated assertion could be a duplicate of an existing one; in that case the new assertion would falsely appear useful. But if we focus on amplified assertions, the delta-diff would detect them. And I think we should only keep amplified assertions that make the test fail, because it enforces clarity in the generated test. If we wanted to keep the exact same failures as before, would it not greatly reduce the range of acceptable amplifications? |
There are two kinds of minimization: general minimization, which is independent of the test criterion, and minimization that is specific to the test criterion used to select amplified tests.
|
See also: "Fine-Grained Test Minimization" by Arash Vahabzadeh, Andrea Stocco, and Ali Mesbah. |
Related work:
|
Motivation: During amplification, some neutral test evolution happens, which results in very long and unreadable tests. Many changes in the amplified test are not required. The goal of minimization is to reduce the size and increase the readability of amplified test cases.
What: Implement a minimization algorithm (such as delta-debugging) to remove useless statements in amplified test cases.
Hints: For instance, useless statements include local variables that are set and never modified, such as Object myObject = null; the local variable should be inlined in this case. For tests that expect an exception, every statement after the one that throws it can be removed. Both hints are illustrated below.
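A hypothetical before/after illustration of both hints (the class under test, `machine`, and its methods are invented for the example):

```java
// Before minimization:
@Test(expected = IllegalStateException.class)
public void testAmplified() {
    Object myObject = null;            // set once, never modified: inline it
    machine.configure(myObject);
    machine.start();                   // throws IllegalStateException
    machine.stop();                    // never reached: remove
    assertEquals(0, machine.speed());  // never reached: remove
}

// After minimization:
@Test(expected = IllegalStateException.class)
public void testAmplified() {
    machine.configure(null);
    machine.start();
}
```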