Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(Mutator): Add mutator for GeneratedRegex #3106

Draft
wants to merge 5 commits into
base: master
Choose a base branch
from

Conversation

Pentiva
Copy link
Contributor

@Pentiva Pentiva commented Nov 11, 2024

Adds the ability for Stryker to apply regex mutations to code using the GeneratedRegex attribute. Also switches usage in the library to use the regex source generator.

There are several things that I think aren't perfect about this implementation:

  • All of the mutation is done in the orchestrator, instead of a mutator.
  • The actual mutation does not match a manual mutation.
    • Therefore the JSON report has to be faked to make sense.
  • The typical tests for the orchestrator was causing my IDE to lag very heavily, so VerifyTests was used.
  • The SemanticModel has to be recomputed after a change, otherwise it is unusable in further mutations.
  • Since GeneratedRegexOrchestrator needs to generate the mutants, it is now made available through the MutationContext.

Comment on lines +34 to +36
public string ReplacementText { get; init; }

public FileLinePositionSpan OriginalLocation { get; init; }
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a specific reason you are adding these as properties of Mutant? Mutant should never be able to have a ReplacementText or location that is different from it's Mutation, this could lead to undesirable behavior.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is one of the things i'm not especially happy about. This is needed because the html report would otherwise be wrong, because the mutation is applied to new source that is injected and the report does not show this source. I originally had derived classes of GeneratedRegexMutant and GeneratedRegexMutation to not have these properties on regular mutants, but that became too complex. All existing mutations do not set these values, and they default to what they would have been before in JsonMutant, and that is the only place these properties are used.

@rouke-broersma
Copy link
Member

I have never used Verify before, does it make sense to have Verify tests in the same project as unit tests? Is there some guidance/contributor doc you could add for using Verify in the context of stryker? Also could you clarify why the usual method we use for unit testing was making your IDE lag?

@Pentiva
Copy link
Contributor Author

Pentiva commented Nov 18, 2024

I have never used Verify before, does it make sense to have Verify tests in the same project as unit tests?

I believe so. ApprovalTests is another similar library and on its readme it states ApprovalTests is an open source assertion/verification library to aid unit testing. It is esentally a more powerful assertion that can deal with complex objects automatically.

Is there some guidance/contributor doc you could add for using Verify in the context of stryker?

I can, it boils down to replacing source/expected units tests, like the majority in CsharpMutantOrchestratorTests, and instead having them call Verify to therefore not require expected to be manually chosen.

Also could you clarify why the usual method we use for unit testing was making your IDE lag?

I'm not entirely sure, otherwise I probably would have investigated other possible solutions. I believe it was just the sheer length of the stings, possibly because they were valid c#, as each Verify file in this PR would have been a string instead of a separate file. It might have been because I was using test cases instead of separate tests per input, and having large strings in attributes might have been the problem. It could also be my specific combination of Visual Studio and Resharper that was the problem, I doubt it would have lagged out a simple text editor.

@dupdob
Copy link
Member

dupdob commented Nov 22, 2024

This code alters several important design contracts for which we care about (mostly due to hard learned experience).
Number one golden rule is that mutators must not modify the original code; this should be delegated to the orchestrator which secures the proper workflow. Also, you should never have to regenerate the semantic model: it does not make any sens as we do not want recursive mutation.

Have you read the overall orchestration design document?

I think your main obstacles is that the current design simply does not cover MemberDefinition level mutations.

I had'nt the time yet to think it through, but I guess a cleaner way to do this implies:

  1. Adding the ability to store Memberlevel mutations

  2. Completing the implementation of the MemberDefinitionOrchestrator so that

    1. Mutations can be generated at member level
    2. These can be stored appropriately
    3. And can be injected in the proper way, that is by redirecting the original member to the appropriate mutated versions as you do

This should do most of the heavy lifting for you. I hope it would also fix the problems with the json report, at least simplify it.

I am willing to help you. For context, I was heavily involved in the existing design.
May I suggest you DM via the project Slack so we can discuss this further?

I must say I like very much what you are trying to achieve, and it could help us on other fronts as well, mutation wise

@Pentiva
Copy link
Contributor Author

Pentiva commented Nov 22, 2024

@dupdob I completely agree that this deviates from the norm in not very good ways, and I've sat on this feature for quite a while because I was never truly happy with my design choices.

Number one golden rule is that mutators must not modify the original code; this should be delegated to the orchestrator which secures the proper workflow.

Is that not what I am currently doing? I agree that it's wrong in the other direction (the orchestrator is creating mutants), but no mutator is modifying the original code, the orchestrator is orchestrating its child nodes, applying the mutations, and then doing the wacky bits required to apply the regex mutation. Unless in this case the orchestrator counts as a mutator, but it has to modify the original code in some fashion, I originally had a mutator to create the mutations with the orchestrator only providing the detection and scaffolding, but it was more complex and resulted in work duplication, or bespoke communication between the orchestrator and the mutator.

Also, you should never have to regenerate the semantic model: it does not make any sens as we do not want recursive mutation.

I also agree with this, and I have spent way too much time thinking I have found a solution to not need a recompile, only for it not to work. The problem is Roslyns immutability, I need the semantic model to find the mutatable partial regex method, but if I then mutate the children, I lose the reference to the nodes, and I can't search again because the semantic model is now old, and the same is true if I mutate the children afterwards. I'm not recompiling for recursive mutations, I'm doing it so that I can track the nodes to replace, after the children have been mutated.

Have you read the overall orchestration design document?

Yes, probably a few times, but not recently. I'm pretty sure I would have read through it to try and understand how the MutationStore works, as that was what I believed was correct way to implement this initially.

I think your main obstacles is that the current design simply does not cover MemberDefinition level mutations.

Does it not? MutationControl has a Member member, and it doesn't appear to be any less used than Block or Statement levels? I will say that I still don't fully understand how the MutationStore works other than the simple case of an expression needing to be mutated further out than its parent orchestrator.

I hope it would also fix the problems with the json report, at least simplify it.

I don't think it could, because the mutation can never be real. By which I mean the mutation can not be to replace the regex string with another in the attribute, because that cant be compiled. Unless the orchestrator intercepts and re-interprets that mutation, while making sure the child orchestrators don't actually perform the mutation. My current method is very simple (although wrong from the design standpoint) in that there is no difference for any case except this one, where a escape hatch override was implemented.

I am willing to help you. For context, I was heavily involved in the existing design.
May I suggest you DM via the project Slack so we can discuss this further?

Thanks, the reason I finally submitted it was because it wasn't great and I couldn't improve it, so I wanted to get the thoughts from anyone who had a better understanding of the internals of stryker. I will DM you soon after I post this comment (I've never used Slack, but I assume it's simple enough to get started).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants