Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove ARP matching for single ARP change, and consider publisher+name for matching #2119

Merged
merged 4 commits into from
Apr 28, 2022

Conversation

florelis
Copy link
Member

@florelis florelis commented Apr 25, 2022

  • For ARP matching we had a rule that would consider something a match if it was the only ARP change, but that is likely to give us false matches. This change removes that rule.
  • For the string matching confidence algorithm, we would only consider matching package name with ARP name, and package publisher with ARP publisher, but sometimes the publisher would be included in the name and that would decrease confidence. For example, for package name = "Microsoft PowerToys" and ARP name = "PowerToys". This change makes the string matching also consider the string publisher+name.
  • Also made the matching requirements for the unit tests stricter
Microsoft Reviewers: Open in CodeFlow

@florelis florelis requested a review from a team as a code owner April 25, 2022 23:39
dataSet.RequiredTrueMatchRatio = 0.5;
dataSet.RequiredFalseMatchRatio = 0.1;
dataSet.RequiredTrueMatchRatio = 0.7;
dataSet.RequiredFalseMatchRatio = 0.05;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would really like false positives to be at 0. If the current algorithm is not at 0 in these unit tests, we should reconsider.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll send another PR once I figure out how to reduce the false positives.

@florelis
Copy link
Member Author

One of the tests for winget list was expecting the output to contain the product code, but now that we can match it the output has the Id :)

@florelis florelis merged commit 312ad6d into microsoft:master Apr 28, 2022
@florelis florelis deleted the arp-match branch April 28, 2022 04:44
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants