Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Security Solution] Adds diff algorithm and unit tests for scalar array values #186323

Merged
merged 11 commits into from
Jul 2, 2024

Conversation

dplumlee
Copy link
Contributor

@dplumlee dplumlee commented Jun 17, 2024

Summary

Related ticket: #180162

Adds the diff algorithm for arrays of scalar values (right now we only have fields of strings) and unit tests to cover all the base cases of what the algorithm can return.

Checklist

Delete any items that are not applicable to this PR.

@dplumlee dplumlee added release_note:skip Skip the PR/issue when compiling release notes Team:Detections and Resp Security Detection Response Team Team: SecuritySolution Security Solutions Team working on SIEM, Endpoint, Timeline, Resolver, etc. Team:Detection Rule Management Security Detection Rule Management Team Feature:Prebuilt Detection Rules Security Solution Prebuilt Detection Rules area v8.15.0 labels Jun 17, 2024
@dplumlee dplumlee self-assigned this Jun 17, 2024
@dplumlee dplumlee requested a review from a team as a code owner June 17, 2024 17:41
@dplumlee dplumlee requested a review from maximpn June 17, 2024 17:41
@elasticmachine
Copy link
Contributor

Pinging @elastic/security-detections-response (Team:Detections and Resp)

@elasticmachine
Copy link
Contributor

Pinging @elastic/security-solution (Team: SecuritySolution)

@elasticmachine
Copy link
Contributor

Pinging @elastic/security-detection-rule-management (Team:Detection Rule Management)

Copy link
Contributor

@jpdjere jpdjere left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the first algorithm in which we actually need to change the way in which we calculate how we determine which scenario we are in.

Until now we have been comparing values A, B and C with LoDash's isEqual, but now that we're dealing with array of scalars we need to take into account that the order of the elements of the array should't matter. Consider this unit test, which passes with the code as is now:

  it.only('returns custom merged version as merged output if all three versions are different', () => {
    const mockVersions: ThreeVersionsOf<string[]> = {
      base_version: ['A', 'B', 'C'],
      current_version: ['C', 'A', 'B'],
      target_version: ['B', 'C', 'A'],
    };
    const expectedMergedVersion = ['A', 'B', 'C'];

    const result = scalarArrayDiffAlgorithm(mockVersions);

    expect(result).toEqual(
      expect.objectContaining({
        has_update: true,
        merged_version: expectedMergedVersion,
        diff_outcome: ThreeWayDiffOutcome.CustomizedValueCanUpdate,
        merge_outcome: ThreeWayMergeOutcome.Merged,
        has_conflict: false,
      })
    );

Base, current and target are "different" when considered with LoDash's isEqual but a reordering of the scalar values should not mean that the arrays have changed; semantically, they are the same for Kibana.

In the test above, has_update should not be true but false and the diff_outcome should actually be ThreeWayDiffOutcome.StockValueNoUpdate. And this applies to all situation where a different order leads to an "incorrect" calculation of the ThreeWayDiffOutcome.

So we need to update (or create a new specific) determineDiffOutcome function in x-pack/plugins/security_solution/common/api/detection_engine/prebuilt_rules/model/diff/three_way_diff/three_way_diff_outcome.ts to update the isEqual to something specific to the case of arrays of scalar values, which checks that the two arrays contain the same scalar values, independently from their order.


On a side note, for the case in the test above, I think that we should be returning the current_version as the merged version; that is, we give "preference" to the reordering of the scalar values that the user created, not the one Elastic created or the original base version.

@dplumlee
Copy link
Contributor Author

dplumlee commented Jun 20, 2024

@jpdjere great catch, I've added another determineDiffOutcome function to handle that case (we can still fall back on the other one if there are other fields that are order specific) and factored that into some of the unit tests.

And to your side note, we already return the current_version in cases like that so this logic should slot right in

@dplumlee dplumlee requested a review from jpdjere June 20, 2024 20:05
@@ -14,7 +14,7 @@ import {
import { numberDiffAlgorithm } from './number_diff_algorithm';

describe('numberDiffAlgorithm', () => {
it('returns current_version as merged output if there is no update', () => {
it('returns current_version as merged output if there is no update - scenario AAA', () => {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for adding these 🙇‍♂️

@banderror banderror self-requested a review June 21, 2024 17:30
@xcrzx xcrzx self-requested a review June 24, 2024 11:54
@jpdjere
Copy link
Contributor

jpdjere commented Jun 24, 2024

Hi @approksiu, one thing which is UX related here and we'd like your opinion on:

Currently, we have "mixed" behaviour when dealing with duplicated values in arrays. Take for example, the tags field; if a rule has as its tags the array: ['A', 'B', 'C'].

  • Via the UI, in the Rule Editing page, you cannot add 'A' again. The UI blocks you from doing it.
  • Via the UI, in the Rules Managament table, when using Bulk Editing to add tags:
    • you cannot add to the selected rules a duplicated value in an array. So if you want to add ['A', 'B', 'C', 'A'] to the selected rules, the second 'A' gets blocked from being added by the UI.
    • if a rule already has 'A' as its tag and you want to add it again, the UI doesn't block you from it (since the addition can apply to multiple rules, some of which may not have the 'A' tag yet), but the API handles the issue by skipping the update for any rules that already have that tag. So for example:
      • if a rule has ['A', 'B', 'C'] and another one has ['B'] and you want to add 'A' to both, the first rule will be skipped and the second one will succeed.
      • if a rule has ['A', 'B', 'C'] and another one has ['B'] and you want to add 'A', 'Z' to both, both will succeed, but 'A' won't be duplicated in the end result of the first. So we actually do de-duplication here.
  • Via the API, you can replace the whole array with ['A', 'B', 'C', 'A']. The request is successful and the 'A' does not get de-duplicated.

So the question is how do we want to deal with possible duplicated values in the update workflow: if a rule has as its current version ['A', 'B', 'C', 'A'], and the target version is ['A', 'B', 'C'], do we want to consider these to values to be equal, or not?

My own personal opinion is that we should de-duplicate arrays before comparing them to another version, since duplicate values don't serve any real purpose, at least from the Kibana point of view. If they exist, they seem to be more mistakes on the customer side than actual intended values.

And not de-duplicating these values before starting the upgrade workflow would add more overhead on the client's side, and more decisions to make when upgrading a rule, which is something I think we should avoid, and rather simplify the process as much as we can. Consider the scenario for tags:

base: `['A', 'B']`
current: ['A', 'B', 'C', 'A']`
target: ['A', 'B', 'C']

Without de-duplication, this would lead to a conflict because the three versions are different. However, current and target are semantically the same (no difference for Kibana), and we could avoid the conflict here by deduplicating, since current and target would match. This would mean one less problem for the user.

What do you think?

cc @dplumlee @banderror

@approksiu
Copy link

@jpdjere I think deduplication before comparing makes sense. Additional question - is comparison case-sensitive or not?

@jpdjere
Copy link
Contributor

jpdjere commented Jun 24, 2024

@approksiu Good question - it NOT case sensitive, meaning that we consider the tag "Windows" to be the same as "WINDOWS" and all the same behaviours I described above apply with this consideration in mind.

@dplumlee
Copy link
Contributor Author

@jpdjere I would agree with that too, I looked through the fields and don't believe any of the proposed fields listed here would have any impact with duplicate values. I don't believe case sensitivity would matter for any of those fields either

@jpdjere
Copy link
Contributor

jpdjere commented Jun 27, 2024

@dplumlee Sorry for the delay in re-reviewing. Just took another look and we need to account for the fact that our arrays should be case-insensitive when dealing with strings, as @approksiu mentioned. Something like:

const lowercaseIfString = (val: unknown): unknown => {
  return typeof val === 'string' ? val.toLowerCase() : val;
};

const arraysHaveSameElements = <T>(arr1: T[], arr2: T[]) => {
  const set1 = new Set(arr1.map((val) => lowercaseIfString(val)));
  const set2 = new Set(arr2.map((val) => lowercaseIfString(val)));

  if (set1.size !== set2.size) {
    return false;
  }

  for (const val of set1) {
    if (!set2.has(val)) {
      return false;
    }
  }

  return true;
};

And we should add some tests covering this behaviour as well.

Copy link
Contributor

@jpdjere jpdjere left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM ✅ - thanks for addressing all the comments!

Copy link
Contributor

@banderror banderror left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dplumlee @jpdjere I have a concern about case sensitivity. Posting it right away and continuing the review.

*
* NOTE: Diffing logic will be agnostic to array order
*/
export const scalarArrayDiffAlgorithm = <TValue>(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: For now, we could restrict elements to be either strings or numbers. There's a chance we'll never need to support other primitive types.

Suggested change
export const scalarArrayDiffAlgorithm = <TValue>(
export const scalarArrayDiffAlgorithm = <TValue = string | number>(

@dplumlee dplumlee requested a review from banderror July 1, 2024 17:47
@jpdjere jpdjere removed the request for review from xcrzx July 2, 2024 09:34
@banderror
Copy link
Contributor

@elasticmachine merge upstream

Copy link
Contributor

@banderror banderror left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks for optimizing the implementation and making the test coverage comprehensive. It looks clean and feels very robust now.

@banderror banderror enabled auto-merge (squash) July 2, 2024 11:35
@kibana-ci
Copy link
Collaborator

💚 Build Succeeded

Metrics [docs]

✅ unchanged

History

To update your PR or re-run it, just comment with:
@elasticmachine merge upstream

cc @dplumlee

@banderror banderror merged commit 73b695d into elastic:main Jul 2, 2024
37 checks passed
@kibanamachine kibanamachine added the backport:skip This commit does not require backporting label Jul 2, 2024
@dplumlee dplumlee deleted the scalar-array-diff-algorithm branch July 2, 2024 14:23
dplumlee added a commit that referenced this pull request Jul 9, 2024
…186325)

## Summary

Related ticket: #180162

Adds test plan for diff algorithm for arrays of scalar values
implemented here: #186323

### For maintainers

- [ ] This was checked for breaking API changes and was [labeled
appropriately](https://www.elastic.co/guide/en/kibana/master/contributing.html#kibana-release-notes-process)
kibanamachine pushed a commit to kibanamachine/kibana that referenced this pull request Jul 9, 2024
…lastic#186325)

## Summary

Related ticket: elastic#180162

Adds test plan for diff algorithm for arrays of scalar values
implemented here: elastic#186323

### For maintainers

- [ ] This was checked for breaking API changes and was [labeled
appropriately](https://www.elastic.co/guide/en/kibana/master/contributing.html#kibana-release-notes-process)

(cherry picked from commit 0b405a0)
kibanamachine referenced this pull request Jul 10, 2024
… plan (#186325) (#187873)

# Backport

This will backport the following commits from `main` to `8.15`:
- [[Security Solution] Array of scalar values diff algorithm test plan
(#186325)](#186325)

<!--- Backport version: 9.4.3 -->

### Questions ?
Please refer to the [Backport tool
documentation](https://github.com/sqren/backport)

<!--BACKPORT [{"author":{"name":"Davis
Plumlee","email":"[email protected]"},"sourceCommit":{"committedDate":"2024-07-09T14:33:04Z","message":"[Security
Solution] Array of scalar values diff algorithm test plan
(#186325)\n\n## Summary\r\n\r\nRelated ticket:
https://github.com/elastic/kibana/issues/180162\r\n\r\nAdds test plan
for diff algorithm for arrays of scalar values\r\nimplemented here:
https://github.com/elastic/kibana/pull/186323\r\n\r\n### For
maintainers\r\n\r\n- [ ] This was checked for breaking API changes and
was
[labeled\r\nappropriately](https://www.elastic.co/guide/en/kibana/master/contributing.html#kibana-release-notes-process)","sha":"0b405a0d656e496775a2c6efa1392ec3ed3fdd4f","branchLabelMapping":{"^v8.16.0$":"main","^v(\\d+).(\\d+).\\d+$":"$1.$2"}},"sourcePullRequest":{"labels":["release_note:skip","test-plan","Team:Detections
and Resp","Team: SecuritySolution","Team:Detection Rule
Management","Feature:Prebuilt Detection
Rules","v8.15.0","v8.16.0"],"title":"[Security Solution] Array of scalar
values diff algorithm test
plan","number":186325,"url":"https://github.com/elastic/kibana/pull/186325","mergeCommit":{"message":"[Security
Solution] Array of scalar values diff algorithm test plan
(#186325)\n\n## Summary\r\n\r\nRelated ticket:
https://github.com/elastic/kibana/issues/180162\r\n\r\nAdds test plan
for diff algorithm for arrays of scalar values\r\nimplemented here:
https://github.com/elastic/kibana/pull/186323\r\n\r\n### For
maintainers\r\n\r\n- [ ] This was checked for breaking API changes and
was
[labeled\r\nappropriately](https://www.elastic.co/guide/en/kibana/master/contributing.html#kibana-release-notes-process)","sha":"0b405a0d656e496775a2c6efa1392ec3ed3fdd4f"}},"sourceBranch":"main","suggestedTargetBranches":["8.15"],"targetPullRequestStates":[{"branch":"8.15","label":"v8.15.0","branchLabelMappingKey":"^v(\\d+).(\\d+).\\d+$","isSourceBranch":false,"state":"NOT_CREATED"},{"branch":"main","label":"v8.16.0","branchLabelMappingKey":"^v8.16.0$","isSourceBranch":true,"state":"MERGED","url":"https://github.com/elastic/kibana/pull/186325","number":186325,"mergeCommit":{"message":"[Security
Solution] Array of scalar values diff algorithm test plan
(#186325)\n\n## Summary\r\n\r\nRelated ticket:
https://github.com/elastic/kibana/issues/180162\r\n\r\nAdds test plan
for diff algorithm for arrays of scalar values\r\nimplemented here:
https://github.com/elastic/kibana/pull/186323\r\n\r\n### For
maintainers\r\n\r\n- [ ] This was checked for breaking API changes and
was
[labeled\r\nappropriately](https://www.elastic.co/guide/en/kibana/master/contributing.html#kibana-release-notes-process)","sha":"0b405a0d656e496775a2c6efa1392ec3ed3fdd4f"}}]}]
BACKPORT-->

Co-authored-by: Davis Plumlee <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport:skip This commit does not require backporting Feature:Prebuilt Detection Rules Security Solution Prebuilt Detection Rules area release_note:skip Skip the PR/issue when compiling release notes Team:Detection Rule Management Security Detection Rule Management Team Team:Detections and Resp Security Detection Response Team Team: SecuritySolution Security Solutions Team working on SIEM, Endpoint, Timeline, Resolver, etc. v8.15.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants