[Docs]Detection exceptions and lists APIs #35

Closed
wants to merge 1 commit into from

@benskelker benskelker commented Jun 24, 2020

Documents detections lists and exceptions APIs.

Exceptions API preview

Lists API preview

Resolves: #41

@benskelker benskelker changed the title [Docs]Exceptions and allowlist API [Docs]Detection rules and endpoint exceptions API Jul 8, 2020
@benskelker benskelker changed the title [Docs]Detection rules and endpoint exceptions API [Docs]Detection exception lists and indices API Jul 9, 2020
@benskelker benskelker marked this pull request as ready for review July 9, 2020 16:31
@benskelker

@FrankHassanabad @yctercero @spong - I'll open a separate PR for endpoint exclusion lists.
Thanks!

@benskelker benskelker changed the title [Docs]Detection exception lists and indices API [Docs]Detection exception and lists APIs Jul 14, 2020
@benskelker benskelker changed the title [Docs]Detection exception and lists APIs [Docs]Detection exceptions and lists APIs Jul 19, 2020
@benskelker benskelker added the v7.9.0 Features in the 7.9 Release label Jul 20, 2020
|`entries` |<<entries-object-schema, entries[]>> |Array containing the
exception queries. Boolean `AND` logic is used to evaluate the relationship
between array elements. If you want to use `OR` logic, create a separate
exception item. |No, defaults to an empty `entries` array.

We'll need to update this: entries can no longer be an empty array or undefined, as an exception item is useless without it.

|==============================================
|Name |Type |Description |Required

|`field` |String |The field used to define the exception. |Yes

Do you think it might be helpful to separate these out more clearly into the allowed fields for each type (match, match_any, etc.)?

Also, just a note that field is now typed as a non-empty string, so it has to have a value. This is also true of value. For match, value must be a non-empty string. For match_any, value must be a non-empty string array. For nested, the entries field must be a non-empty array of entries (excluding list and nested types, i.e. the structure is not recursive).
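
To make the per-type rules above concrete, here is a minimal Python sketch of a client-side check. It is not the Kibana schema: the function name, the error strings, and the sample field names are hypothetical, and the check that each match_any element is itself a non-empty string is an assumption.

```python
from typing import Any


def validate_entry(entry: dict[str, Any], inside_nested: bool = False) -> list[str]:
    """Return the problems found in one exception entry (empty list means valid)."""
    errors: list[str] = []
    field = entry.get("field")
    if not isinstance(field, str) or field == "":
        errors.append("field must be a non-empty string")

    entry_type = entry.get("type")
    if entry_type == "match":
        value = entry.get("value")
        if not isinstance(value, str) or value == "":
            errors.append("match: value must be a non-empty string")
    elif entry_type == "match_any":
        value = entry.get("value")
        # Assumption: every element of the array must itself be a non-empty string.
        if not isinstance(value, list) or not value or not all(
            isinstance(v, str) and v for v in value
        ):
            errors.append("match_any: value must be a non-empty string array")
    elif entry_type == "nested":
        children = entry.get("entries")
        if inside_nested:
            # Nested entries may not contain further nested entries.
            errors.append("nested: entries cannot be recursive")
        elif not isinstance(children, list) or not children:
            errors.append("nested: entries must be a non-empty array")
        else:
            for child in children:
                if child.get("type") == "list":
                    errors.append("nested: entries cannot contain list types")
                else:
                    errors.extend(validate_entry(child, inside_nested=True))
    return errors
```

For example, `validate_entry({"field": "host.name", "type": "match", "value": "web-01"})` returns an empty list, while an entry with an empty field and an empty value reports both problems.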

@benskelker benskelker Aug 2, 2020

@yctercero - There was a formatting issue. Are things clearer now? thanks

|`entries` |<<entries-object-schema, entries[]>> |Array containing the
exception queries. Boolean `AND` logic is used to evaluate the relationship
between array elements. If you want to use `OR` logic, create a separate
exception item. |No, defaults to an empty `entries` array.

Same as with creation, entries can no longer be an empty array or undefined.


|`comments` |comments[] a|Array of `comment` fields:

* `comment` (string): Comments about the exception item.

Comments can currently be updated or added, not deleted. I'm working on a PR that adds a uuid to comments. So on an update, if comments exist, you need to pass in the existing comments as { comment: 'some comment', id: '123' }, and any new comments can be appended to the array as { comment: "new comment" }


We had to go back on this a bit, and comments are now append only (for the time being). So no editing or deleting of comments 😞


The id part still stands though, so comment structure is { comment: '', id: ''}
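
As a rough illustration of the append-only comment structure described in this thread, the following Python sketch builds the comments array for an update request. The helper name is hypothetical; the only shapes taken from the discussion are { comment, id } for existing comments and { comment } for new ones.

```python
def build_comments_update(existing: list[dict], new_texts: list[str]) -> list[dict]:
    """Build the comments array for an exception item update (append only)."""
    # Existing comments are sent back unchanged, together with their id.
    kept = [{"comment": c["comment"], "id": c["id"]} for c in existing]
    # New comments carry no id and are appended after the existing ones.
    added = [{"comment": text} for text in new_texts]
    return kept + added
```

Calling it with one existing comment `{"comment": "some comment", "id": "123"}` and one new text `"new comment"` yields the existing comment followed by `{"comment": "new comment"}`.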

You can then add the exception container to a rule's `exceptions_list` object.

IMPORTANT: Before you can create lists, you must create `.lists` and `.items`
indices for the {kib} space (see <<lists-index-api-overview>>).


I don't know if you want to add a note here, but when users go to the detections page and upload a list, we auto-create the .lists and .items indices for them within the UI.

They only have to create the indices manually, by calling the API, if they do not plan on using the UI.

"type": "ip",
"updated_at": "2020-07-07T03:46:39.853Z",
"updated_by": "Threat Hunter"
}
@FrankHassanabad FrankHassanabad Aug 4, 2020

We have had a few minor changes in the last few weeks where you can optionally submit a _version with the POST call as well as receive one for OCC.

Here is what the new response looks like when using post_list.sh:

{
  "_version": "WzgzLDFd",
  "id": "ip_list",
  "created_at": "2020-08-04T19:24:51.228Z",
  "created_by": "yo",
  "description": "This list describes bad internet ip",
  "immutable": false,
  "name": "Simple list with an ip",
  "tie_breaker_id": "f42ef979-a6bf-447a-8477-ea34df0197a2",
  "type": "ip",
  "updated_at": "2020-08-04T19:24:51.228Z",
  "updated_by": "yo",
  "version": 1
}

The new fields are:

version - A number that is only used for seeing when a list changes
immutable - A boolean that tells you whether the list is modifiable. It is for future use and we do nothing with it right now. Users should ignore it, and it's fine if we don't add it to the documentation at this point
_version - A base64-encoded value of if_seq_no and if_primary_term used for OCC

OCC docs are here:
https://www.elastic.co/guide/en/elasticsearch/reference/current/optimistic-concurrency-control.html

And when the user does a PUT/PATCH (update) of a list, they can optionally send in _version in those requests and will get back an error/rejection if someone else has modified the list.

Ref PR:
elastic/kibana#72730

Example error when it comes back with a conflict:

{
  "message": "[version_conflict_engine_exception] [ip_list]: version conflict, required seqNo [83], primary term [1]. current document has seqNo [84] and primary term [1], with { index_uuid=\"MGa__JJ0SAGBWKW8m0Lbeg\" & shard=\"0\" & index=\".lists-frank-default-000001\" }",
  "status_code": 409
}

List items, exception lists, and exception list items will also have these features.


_version is totally optional btw, so if the user does not submit it with a request then they will not have any OCC to worry about if they don't want to. It will just update the same way.
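
For readers curious what the _version token holds, here is a small Python sketch that decodes it. It assumes the token is base64 over a JSON array of [seq_no, primary_term], which is what the sample value "WzgzLDFd" decodes to (it matches the seqNo [83] and primary term [1] in the conflict error above). The function name is hypothetical, and clients should normally treat the token as opaque and send it back unchanged.

```python
import base64
import json


def decode_version(version: str) -> tuple[int, int]:
    """Decode a _version token into (seq_no, primary_term)."""
    # Assumption: the token is base64 over a two-element JSON array.
    seq_no, primary_term = json.loads(base64.b64decode(version))
    return seq_no, primary_term


print(decode_version("WzgzLDFd"))  # (83, 1)
```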

[[lists-api-import-list-items]]
=== Import list items

Imports a list of items from a `.txt` file.


For the UI, it will import anything from a .txt or .csv file. For the API on the backend, it can technically import from any file type. At the very least, saying .txt or .csv would probably be best, but feel free to add any other words about how the API can actually do more and import from any file if you think that is important enough to call out.


@FrankHassanabad - added csv files. For other file types, what is used as the delimiter between values?

"type": "ip",
"updated_at": "2020-07-07T04:09:55.028Z",
"updated_by": "Threat Hunter"
}


This will technically return the same extra three things as mentioned above:

"_version": "Wzg1LDFd"
"version": 1
"immutable": false

But for the put/post we do not allow the version for OCC to be sent in. I would update the response but don't put any docs for the input being able to have those as that is just an oversight on my part here and a bug.

Once I fix that bug I will ping you and you can then update the docs to include it for the input as well.

"id": "internal-ip-address",
"list_id": "internal-ip-excludes",
"value": "10.0.0.1"
}


What would be important would be to show two examples for the Add and Get around ip and ip-range because the security user is going to ask about both of these concepts.

The differences and why they're important is that if you upload and use a list that consists of only ip then you can actually do a GET utilizing a CIDR.

For example using some of the scripts for illustration and ease:

 ./import_list_items_by_filename.sh ip ./lists/files/ips.txt
{
  "_version": "Wzg2LDFd",
  "id": "ips.txt",
  "created_at": "2020-08-04T20:35:54.416Z",
  "created_by": "yo",
  "description": "File uploaded from file system of ips.txt",
  "immutable": false,
  "name": "ips.txt",
  "tie_breaker_id": "a9c3a101-d121-45a6-815a-07130e115f47",
  "type": "ip",
  "updated_at": "2020-08-04T20:35:54.416Z",
  "updated_by": "yo",
  "version": 1
}

We can see the contents of the list like so:

 ./export_list_items.sh ips.txt
127.0.0.4
127.0.0.9
127.0.0.5
127.0.0.1
127.0.0.3
127.0.0.7
127.0.0.8
127.0.0.6
127.0.0.2

And now I can use a CIDR to query the list and get back an array of values up to a maximum of 10k items. If there are more than 10k items on a CIDR type query then it will cap out at 10k (which is good for us to document). I had to put the 10k cap out with a TODO block that later we have to re-stream more than that :-/ hehe. But you know, for completeness they should get most of what they want from the feature.

Here is a 32 CIDR that is returning only one element which is 127.0.0.1

 ./get_list_item_by_value.sh ips.txt 127.0.0.1/32
[
  {
    "created_at": "2020-08-04T20:35:55.194Z",
    "created_by": "yo",
    "id": "mbwwu3MBM_3LKPKZDKzB",
    "list_id": "ips.txt",
    "tie_breaker_id": "47d3d57b-c5fe-4948-8d73-d147cfcf5c22",
    "type": "ip",
    "updated_at": "2020-08-04T20:35:55.194Z",
    "updated_by": "yo",
    "value": "127.0.0.1"
  }
]

Here is a 30 CIDR which is going to return three things:

 ./get_list_item_by_value.sh ips.txt 127.0.0.1/30
[
  {
    "created_at": "2020-08-04T20:35:55.194Z",
    "created_by": "yo",
    "id": "mbwwu3MBM_3LKPKZDKzB",
    "list_id": "ips.txt",
    "tie_breaker_id": "47d3d57b-c5fe-4948-8d73-d147cfcf5c22",
    "type": "ip",
    "updated_at": "2020-08-04T20:35:55.194Z",
    "updated_by": "yo",
    "value": "127.0.0.1"
  },
  {
    "created_at": "2020-08-04T20:35:55.195Z",
    "created_by": "yo",
    "id": "mrwwu3MBM_3LKPKZDKzB",
    "list_id": "ips.txt",
    "tie_breaker_id": "e77c0764-98b8-4a6a-8a04-bb0bf62ff56c",
    "type": "ip",
    "updated_at": "2020-08-04T20:35:55.195Z",
    "updated_by": "yo",
    "value": "127.0.0.2"
  },
  {
    "created_at": "2020-08-04T20:35:55.195Z",
    "created_by": "yo",
    "id": "m7wwu3MBM_3LKPKZDKzB",
    "list_id": "ips.txt",
    "tie_breaker_id": "4838d3d7-9958-4f97-8833-63fa07aa5a0b",
    "type": "ip",
    "updated_at": "2020-08-04T20:35:55.195Z",
    "updated_by": "yo",
    "value": "127.0.0.3"
  }
]

Why is this part important in relation to the exception list feature? It's important because they can upload a list of plain ip's and then if they have an ECS field or string field which contains a CIDR and are looking within a list it should be able to exclude anything using that CIDR when it finds it inside the list.

ECS does not seem to indicate a CIDR type field yet ... I will admit that:
elastic/ecs#86

But I still think this is important in case they are storing somewhere CIDR strings and using those to go against a list (which they might be in some companies and places)
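
The /32 and /30 results above can be reproduced locally with Python's standard ipaddress module. This is only a sketch of the matching logic, not how the lists plugin queries Elasticsearch; the helper name is hypothetical and the sample values are the nine addresses from ips.txt.

```python
import ipaddress


def items_in_cidr(values: list[str], cidr: str) -> list[str]:
    """Return the plain IP values that fall inside the given CIDR block."""
    # strict=False accepts a CIDR whose host bits are set, such as 127.0.0.1/30.
    network = ipaddress.ip_network(cidr, strict=False)
    return [v for v in values if ipaddress.ip_address(v) in network]


ips = [f"127.0.0.{n}" for n in range(1, 10)]
print(items_in_cidr(ips, "127.0.0.1/32"))  # ['127.0.0.1']
print(items_in_cidr(ips, "127.0.0.1/30"))  # ['127.0.0.1', '127.0.0.2', '127.0.0.3']
```

The /30 block covers 127.0.0.0 through 127.0.0.3, and since 127.0.0.0 is not in the list, three items come back.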


OK, a second example, where I flip this around and show an ip_range list being queried with a plain IP, would look like so:

 ./import_list_items_by_filename.sh ip_range ./lists/files/ip_range.txt
{
  "_version": "Wzg3LDFd",
  "id": "ip_range.txt",
  "created_at": "2020-08-04T20:36:26.406Z",
  "created_by": "yo",
  "description": "File uploaded from file system of ip_range.txt",
  "immutable": false,
  "name": "ip_range.txt",
  "tie_breaker_id": "9de426bf-fdad-461a-881e-c4b09485df71",
  "type": "ip_range",
  "updated_at": "2020-08-04T20:36:26.406Z",
  "updated_by": "yo",
  "version": 1
}

Then we have these values if we do an export:

 ./export_list_items.sh ip_range.txt
192.168.0.5-192.168.0.8
192.168.0.10-192.168.1.20
192.168.1.5-192.168.1.8
192.168.1.0-192.168.1.3
92.168.0.10-192.168.0.20
192.168.0.1-192.168.0.3

Now, what we can do here is the opposite of above, we can query and use plain IP's in fields such as source.ip or destination.ip with the exceptions feature or just plain query it like so:

 ./get_list_item_by_value.sh ip_range.txt 192.168.0.11
[
  {
    "created_at": "2020-08-04T20:36:27.284Z",
    "created_by": "yo",
    "id": "TwEwu3MB7Zv7jX8-iuYZ",
    "list_id": "ip_range.txt",
    "tie_breaker_id": "ad94aba5-4ecc-4038-b967-96f0c84ff170",
    "type": "ip_range",
    "updated_at": "2020-08-04T20:36:27.284Z",
    "updated_by": "yo",
    "value": "92.168.0.10-192.168.0.20"
  },
  {
    "created_at": "2020-08-04T20:36:27.284Z",
    "created_by": "yo",
    "id": "UgEwu3MB7Zv7jX8-iuYZ",
    "list_id": "ip_range.txt",
    "tie_breaker_id": "19b06de1-ce1d-4793-bce3-395c74861683",
    "type": "ip_range",
    "updated_at": "2020-08-04T20:36:27.284Z",
    "updated_by": "yo",
    "value": "192.168.0.10-192.168.1.20"
  }
]

And that IP can show up in two entries/ranges which means we have found it in two different spots and there is some overlap. This has the same 10k limit as the one above it.

These two combined make it easier for users to reserve different blocks of either IPv4 or IPv6 and use CIDR or ranges depending on what their goals are to be able to do exclusions based on their network topology.
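
The overlap case above, where one IP matches two ranges, can be sketched in Python as follows. This is a local illustration only, assuming the dash-separated range format shown in the export; the helper name is hypothetical and the ranges are copied as they appear in ip_range.txt.

```python
import ipaddress


def ranges_containing(ranges: list[str], ip: str) -> list[str]:
    """Return every 'low-high' range string that contains the given IP."""
    target = ipaddress.ip_address(ip)
    hits = []
    for item in ranges:
        low, high = (ipaddress.ip_address(part) for part in item.split("-"))
        if low <= target <= high:
            hits.append(item)
    return hits


exported = [
    "192.168.0.5-192.168.0.8",
    "192.168.0.10-192.168.1.20",
    "192.168.1.5-192.168.1.8",
    "192.168.1.0-192.168.1.3",
    "92.168.0.10-192.168.0.20",
    "192.168.0.1-192.168.0.3",
]
print(ranges_containing(exported, "192.168.0.11"))
```

Two ranges contain 192.168.0.11, which is the same pair the GET call returned.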

"updated_at": "2020-07-07T04:10:26.733Z",
"updated_by": "Threat Hunter",
"value": "10.0.0.1"
}


So a GET won't return the _version where POST will return the _version.

Example:

 ./post_list_item.sh
{
  "_version": "WzUyMjMwMTUsMV0=",
  "id": "ip_item",
  "type": "ip",
  "value": "127.0.0.1",
  "created_at": "2020-08-04T21:04:11.940Z",
  "created_by": "yo",
  "list_id": "ip_list",
  "tie_breaker_id": "ce78d9d0-3dfe-49db-8718-31fe3182cd71",
  "updated_at": "2020-08-04T21:04:11.940Z",
  "updated_by": "yo"
}

compared to GET:

 ./get_list_item_by_value.sh ips.txt 127.0.0.1
[
  {
    "created_at": "2020-08-04T20:35:55.194Z",
    "created_by": "yo",
    "id": "mbwwu3MBM_3LKPKZDKzB",
    "list_id": "ips.txt",
    "tie_breaker_id": "47d3d57b-c5fe-4948-8d73-d147cfcf5c22",
    "type": "ip",
    "updated_at": "2020-08-04T20:35:55.194Z",
    "updated_by": "yo",
    "value": "127.0.0.1"
  }
]

I would document things accurately here, and later, when I fix bugs or expose missing pieces, I will ping you to update the docs.

However, as I illustrated in the previous examples, I think it's important that we show that the GET will return an array of items (up to a 10k limit). The reasoning is that you can have a list of IPs and query it with a CIDR, or you can have a list of ip_range values and the IP you query for can intersect with multiple ranges.


The URL query must include the list container's `id`:

`id` - `GET /api/lists/items/_export?list_id=<id>`


Is this right? Did you want POST /api/lists/items like you have below and suggest in the docs? It looks like it should be a POST here and not a GET

==== Response code

`200`::
Indicates a successful call.


You might want to mention that if you delete a container then it will in turn delete all of the container's items as well.

// KIBANA

==== Response code


Technically this will return the body of the thing it just deleted. Not all our REST endpoints do from Kibana but I typically do within the ones I write so the user knows not just that they deleted something but that they did delete the item they were expecting. Optional if you want to show that or not.

==== Response code

`200`::
Indicates a successful call.
@FrankHassanabad FrankHassanabad Aug 4, 2020

One other feature not shown here within the API is a delete by value capability which works well with IP ranges and regular IP's.

For example if I import a set of ip's like I show above with an import I can actually delete a set of them using a CIDR query:

 ./delete_list_item_by_value.sh ips.txt 127.0.0.1/16
[
  {
    "created_at": "2020-08-04T20:35:55.194Z",
    "created_by": "yo",
    "id": "mbwwu3MBM_3LKPKZDKzB",
    "list_id": "ips.txt",
    "tie_breaker_id": "47d3d57b-c5fe-4948-8d73-d147cfcf5c22",
    "type": "ip",
    "updated_at": "2020-08-04T20:35:55.194Z",
    "updated_by": "yo",
    "value": "127.0.0.1"
  },
  {
    "created_at": "2020-08-04T20:35:55.195Z",
    "created_by": "yo",
    "id": "mrwwu3MBM_3LKPKZDKzB",
    "list_id": "ips.txt",
    "tie_breaker_id": "e77c0764-98b8-4a6a-8a04-bb0bf62ff56c",
    "type": "ip",
    "updated_at": "2020-08-04T20:35:55.195Z",
    "updated_by": "yo",
    "value": "127.0.0.2"
  },
  {
    "created_at": "2020-08-04T20:35:55.195Z",
    "created_by": "yo",
    "id": "m7wwu3MBM_3LKPKZDKzB",
    "list_id": "ips.txt",
    "tie_breaker_id": "4838d3d7-9958-4f97-8833-63fa07aa5a0b",
    "type": "ip",
    "updated_at": "2020-08-04T20:35:55.195Z",
    "updated_by": "yo",
    "value": "127.0.0.3"
  },
  {
    "created_at": "2020-08-04T20:35:55.195Z",
    "created_by": "yo",
    "id": "nLwwu3MBM_3LKPKZDKzB",
    "list_id": "ips.txt",
    "tie_breaker_id": "04e46f26-56dc-4d6a-8d65-e40725078183",
    "type": "ip",
    "updated_at": "2020-08-04T20:35:55.195Z",
    "updated_by": "yo",
    "value": "127.0.0.4"
  },
  {
    "created_at": "2020-08-04T20:35:55.195Z",
    "created_by": "yo",
    "id": "nbwwu3MBM_3LKPKZDKzB",
    "list_id": "ips.txt",
    "tie_breaker_id": "40f39f31-5c54-4e71-98ef-c2808836bc1b",
    "type": "ip",
    "updated_at": "2020-08-04T20:35:55.195Z",
    "updated_by": "yo",
    "value": "127.0.0.5"
  },
  {
    "created_at": "2020-08-04T20:35:55.195Z",
    "created_by": "yo",
    "id": "nrwwu3MBM_3LKPKZDKzB",
    "list_id": "ips.txt",
    "tie_breaker_id": "e2ff48c6-2c37-4d5a-ae19-84b8362b32bb",
    "type": "ip",
    "updated_at": "2020-08-04T20:35:55.195Z",
    "updated_by": "yo",
    "value": "127.0.0.6"
  },
  {
    "created_at": "2020-08-04T20:35:55.195Z",
    "created_by": "yo",
    "id": "n7wwu3MBM_3LKPKZDKzB",
    "list_id": "ips.txt",
    "tie_breaker_id": "4dbaa815-2352-4cc2-bd50-1a5211d973e9",
    "type": "ip",
    "updated_at": "2020-08-04T20:35:55.195Z",
    "updated_by": "yo",
    "value": "127.0.0.7"
  },
  {
    "created_at": "2020-08-04T20:35:55.195Z",
    "created_by": "yo",
    "id": "oLwwu3MBM_3LKPKZDKzB",
    "list_id": "ips.txt",
    "tie_breaker_id": "4e5ff207-9639-494b-8446-44629221c2e6",
    "type": "ip",
    "updated_at": "2020-08-04T20:35:55.195Z",
    "updated_by": "yo",
    "value": "127.0.0.8"
  },
  {
    "created_at": "2020-08-04T20:35:55.195Z",
    "created_by": "yo",
    "id": "obwwu3MBM_3LKPKZDKzB",
    "list_id": "ips.txt",
    "tie_breaker_id": "3db33bb3-e726-4bec-9c1f-ed56900e92cc",
    "type": "ip",
    "updated_at": "2020-08-04T20:35:55.195Z",
    "updated_by": "yo",
    "value": "127.0.0.9"
  }
]

The REST call is:

/api/lists/items?list_id=ips.txt&value=127.0.0.1/16

This enables the users of lists to quickly add different values and then delete those values using CIDR.
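
If you build that delete-by-value request in code rather than by hand, a minimal Python sketch could look like this. The helper name is hypothetical, the HTTP method is presumably DELETE (going by the script name), and percent-encoding the slash in the CIDR is an assumption about what a standard URL parser on the server side would accept.

```python
from urllib.parse import urlencode


def delete_by_value_path(list_id: str, value: str) -> str:
    """Build the path and query string for deleting list items by value."""
    # urlencode percent-encodes reserved characters, so "/" becomes "%2F".
    query = urlencode({"list_id": list_id, "value": value})
    return f"/api/lists/items?{query}"


print(delete_by_value_path("ips.txt", "127.0.0.1/16"))
# /api/lists/items?list_id=ips.txt&value=127.0.0.1%2F16
```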

However, for some list types such as ip_range you currently cannot delete by query as you can see below as they are more problematic. The "non range" data types you can. We might support deleting when you have ranges better in the future but this is where we are at the moment with limitations and bugs.

 ./delete_list_item_by_value.sh ip_range.txt 192.168.0.6
{
  "message": "[query_shard_exception] failed to create query: '192.168.0.5-192.168.0.8' is not an IP string literal., with { index_uuid=\"D41WP1pwRa2INSafOfZ0Pw\" & index=\".items-frank-default-000001\" }",
  "status_code": 400
}
 ./delete_list_item_by_value.sh ip_range.txt 192.168.0.5-192.168.0.8
{
  "message": "[query_shard_exception] failed to create query: '192.168.0.5-192.168.0.8' is not an IP string literal., with { index_uuid=\"D41WP1pwRa2INSafOfZ0Pw\" & index=\".items-frank-default-000001\" }",
  "status_code": 400
}

of:

* *List containers*: A container for values of the same {es}
{ref}/mapping-types.html[data type] (such as, `IP`, `keyword`, and `text`).
@FrankHassanabad FrankHassanabad Aug 4, 2020

I would be careful about putting IP in caps, as the REST API takes it in lowercase. The main types we are showing in the UI at the moment are listed below. This exact text has to be used within the API, all lowercase:

ip
ip_range
keyword
text

We technically support a few others within the API, outside of the UI:

boolean
byte
date
date_nanos
date_range
double
double_range
float
float_range
half_float
integer
integer_range
long
long_range
short

Some other data types are not supported well enough to mention, and others are not supported at all, as they don't make sense from a list use case perspective.

"page": 1,
"per_page": 2,
"total": 3
}


These will have the newer fields of _version, version, immutable and the same rules apply as before which is they can be used for OCC and the immutable is not being utilized at the moment but in the future it might be for immutable lists and pre-packaged lists.

   {
      "_version": "Wzg2LDFd",
      "id": "ips.txt",
      "created_at": "2020-08-04T20:35:54.416Z",
      "created_by": "yo",
      "description": "File uploaded from file system of ips.txt",
      "immutable": false,
      "name": "ips.txt",
      "tie_breaker_id": "a9c3a101-d121-45a6-815a-07130e115f47",
      "type": "ip",
      "updated_at": "2020-08-04T20:35:54.416Z",
      "updated_by": "yo",
      "version": 1
    },

"page": 1,
"per_page": 20,
"total": 2
}


This has just the additional field of _version for items for OCC

 ./find_list_items.sh
{
  "cursor": "WzIwLFsiY2U3OGQ5ZDAtM2RmZS00OWRiLTg3MTgtMzFmZTMxODJjZDcxIl1d",
  "data": [
    {
      "_version": "WzUyMjMwMTUsMV0=",
      "created_at": "2020-08-04T21:04:11.940Z",
      "created_by": "yo",
      "id": "ip_item",
      "list_id": "ip_list",
      "tie_breaker_id": "ce78d9d0-3dfe-49db-8718-31fe3182cd71",
      "type": "ip",
      "updated_at": "2020-08-04T21:04:11.940Z",
      "updated_by": "yo",
      "value": "127.0.0.1"
    }
  ],
  "page": 1,
  "per_page": 20,
  "total": 1
}

and it makes sense as we keep the immutable and version at the top level list level but do not update the list items with that information. Also, when we update a list item we do not update the version number at the list item level but only if the list top level changes at the moment.

}
--------------------------------------------------

<1> The returned `id` is required to associate the exception container with

We updated this so that what is required to associate an exception list with a rule is id, list_id, namespace_type, and type. We have plans to make id optional in the future, but it's required for now. The reason for the planned change is that users often import/export rules, and using list_id will allow them to easily exercise that functionality without generating a new list every time, since id has to be unique.


|`comments` |comments[] a|Array of `comment` fields:

* `comment` (string): Comments about the exception item.

Not sure how much detail is needed, but for now comments are append only. Working on full CRUD for next release.

|No, defaults to empty array.

|`description` |String |Describes the exception item. |Yes
|`entries` |<<entries-object-schema, entries[]>> |Array containing the

Not sure if I had mentioned it before in a comment, but entries are now required and cannot be empty.

|`tags` |String[] |String array containing words and phrases to help categorize
exception items. |No
|`type` |String a|Exception query type, must be `simple`. |Yes
|`_tags` |String[] a|For endpoint rules only, defines the OS on which the

@peluja1012 @madirey @marshallmain @FrankHassanabad is this a field we want to expose to users? Usually the _ fields are just internal, right? Would we want users to use the tags instead as _tags is more for us?


Good point! Not good if the user is able to override _tags ...


BUT that's the only way for them to target an OS for endpoint exceptions...


Should we have them use tags and it gets copied over to _tags in the background. Because if we start relying on things to be in _tags and the user has access then _tags becomes totally unreliable, no?


@yctercero I think at minimum, we need some additional validation on _tags at the API level... I think 1) requiring either endpoint or detection as one of the tags, and 2) if endpoint, then at least 1 accompanying os:<os-name> tag ...
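
The two validation rules proposed here could be sketched in Python roughly as below. This only illustrates the proposal in this comment, not behavior the API is known to have; the function name, the error strings, and the sample tag os:windows are hypothetical.

```python
def validate_internal_tags(tags: list[str]) -> list[str]:
    """Check _tags against the two rules proposed in this thread."""
    errors = []
    kinds = {"endpoint", "detection"} & set(tags)
    # Rule 1: one of the tags must be "endpoint" or "detection".
    if not kinds:
        errors.append("_tags must include 'endpoint' or 'detection'")
    # Rule 2: endpoint exceptions need at least one os:<os-name> tag.
    has_os = any(t.startswith("os:") and len(t) > len("os:") for t in tags)
    if "endpoint" in kinds and not has_os:
        errors.append("endpoint exceptions need at least one os:<os-name> tag")
    return errors
```

With these rules, `["endpoint", "os:windows"]` and `["detection"]` pass, while `["endpoint"]` alone is rejected for the missing OS tag.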


It's a separate issue, but if we want to prevent the user from accessing _tags directly, it seems like we'd just want to add separate fields for 1) OS and for 2) exception type ... it's a little awkward to have these as part of _tags from an API usability standpoint, but it will work a lot better if we add the validation.


That makes sense. I like your suggestion of adding those as separate fields. Are we prevented adding that in a next release if we leave it as is for now?


It would require an SO migration (which I think we've determined we can do, and will need to do soon anyway) along with code changes. Depending on how we want to hook it up, it may only require 1 new field ... we could, for example, assume that the existence of any OS tag makes it an endpoint exception. We'd need to talk through the different scenarios... @peluja1012 @spong

|==============================================
|Name |Type |Description |Required

|`field` |String |The source event field used to define the exception. |Yes

Just a note - cannot be empty string.

|Name |Type |Description |Required

|`field` |String |The source event field used to define the exception. |Yes
|`list` |list |Object containing the

Not sure where to best describe this - but for now list entries cannot be combined with non-list entries within a single exception item.


|Yes

|`value` |String[] |Array of field values. |Yes, except when `type` is `exists`.
@yctercero yctercero Aug 5, 2020

I see the examples are all 👍, but I wasn't sure whether it was clear here that the following is true:

For match, value is string.

For match_any, value is string[]

For type: list, value is not present, instead of value it's list: { id:'', type:''}

rules. The container can then be associated with all the relevant rules.
* *Exception items*: The query (fields, values, and logic) used to prevent
rules from generating alerts. When an exception item's query evaluates to
`true`, the rule does not generate an alert.

Not sure if you want to repeat the explanation of items getting OR here or not. I know you had it up above, but it was such a great explanation, figured it might be useful here?


@yctercero - hopefully the diagram under this makes things clear. Can you take a look at the preview and let me know: https://security-docs_35.docs-preview.app.elstc.co/guide/en/security/master/exceptions-api-overview.html

as IP addresses, which are used to determine when an exception prevents an
alert from being generated.

IMPORTANT: You cannot use lists with endpoint rule exceptions.

You also can't use operator: "excluded" with endpoint exceptions, only operator: "included"

FrankHassanabad commented Aug 5, 2020

@benskelker, for documenting the named capture groups and mustache-style handles for lists: when you POST a list container, you can optionally define "serializer" and/or "deserializer".

you can see examples here:

https://github.com/elastic/kibana/blob/master/x-pack/plugins/lists/server/scripts/lists/new/lists/ip_custom_format.json

https://github.com/elastic/kibana/blob/master/x-pack/plugins/lists/server/scripts/lists/new/lists/date_range_custom_format.json

https://github.com/elastic/kibana/blob/master/x-pack/plugins/lists/server/scripts/lists/new/lists/keyword_custom_format.json

All single values such as ip, long, date, keyword, text use this named capture group regular expression for their serializer if not defined:

(?<value>.+)

And this mustache handle for their deserializer:

{{{value}}}

ip_range, double_range, float_range, integer_range, and long_range use this named capture group regular expression for their default serializer if not defined:

(?<gte>.+)-(?<lte>.+)|(?<value>.+)

And this mustache handle for their deserializer if not defined:

{{{gte}}}-{{{lte}}}

date_range uses this named capture group regular expression serializer if not defined:

(?<gte>.+),(?<lte>.+)|(?<value>.+)

And date_range uses this mustache handle for its deserializer if not defined:

{{{gte}}},{{{lte}}}
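
To summarize the defaults listed above in one place, here is a small lookup sketch. The patterns and templates are copied from this comment; the `defaultFormat` function and the grouping constants are hypothetical names, not part of the lists plugin.

```typescript
interface ListFormat {
  serializer: string;   // regular expression with named capture groups
  deserializer: string; // mustache-style template
}

const SINGLE_VALUE_TYPES = ["ip", "long", "date", "keyword", "text"];
const DASH_RANGE_TYPES = ["ip_range", "double_range", "float_range", "integer_range", "long_range"];

// Returns the default format described in the comment above, or undefined
// for a type this sketch does not cover.
function defaultFormat(type: string): ListFormat | undefined {
  if (SINGLE_VALUE_TYPES.indexOf(type) !== -1) {
    return { serializer: "(?<value>.+)", deserializer: "{{{value}}}" };
  }
  if (DASH_RANGE_TYPES.indexOf(type) !== -1) {
    return { serializer: "(?<gte>.+)-(?<lte>.+)|(?<value>.+)", deserializer: "{{{gte}}}-{{{lte}}}" };
  }
  if (type === "date_range") {
    return { serializer: "(?<gte>.+),(?<lte>.+)|(?<value>.+)", deserializer: "{{{gte}}},{{{lte}}}" };
  }
  return undefined;
}
```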

Named capture groups are documented here:
https://developer.mozilla.org/en-US/docs/Web/JavaScript/Guide/Regular_Expressions/Groups_and_Ranges

And mustache handles here:
https://handlebarsjs.com/guide/expressions.html#basic-usage

So how does this work?

For single values such as ip, long, date, keyword, and text, the default is
(?<value>.+), which means that when importing a list, the regular expression .+ captures the entire line and stores it in "value". A custom serializer for these types must also name its capture group value.

If a user wanted to only import ipv4 as a keyword they would define their serializer like so when posting:

"serializer": "(?<value>((25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?).){3}(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?))"

But what about when they export, or do a get or a find call? That's what the handlebars template "{{{value}}}" is for. If they wanted to prepend the text "Hello" to each exported value for some reason, it would be this:

 "deserializer": "Hello {{value}}"

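A quick sketch of how a custom single-value serializer and deserializer behave. One assumption to flag: the IPv4 pattern as printed on this page has unescaped dots, which match any character; the page rendering may have dropped backslashes, so the `escaped` variant below (with `\\.`) is a guess at the intent, not a quote from the Kibana example file. The helper name is hypothetical.

```typescript
// Pattern exactly as it appears in this thread (dots unescaped).
const asPrinted = new RegExp(
  "(?<value>((25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?).){3}(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?))"
);
// Assumed intent: literal dots between octets.
const escaped = new RegExp(
  "(?<value>((25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\\.){3}(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?))"
);

// Applies a serializer to one imported line and returns the captured "value", if any.
function extractValue(serializer: RegExp, line: string): string | undefined {
  return serializer.exec(line)?.groups?.value;
}

// The "Hello {{value}}" deserializer from above, applied with a plain string replace.
function helloExport(value: string): string {
  return "Hello {{value}}".replace("{{value}}", () => value);
}
```

With the escaped pattern, `extractValue(escaped, "host 10.1.2.3 seen")` picks out only the address, while the pattern as printed also accepts a line such as `1a2b3c4`.
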
OK, so what about ranges such as date_range or ip_range? What if the ranges in their files are separated not by dashes but by a slash? Then they would post this:

"serializer": "(?<gte>.+)/(?<lte>.+)"

This uses gte and lte as the capture group names. What if the IP ranges in their file are flipped, so that gte is on the right? Then they flip the groups like so:

"serializer": "(?<lte>.+)/(?<gte>.+)"

What if they want to deserialize it with the slash as well? Then they do this, using gte and lte as their handlebars variables:

"deserializer": "{{gte}}/{{lte}}"

But what if the user has a mix of ranges and non-ranges in a list, with ranges separated by a dash? Then they use an OR pipe "|" as shown below, with an additional named capture group called value.

(?<gte>.+)-(?<lte>.+)|(?<value>.+)

That will first try to match a line of the form 192.168.0.1-192.168.0.2. If it cannot match the line because, say, it encounters just 192.168.0.1, then it will match just 192.168.0.1 and store it as the range 192.168.0.1-192.168.0.1.
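
The range serializers discussed so far (slash-separated, flipped, and the mixed default) can be sketched with one small parser. This is only an approximation of the described behaviour: `parseRange` and `ParsedRange` are hypothetical names, and CIDR handling is deliberately not modelled here.

```typescript
interface ParsedRange {
  gte: string;
  lte: string;
}

// Applies a serializer (a regular expression with named groups gte/lte and,
// optionally, value) to one line of an imported file.
function parseRange(serializer: string, line: string): ParsedRange | undefined {
  const groups = new RegExp(serializer).exec(line)?.groups;
  if (!groups) {
    return undefined;
  }
  if (groups.gte !== undefined && groups.lte !== undefined) {
    return { gte: groups.gte, lte: groups.lte };
  }
  if (groups.value !== undefined) {
    // A single value becomes a range whose two ends are the same.
    return { gte: groups.value, lte: groups.value };
  }
  return undefined;
}

const SLASH = "(?<gte>.+)/(?<lte>.+)";
const SLASH_FLIPPED = "(?<lte>.+)/(?<gte>.+)";
const MIXED_DEFAULT = "(?<gte>.+)-(?<lte>.+)|(?<value>.+)";
```

For example, `parseRange(SLASH_FLIPPED, "10.0.0.9/10.0.0.1")` assigns the left side to `lte` and the right side to `gte`, and `parseRange(MIXED_DEFAULT, "192.168.0.1")` falls through to the `value` group.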

Now, if that doesn't blow your mind and maybe raise a few questions, the final part you should document for ip_range is that you can import something like this:

 cat ./lists/files/ip_range_mixed.txt
127.0.0.1
192.168.0.1-192.168.0.3
192.168.0.1/16
192.168.1.0/16
192.168.2.0/16

 ./import_list_items_by_filename.sh ip_range ./lists/files/ip_range_mixed.txt
{
  "_version": "Wzg4LDFd",
  "id": "ip_range_mixed.txt",
  "created_at": "2020-08-05T23:45:30.732Z",
  "created_by": "yo",
  "description": "File uploaded from file system of ip_range_mixed.txt",
  "immutable": false,
  "name": "ip_range_mixed.txt",
  "tie_breaker_id": "90ab9e71-832a-429d-bc9d-2f1b8dd2d082",
  "type": "ip_range",
  "updated_at": "2020-08-05T23:45:30.732Z",
  "updated_by": "yo",
  "version": 1
}

 ./export_list_items.sh ip_range_mixed.txt
192.168.1.0/16
192.168.0.1-192.168.0.3
192.168.2.0/16
192.168.0.1/16
127.0.0.1-127.0.0.1

Let's look at this one a bit more carefully.

Our default named capture group serializer is:

(?<gte>.+)-(?<lte>.+)|(?<value>.+)

since we imported the list container using defaults.

The original input file is mixed with 5 different line items:

127.0.0.1
192.168.0.1-192.168.0.3
192.168.0.1/16
192.168.1.0/16
192.168.2.0/16

First line 127.0.0.1 is just a single value. That will match this part of the serializer: (?<value>.+) and store:

127.0.0.1-127.0.0.1

The second line:

192.168.0.1-192.168.0.3

Will match this part of the serializer: (?<gte>.+)-(?<lte>.+) and store it as:

192.168.0.1-192.168.0.3

The third, fourth, and fifth lines:

192.168.0.1/16
192.168.1.0/16
192.168.2.0/16

Will match this part of the serializer (?<value>.+) and store them as CIDR values:

192.168.0.1/16
192.168.1.0/16
192.168.2.0/16

Because yes IP ranges can be stored as CIDR, https://www.elastic.co/guide/en/elasticsearch/reference/current/range.html#ip-range

When we deserialize them back out, our default deserializer is:

{{{gte}}}-{{{lte}}}

But if it detects a single value such as a CIDR it will fall back on not using a dash.
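
The export behaviour described above can be approximated as follows. Note the assumptions: the `render` function is a tiny stand-in for a mustache renderer (it only substitutes `{{name}}` and `{{{name}}}`), and modelling a stored item as either `{ gte, lte }` or `{ value }` is a guess at the internal representation, made only so the fallback can be shown.

```typescript
// Hypothetical stored shapes: an explicit range, or a single value such as a CIDR.
type StoredItem = { gte: string; lte: string } | { value: string };

// Minimal template substitution for {{name}} and {{{name}}}; not a full mustache engine.
function render(template: string, vars: Record<string, string>): string {
  return template.replace(/\{\{\{?\s*(\w+)\s*\}?\}\}/g, (_match: string, name: string) =>
    name in vars ? vars[name] : ""
  );
}

// Ranges go through the deserializer template; single values are written out unchanged.
function exportLine(item: StoredItem, deserializer: string = "{{{gte}}}-{{{lte}}}"): string {
  if ("value" in item) {
    return item.value;
  }
  return render(deserializer, { gte: item.gte, lte: item.lte });
}
```

Under these assumptions, the 127.0.0.1 line from the example round-trips as a range with identical ends, while the CIDR lines come back without a dash.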

You can see the same effect with a find:

 ./find_list_items.sh ip_range_mixed.txt
{
  "cursor": "WzIwLFsiZjg5M2Y3OWItOTdmYi00MTQ2LWI5YmQtMDJhZTVjMjA3OGZjIl1d",
  "data": [
    {
      "_version": "WzUyMjMwMjgsMV0=",
      "created_at": "2020-08-05T23:45:31.371Z",
      "created_by": "yo",
      "id": "nBgDwXMB7Zv7jX8-_kDr",
      "list_id": "ip_range_mixed.txt",
      "tie_breaker_id": "41a1b81c-307f-4011-b8cf-c6c4654e86c3",
      "type": "ip_range",
      "updated_at": "2020-08-05T23:45:31.371Z",
      "updated_by": "yo",
      "value": "192.168.1.0/16"
    },
    {
      "_version": "WzUyMjMwMjYsMV0=",
      "created_at": "2020-08-05T23:45:31.371Z",
      "created_by": "yo",
      "id": "mhgDwXMB7Zv7jX8-_kDr",
      "list_id": "ip_range_mixed.txt",
      "tie_breaker_id": "97aa9743-3d1b-43d1-8af3-a380b844ba0a",
      "type": "ip_range",
      "updated_at": "2020-08-05T23:45:31.371Z",
      "updated_by": "yo",
      "value": "192.168.0.1-192.168.0.3"
    },
    {
      "_version": "WzUyMjMwMjksMV0=",
      "created_at": "2020-08-05T23:45:31.371Z",
      "created_by": "yo",
      "id": "nRgDwXMB7Zv7jX8-_kDr",
      "list_id": "ip_range_mixed.txt",
      "tie_breaker_id": "aa16d0c7-ec67-4a53-85b7-bd297741375e",
      "type": "ip_range",
      "updated_at": "2020-08-05T23:45:31.371Z",
      "updated_by": "yo",
      "value": "192.168.2.0/16"
    },
    {
      "_version": "WzUyMjMwMjcsMV0=",
      "created_at": "2020-08-05T23:45:31.371Z",
      "created_by": "yo",
      "id": "mxgDwXMB7Zv7jX8-_kDr",
      "list_id": "ip_range_mixed.txt",
      "tie_breaker_id": "b76531fb-c113-47f5-a9b7-ad5dcceedf87",
      "type": "ip_range",
      "updated_at": "2020-08-05T23:45:31.371Z",
      "updated_by": "yo",
      "value": "192.168.0.1/16"
    },
    {
      "_version": "WzUyMjMwMjUsMV0=",
      "created_at": "2020-08-05T23:45:31.370Z",
      "created_by": "yo",
      "id": "mRgDwXMB7Zv7jX8-_kDr",
      "list_id": "ip_range_mixed.txt",
      "tie_breaker_id": "f893f79b-97fb-4146-b9bd-02ae5c2078fc",
      "type": "ip_range",
      "updated_at": "2020-08-05T23:45:31.370Z",
      "updated_by": "yo",
      "value": "127.0.0.1-127.0.0.1"
    }
  ],
  "page": 1,
  "per_page": 20,
  "total": 5
}

OK, so if you set a custom serializer and deserializer, does that change your REST output? Yes, it does. If you set them both, then the response includes the new serializer and deserializer properties:

 ./post_list.sh ./lists/new/lists/keyword_custom_format.json
{
  "_version": "Wzg5LDFd",
  "id": "keyword_custom_format_list",
  "created_at": "2020-08-05T23:59:52.857Z",
  "created_by": "yo",
  "description": "This parses the first found ipv4 only",
  "deserializer": "{{value}}",
  "immutable": false,
  "name": "Simple list with a keyword using a custom format",
  "serializer": "(?<value>((25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?).){3}(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?))",
  "tie_breaker_id": "f9132af0-d734-4272-a789-b3124f4a78cf",
  "type": "keyword",
  "updated_at": "2020-08-05T23:59:52.857Z",
  "updated_by": "yo",
  "version": 1
}

If you do a get you will see the two extra key/values in the container of serializer and deserializer:

 ./get_list.sh keyword_custom_format_list
{
  "_version": "Wzg5LDFd",
  "id": "keyword_custom_format_list",
  "created_at": "2020-08-05T23:59:52.857Z",
  "created_by": "yo",
  "description": "This parses the first found ipv4 only",
  "deserializer": "{{value}}",
  "immutable": false,
  "name": "Simple list with a keyword using a custom format",
  "serializer": "(?<value>((25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?).){3}(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?))",
  "tie_breaker_id": "f9132af0-d734-4272-a789-b3124f4a78cf",
  "type": "keyword",
  "updated_at": "2020-08-05T23:59:52.857Z",
  "updated_by": "yo",
  "version": 1
}

If neither is defined, you will not see them in the REST response for the container.
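
For client code, this suggests treating both properties as optional on the list container. A minimal sketch, using only field names that appear in the responses above; the interface and function names are hypothetical and the interface is not the full schema.

```typescript
// Partial, hypothetical view of a list container response.
interface ListContainer {
  id: string;
  name: string;
  description: string;
  type: string;
  immutable: boolean;
  version: number;
  serializer?: string;   // present only when a custom serializer was posted
  deserializer?: string; // present only when a custom deserializer was posted
}

// True when the container was created with a custom serializer and/or deserializer.
function usesCustomFormat(list: ListContainer): boolean {
  return list.serializer !== undefined || list.deserializer !== undefined;
}
```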

created when it is not provided.
|`meta` |Object |Placeholder for metadata about the exception item. |No
|`name` |String |The exception item's name. |Yes.
|`namespace_type` |String a|Determines whether the exception item is available
Contributor

Wondering if we should specify that it should match the parent? @FrankHassanabad, is it the case that the item's namespace_type should match the parent's?

@yctercero (Contributor) left a comment

LGTM - I just focused on the exceptions API portions and left one minor comment. Thanks so much!

@FrankHassanabad left a comment

LGTM 👍 ! Nice and super awesome to always do doc reviews. Love seeing all the time and energy to condense and keep it so simple to read and understand (unlike my code sometimes! :-))

Seeing docs and the time taken for them always makes me feel like I am writing good quality stuff as I can glance at them and understand the concepts very very quickly.

author Ben Skelker <[email protected]> 1593006652 +0300
committer Ben Skelker <[email protected]> 1597251767 +0300

starts lists api
@benskelker (Contributor Author)

Closing this and adding the docs with #108

@benskelker benskelker closed this Aug 12, 2020
@benskelker benskelker deleted the detection-lists branch August 12, 2020 18:53
Labels
v7.9.0 Features in the 7.9 Release
Linked issue: Document exceptions and lists APIs
6 participants