Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Enhanced TLS certificate metadata #64

Closed
andrewvc opened this issue Aug 1, 2018 · 15 comments
Closed

Enhanced TLS certificate metadata #64

andrewvc opened this issue Aug 1, 2018 · 15 comments

Comments

@andrewvc
Copy link

andrewvc commented Aug 1, 2018

The current specification only declares a handful of TLS certificate fields. Additionally, these fields are centered around an active connection, not so much around TLS metadata. For instance tls.servername specifies that it be the servername requested by the client. If the cert is a wildcard cert, there is no place in the schema for that.

There is a certificates field, but that is somewhat under-defined. Should the certs be x.509 PEM or DER encoded? I think that description needs to be tightened up as well.

I propose that we the full list of x.509 fields to ECS, and make it clear that tls.* should be exclusively what's in the certificate(s), not the context around the given connection.

Furthermore, I think we should remove the tls.servername field, destination.hostname should suffice for use in those situations.

@bndabbs
Copy link

bndabbs commented Aug 1, 2018

The fields from Bro's SSL log might be a good starting point:

https://www.bro.org/sphinx/scripts/base/protocols/ssl/main.bro.html#id2

@andrewvc
Copy link
Author

andrewvc commented Aug 1, 2018

Thanks for the link @bndabbs . Thinking about it further, I think what we possibly really need is an x509 type to describe the certificate.

These things are commonly conflated, but I think it makes sense to keep them separate. Ideally tls.certificates would be a list of x509 fields, instead of just the vague keyword blob it currently is.

So, the schema would be more like:

tls: {
  // connection stuff
  certificates: {
    // x509 fields
    raw: "---- BEGIN CERTIFICATE----", // PEM encoded x509 cert.
  }
}

@bndabbs
Copy link

bndabbs commented Aug 1, 2018

I agree with them being different, but related.

There is some overlap here, but Bro break the session information down in the SSL log and then puts the actual x509 file information into the x509 log. If we want to eventually have the majority of the Bro fields mapped to ECS, we will need to account for both of these.

Here is an example of the information written to each log for a single connection:

"ssl": {
  "established": true,
  "cipher": "TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384",
  "server_name": "sa.scorecardresearch.com",
  "client_cert_chain_fuids": [],
  "curve": "secp256r1",
  "subject": "CN=*.scorecardresearch.com,OU=COMODO SSL Wildcard,OU=Domain Control Validated",
  "cert_chain_fuids": [
    "FL0gSF4BtUVC67r6xb",
    "FZIS8w319zmNuIacH",
    "FMYaSdTqvds74iNC5",
    "FDCjVu160RpzNmAaX2"
  ],
  "next_protocol": "http/1.1",
  "validation_status": "ok",
  "resumed": false,
  "version": "TLSv12",
  "issuer": "CN=COMODO RSA Domain Validation Secure Server CA,O=COMODO CA Limited,L=Salford,ST=Greater Manchester,C=GB"
}
"x509": {
  "certificate_version": 3,
  "certificate_subject": "CN=*.scorecardresearch.com,OU=COMODO SSL Wildcard,OU=Domain Control Validated",
  "san_dns": [
    "*.scorecardresearch.com",
    "scorecardresearch.com"
  ],
  "certificate_sig_alg": "sha256WithRSAEncryption",
  "certificate_key_type": "rsa",
  "certificate_key_length": 2048,
  "certificate_not_valid_before": "2017-11-27T00:00:00.000000Z",
  "certificate_key_alg": "rsaEncryption",
  "certificate_serial": "816E2422058E2390DF7CE5C6FA282C26",
  "certificate_exponent": "65537",
  "basic_constraints_ca": false,
  "fuid": "FL0gSF4BtUVC67r6xb",
  "certificate_not_valid_after": "2018-12-20T23:59:59.000000Z",
  "certificate_issuer": "CN=COMODO RSA Domain Validation Secure Server CA,O=COMODO CA Limited,L=Salford,ST=Greater Manchester,C=GB"
}

@bndabbs
Copy link

bndabbs commented Aug 1, 2018

I do think x509 should sit at the top level instead of being nested under tls, though. There may be some cases where x509 keys are being used for something other than TLS encryption.

@ph
Copy link
Contributor

ph commented Aug 1, 2018

tls.certificates I should have defined it with PEM format in the original proposal, if you look at the keyword it can be an array, what you are proposing is to transform that into an object?

Moving it to a single object would mean the raw field will have a single certificate chain? I think we should probably keep it an array but move it to an object for more flexibility.

@andrewkroh
Copy link
Member

Packetbeat has fairly extensive TLS metadata.

@andrewvc
Copy link
Author

andrewvc commented Aug 3, 2018

I agree with @bndabbs that we should hoist x509 up to the top level.

I propose we add new root level key to ECS under x509 as follows.

{
	"x509": {
		"properties": {
			"raw": "-----BEGIN CERTIFICATE-----...", // Raw PEM encoded cert
			"role": {
				"type": "keyword"  // Typically either 'server' or 'client'. Potentially "peer" as well.
			}
			"not_before": {
				"type": "date"
			},
			"not_after": {
				"type": "date"
			},
			"public_key_size": {
				"type": "integer"
			},
			"alternative_names": {
				"type": "keyword" // Should we use path-hierarchy to tokenize these?
			}, 
			"subject": {
				"province": {
					"type": "keyword"
				},
				"common_name": {
					"type": "keyword"  // Should we use path-hierarchy to tokenize these?
				}, 
				"country": {
					"type": "keyword"
				},
				"organization": {
					"type": "keyword"
				},
				"locality": {
					"type": "keyword"
				}
			},
			"signature_algorithm": {
				"type": "keyword"
			} ,
			"public_key_algorithm": {
				"type": "keyword"
			},
			"version": {
				"type": "short"
			},
			"serial_number": {
				"type": "keyword"  // hex encoded representation, up to 20 octets
			},
			"issuer": {
				"country": {
					"type": "keyword"
				},
				"organization": {
					"type": "keyword"
				},
				"common_name": {
					"type": "keyword"
				},
			},
			"extensions": {
				"properties": {
					"name": {
						"type": "string"
					},
					"value": {
						"type": "keyword"
					}
				}
			}
		}
	}
}

@andrewkroh
Copy link
Member

I'm not sure if any of these make sense to include in the schema but they might be useful.

  • x509.expired - A simple boolean that indicates if the certificate was expired at the time of use / time of observation.
  • x509.validity_period - Number of seconds between the not_before and not_after dates. Sometimes it can be useful for filtering on short duration certs like Let's Encrypt or free ones issued by other CAs.

@andrewvc
Copy link
Author

andrewvc commented Aug 4, 2018

I think the period is great.

I'm a little hesitant about expired. That feels weird to compute a value whose result will be different in the future. The validity period cannot change, but expired can. Do we do that elsewhere with dates?

@andrewvc
Copy link
Author

andrewvc commented Aug 4, 2018

OTOH, it would be good for users looking back to see if they have a history of expired certs in use, so maybe that's not a great point to make. What do others think?

@webmat
Copy link
Contributor

webmat commented Aug 6, 2018

Not sure why, but I also feel uneasy about calculating a boolean based on an expiration date.

On the other hand I think it would be really useful to have. It's a very important thing that's straightforward to alert on, or display in red & so on. Everyone should catch those earlier by working off of the expiration date, but somehow there's always that one cert somewhere, that slips through the cracks :-)

So I'm in favour 👍

@andrewvc
Copy link
Author

andrewvc commented Aug 6, 2018

One other thought, I think we need to make each x509 cert a nested document. It would commonly be the case that one would search for both NotValidBefore and NotValidAfter and have both client/server certs OR have a certificate chain.

Are there any objections to making these documents nested.

@ruflin
Copy link
Contributor

ruflin commented Aug 8, 2018

I don't think we should start to add any nested datatypes to ECS. It makes querying it trickier and can be challenging on the Kibana side. The way we handled this so far in Beats is that each "nested object" is it's own document (which in Lucene it is anyways) but we have an id for example to correlate them.

To move this forward I would suggest to open a PR against the use-cases with this to have it in there quickly to show to users and discuss the fields in more details. This will make it then pretty easy to migrate it to ECS later as it's just copying it over.

@webmat webmat mentioned this issue Sep 18, 2018
26 tasks
@ruflin ruflin mentioned this issue Oct 31, 2018
22 tasks
@rw-access
Copy link
Contributor

I ran into a little bit of x.509 when working on digital code signatures #681 (PR #733).

Some of the fields used for the most common use cases are

  • subject name (often called signature_signer)
  • valid (bool): checksum match, etc
  • trusted (bool): does the chain check out and trace back to a trusted root?
  • status (string): logging for all of the other properties we couldn't capture

There's a good chance that I'm significantly oversimplifying things, but I also don't want to make my current use case more complicated than necessary

@andrewvc
Copy link
Author

andrewvc commented May 7, 2020

Closing due to #762 being merged.

@andrewvc andrewvc closed this as completed May 7, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

7 participants