-
Notifications
You must be signed in to change notification settings - Fork 419
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Remove src/dst.hostname, rename url.hostname to url.domain. #175
Merged
Merged
Changes from all commits
Commits
Show all changes
7 commits
Select commit
Hold shift + click to select a range
5eee744
Bring `source` description in line with `destination`'s
d9f461f
Remove src/dst.hostname
c43a788
Replace `url.hostname` with `url.domain`.
42ee364
Adjust the description of `url.domain`
f4d4305
Add 'www.' to elastic.co in the URL examples.
f7da266
url hasn't been made into an official 'reuseable' object.
893defb
Add changelog entries
File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this still the case? should IP addresses be stored in the
domain
field?I think that having sources that don't differentiate names from IPs is quite frequent when parsing logs, having IPs stored in a field called
domain
can be a bit confusing.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jsoriano the
url.domain
field should generally not be populated with an IP address, however, when an event or log indicates that an IP address was used in a URL as the host/hostname, such as inhttp://15.73.192.108/index.html
, then the string15.73.192.108
would be entered into theurl.domain
field. This is useful, for example, for a security analyst to know that a user or system is attempting to bypass the DNS infrastructure to access a web resource. In some cases, this could be an indicator of malware activity. So searching through all values ofurl.domain
and looking for values that have the form of an IP address can be a useful analysis.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jsoriano We were in a tough spot and couldn't actually call it
host
, as this concept of "name or IP" is generally called. In ECS,host
is a field set that can be reused in multiple places.Just to be clear, the concept of "hostname" is often misused in exactly the same way. The definition of a "host" is "IP address or hostname" (check out the RFC survey I summarized in #166).
An IP address doesn't belong in a "hostname" either. Using "hostname" this way is a mistake that's more common, though, so it doesn't stick out quite as much when people see it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the explanations, I see the problematics and the reasoning, but using
domain
to store IPs still sounds a bit weird to me. In any case I don't have a better proposal if we cannot usehost
, but I'm afraid this can be a bit confusing also for users.Maybe after moving some beat modules to ECS and playing a bit more with it I see it in a different way :)
Will it be the same for source and destination? Maybe we should add some explanations about this also there.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jsoriano Under source and destination, we actually have both
domain
andip
. Having started one transition for the Traefik module here elastic/beats#9005 outlines a difference in approach.Most of the modules dealt with this indirectly and named a
keyword
field something likeremote_ip
and attempted to do GeoIP on it. In other words, the duality of this value being either an IP or a hostname was not addressed. Hopefully IN's GeoIP just fails gracefully when it encounters a hostname (I haven't checked). So I think in most of the modules right now, the lie is the other way around ;-) Theremote_ip
field can contain either an IP or a hostname, the field iskeyword
to support this, which ultimately means you can't do CIDR searches on it & so on.The approach I took in the PR linked above is to add a tiny bit of logic to the module in this conversion. The ambiguous value is stored in a custom field, then via Grok I try to grab an IP out of the field, and store that in
source.ip
. If that grok fails, the fallback is to grab whatever's there and store it insource.domain
. This happens hereThere was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see, but having hosts in two fields can be a source of problems:
We could have a field for "ambiguous" hosts in addition to
domain
andip
, so all cases can be covered. Modules that try to parse the host asdomain
orip
could still keep the original in the ambiguous field, modules that don't care can use the less-featured ambiguous keyword field.Even if ambiguous and not fully compliant with standards this field can still be common and useful.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I agree that having a field for the ambiguous information would be useful.
It used to be
hostname
, which was an incorrect name, as demonstrated in #166 (see the RFC survey in the issue body).And we're back to square 1: the "correct" name according to the RFCs is "host" which is defined as either an IP or a hostname. And we don't want to use "host" because it will conflict with the current
host
field set, if people want to enrich their stream by nesting known host information there.My best name for the ambiguous field for now is
host_address
at this time. But we haven't gotten together yet to plan the next steps towards ECS Beta 2.Note also that the simplest way to avoid this problem now is to store the ambiguous "host" in a custom field. To get back to my Traefik module example, I'm leaving the incorrectly named
traefik.access.remote_ip
field there, with the exact same meaning as before.