Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feat: make SQUID3 captures ecs compliant #270

Merged
merged 8 commits into from
Aug 10, 2020

Conversation

kares
Copy link
Contributor

@kares kares commented Jul 27, 2020

"1525334330.556 3 120.65.1.1 TCP_REFRESH_MISS/200 2014 GET http://www.sample.com/hellow_world.txt public-user DIRECT/www.sample.com text/plain 902351708.872"

matching before and after:

      "timestamp"=>"1525334330.556",
      "duration"=>"3",
      "client_address"=>"120.65.1.1",
      "cache_result"=>"TCP_REFRESH_MISS",
      "status_code"=>"200",
      "bytes"=>"2014",
      "request_method" => "GET",
      "url" => "http://www.sample.com/hellow_world.txt",
      "user"=>"public-user",
      "hierarchy_code"=>"DIRECT",
      "server"=>"www.sample.com",
      "content_type"=>"text/plain",
      "timestamp"=>"1525334330.556",
      "source"=>{"ip"=>"120.65.1.1"},
      "http"=>{
          "request"=>{"method"=>"GET"},
          "response"=>{"status_code"=>200, "body"=>{"bytes"=>2014}}
      },
      "destination"=>{"address"=>"www.sample.com"},
      "url"=>{"original"=>"http://www.sample.com/hellow_world.txt"},
      "user"=>{"name"=>"public-user",
      "squid"=>{
          "request"=>{"duration"=>3, "content_type"=>"text/plain"},
          "hierarchy_code"=>"DIRECT",
          "cache_result_code"=>"TCP_REFRESH_MISS"
      }

@kares
Copy link
Contributor Author

kares commented Jul 31, 2020

this is going to get some ECS specs (for now the specs part in only for the legacy behavior - tests were missing).
could you please do an ecs review for this - looking at the PR description with a sample log // cc @webmat @ebeahan

@ebeahan
Copy link

ebeahan commented Jul 31, 2020

I've left some suggestions. A squid module was recently added in elastic/beats#19713, and I did some comparison against the expected test data there in addition to peeking at the Squid LogFormat docs.

  • The cache result code could be mapped to event.action: event.action: TCP_REFRESH_MISS
  • I'd opt for http.response.bytes vs http.response.body.bytes since both the HTTP response body and headers are counted in the size.
  • The content_type field refers to HTTP reply header, so I'd opt for using squid.response.content_type. Good discussion has taken place over in Added http request and response content-type field elastic/ecs#554 proposing a place for all HTTP request and response headers. The conversation has stalled a bit, but we're looking to move that forward in the future.

@kares kares mentioned this pull request Aug 4, 2020
21 tasks
@kares kares marked this pull request as ready for review August 6, 2020 12:22
@kares kares requested a review from yaauie August 6, 2020 12:22
@kares
Copy link
Contributor Author

kares commented Aug 6, 2020

Thanks Eric, I've addressed your concerns :

The cache result code could be mapped to event.action: event.action: TCP_REFRESH_MISS

I'd opt for http.response.bytes vs http.response.body.bytes since both the HTTP response body and headers are counted in the size.

The content_type field refers to HTTP reply header, so I'd opt for using squid.response.content_type.

Been following Beats' repo from a tag so I missed squid support (on master), thanks for the links.
Did not look in detail but compared to others I've been following Beats's squid support is a bit different.
Probably lack a lot of context on the PR but (besides event.action) they match http.request.method under event.code.

Copy link
Contributor

@yaauie yaauie left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, assuming either the build goes green OR you plan to fix the build issues in the downstream feature branch.

One comment about the legacy change being unnecessary, feel free to resolve with either an ACK or a NAK.

@@ -1,4 +1,4 @@
# Pattern squid3
# Documentation of squid3 logs formats can be found at the following link:
# http://wiki.squid-cache.org/Features/LogFormat
SQUID3 %{NUMBER:timestamp}\s+%{NUMBER:duration}\s%{IP:client_address}\s%{WORD:cache_result}/%{POSINT:status_code}\s%{NUMBER:bytes}\s%{WORD:request_method}\s%{NOTSPACE:url}\s(%{NOTSPACE:user}|-)\s%{WORD:hierarchy_code}/%{IPORHOST:server}\s%{NOTSPACE:content_type}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤔 ah. The way this was originally specified, the intent was clearly to make user not capture a literal - (which indicates user was not provided), but because NOTSPACE matches -, this always ended up capturing anyway 🤦‍♀️ . That took me a minute to figure out. Good work in making it capture IFF present in the ecs-v1 pattern.

However, I don't think this change to the legacy patterns is necessary here because it is effectively a no-op, and its presence muddies the history of the legacy patterns.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

okay than, same for me - took me a few to figure.
we'll let the next person wonder about this than, hopefully no one will have to.

@webmat
Copy link

webmat commented Aug 6, 2020

Yeah I like this mapping.

Reviewing this made me realize we added file.mime_type a while ago, but we hadn't added http.[request|response].mime_type yet. I'll see if we can get those in for ECS 1.6. Watch out for it the next ECS release.

But for now this is good 👍

@kares
Copy link
Contributor Author

kares commented Aug 10, 2020

Thanks Mat, I've added a check-list item to consider mime_type before we polish out the ECS efforts here.

@kares kares merged commit 84632b2 into logstash-plugins:ecs-wip Aug 10, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants