Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Log HTTP requests, responses #13241

Closed
3 tasks done
azasypkin opened this issue Aug 1, 2017 · 16 comments · Fixed by #87939
Closed
3 tasks done

Log HTTP requests, responses #13241

azasypkin opened this issue Aug 1, 2017 · 16 comments · Fixed by #87939
Assignees
Labels
blocker discuss Feature:Logging Feature:New Platform Team:Core Core services & architecture: plugins, logging, config, saved objects, http, ES client, i18n, etc

Comments

@azasypkin
Copy link
Member

azasypkin commented Aug 1, 2017

Currently Kibana is able to log requests, corresponding responses and various operations data (CPU, memory, disk, and other metrics). It's done with the help of even-better HapiJS plugin (fork of good plugin). As far as I can tell this information is only logged with logging.json: true.

Also there is additional logging.filter configuration value that allows Kibana to remove/censor some sensitive data from logs (e.g. authorization or cookie headers).

In the new-platform we're not going to use HapiJS so we'll have to handle this differently. So let's start from the following questions first:

  • Are we still interested in all this data or is there anything we don't need anymore?
  • What additional "service" data we'd like to log, but don't do right now? It's a good time for enhancement requests.
  • Is "censoring" used primarily for authorization and cookie headers (i.e. only request/response use case)?

Any thoughts?

/cc @elastic/kibana-platform @elastic/kibana-operations


Tasks:

@epixa
Copy link
Contributor

epixa commented Aug 9, 2017

Are we still interested in all this data or is there anything we don't need anymore?

To ensure that this can wrong alongside the existing core, I think we need to preserve all of this data in exactly the same format as before.

What additional "service" data we'd like to log, but don't do right now? It's a good time for enhancement requests.

I'm not against new data or anything, but if we add new data in the new platform, then we'll need to do the same in the old core until that aspect is moved entirely over to the new platform, so I recommend avoiding it.

Is "censoring" used primarily for authorization and cookie headers (i.e. only request/response use case)?

Those are the only things I can think of

@azasypkin
Copy link
Member Author

To ensure that this can wrong alongside the existing core, I think we need to preserve all of this data in exactly the same format as before.

Yep, filed #13412 to track that.

I'm not against new data or anything, but if we add new data in the new platform, then we'll need to do the same in the old core until that aspect is moved entirely over to the new platform, so I recommend avoiding it.

Yeah, the plan is to carefully and selectively replace current plugins/sub-systems/features with counterparts from the new platform one by one (e.g. logging). To avoid double work we'll be introducing breaking changes/new behavior only for already incorporated stuff. Here I just wanted to collect ideas in advance :)

Those are the only things I can think of

Good, thanks.

@kimjoar kimjoar added the Team:Core Core services & architecture: plugins, logging, config, saved objects, http, ES client, i18n, etc label Sep 12, 2017
@epixa
Copy link
Contributor

epixa commented May 5, 2018

@azasypkin Anything else we need from this issue? We're moving forward with hapi, so that makes part of this moot. The rest seems relevant, but it also seems like we have a decision.

@azasypkin
Copy link
Member Author

Hmm, I think we need to do this eventually anyway. The fact that we're going to use Hapi in new platform doesn't change our intention to rely on our own logging system and get rid of good/even-better, right?

At some point we'll reach feature parity between new and old logging systems and just get rid of Hapi-based logging (aka even-better) and switch to new logging. At least that was the idea initially.

@kobelb
Copy link
Contributor

kobelb commented Dec 7, 2018

Is "censoring" used primarily for authorization and cookie headers (i.e. only request/response use case)?

As far as I'm aware from the even-better docs, these filters only apply to the response events from HapiJS

filter can be used to remove potentially sensitive information (credit card numbers, social security numbers, etc.) from the log payloads before they are sent out to reporters. This setting only impacts response events and only if payloads are included via requestPayload and responsePayload

I'm dealing with a situation where a user has a custom auth header showing up in their logs, and they need to filter it out and I recommended taking advantage of logging.filter to do so. I'd be helpful to see a similar facility in place if we switch the logging infrastructure in the new platform.

@azasypkin
Copy link
Member Author

I'd be helpful to see a similar facility in place if we switch the logging infrastructure in the new platform.

Yep, indeed, thanks for the input!

@tsullivan
Copy link
Member

Hi, is this advice still relevant to users that want to have the username from Auth header logged? https://discuss.elastic.co/t/how-can-i-setup-kibana-server-logs-for-capturing-the-user-logged-in-and-dashboards-viewed-info/174575/2?u=tsullivan

The way I've dealt with this requirement was to have a reverse proxy in front of Kibana that logs the request, including the URI and the auth username. You can have the proxy's logs ingested into Elasticsearch using Filebeat.

@azasypkin
Copy link
Member Author

Hi, is this advice still relevant to users that want to have the username from Auth header logged? https://discuss.elastic.co/t/how-can-i-setup-kibana-server-logs-for-capturing-the-user-logged-in-and-dashboards-viewed-info/174575/2?u=tsullivan

Hey, sorry for the late reply! I guess you can also achieve that with something like this:

logging:
  filter.authorization: none # it's `remove` by default
  json: true

@lukeelmers
Copy link
Member

Per our discussion this morning, I'm going to adjust the scope of this issue:

@lukeelmers lukeelmers self-assigned this Dec 16, 2020
@lukeelmers lukeelmers added the Team:Core Core services & architecture: plugins, logging, config, saved objects, http, ES client, i18n, etc label Dec 16, 2020
@elasticmachine
Copy link
Contributor

Pinging @elastic/kibana-core (Team:Core)

@joshdover joshdover changed the title Log request, response and ops data Log HTTP requests, responses Dec 17, 2020
@lukeelmers
Copy link
Member

lukeelmers commented Jan 6, 2021

I’ve spent awhile thinking through our options for how request/response logging could work in the new logging system and am working toward a draft PR, but first I'd like to get some input (cc @elastic/kibana-core).

Legacy request/response logging

In the legacy platform, there are two ways I’m aware of to enable request/response logging:

  1. logging.verbose: true
  2. configuring logging.events which is passed through to hapi/good (this method is not documented in the Configuring Kibana docs):
logging:
  events:
    request: "*"
    response: "*"

With verbose logging, a typical response log when logging.dest: stdout will look like this:

server  response [15:51:58.332]  GET /path/to/thing 200 16ms - 9.0B

And when logging.json: true we will get more info outside of the message (headers and some info about the response payload, just not the payload itself):

{
  "type": "response",
  "@timestamp": "2021-01-06T10:00:14-07:00",
  "tags": [],
  "pid": 57746,
  "method": "post",
  "statusCode": 200,
  "req": {
    "url": "/path/to/thing",
    "method": "post",
    "headers": {
      "content-type": "application/json",
      "authorization": "Basic ABC123==", // can be removed with logging.filters.authorization=none
      "cookie": "sid=ABC123" // can be removed with logging.filters.cookie=none
      // ...etc
    },
    "remoteAddress": "127.0.0.1",
    "userAgent": "...",
    "referer": "http://localhost:5601/app/discover"
  },
  "res": {
    "statusCode": 200,
    "responseTime": 16,
    "contentLength": 9
  },
  "message": "POST /path/to/thing 200 16ms - 9.0B"
}

(Note: the calculated response contentLength is broken right now anyway but can be fixed as part of this effort)

So how do we translate all of this to the new platform logging system?

KP request/response logging

There are a few questions to answer here:

How to configure?

The simplest / most natural way to integrate these with the existing logging infrastructure would probably be to just put them under a dedicated context (response or maybe under http: http.server.Kibana.response). The issue here would be with the default logging config — legacy response logs are opt-in and aren’t included by default, so we’d need to log these as debug if we want to preserve that behavior out of the box... Otherwise everyone is going to have every request logged by default if they haven't specified anything in their logging config.

Also worth noting that folks who already enabled verbose logs would be getting a ton of new/duplicate response logs, unless we try to use this as a drop-in replacement for legacy response logging and integrate with the existing legacy configs. I'm not sure this makes sense since we should be aiming to conform to ECS which will require breaking changes.

Other options:

  • Dedicated top-level config logging.response: true or similar. Doesn’t feel like the right move as it isn’t in the spirit of the log4j-style config we’ve been using, but it would be the closest approximation to the legacy behavior.

How to actually log the responses?

The first thing that comes to mind is to register lifecycle preResponseHandlers with the http server as we’ve done in src/core/server/http/lifecycle_handlers.ts. This feels like the cleanest approach, but it would require us to expose more response info (specifically response time & content length) to all plugins using preResponseHandlers.

Other options:

  • We could manually register a one-off preresponse hook directly with hapi instead of going through the server contract. The main benefit would be that we avoid having to add the additional content length & response time data described above (unless we want to do this regardless).
    • Or actually, I guess we'd want to ensure this executes as the last preresponse hook -- in case the ones before it make any other modifications that should be reflected in the logs -- so perhaps this method actually makes more sense. The current core lifecycle handlers are registered before any plugin handlers are, and therefore get executed first.
    • EDIT: Should probably just use server.events.on('response') which is what hapi/good does

TLDR

  1. Are we comfortable with dropping the config for this altogether and instead relying on the log level & context to let users filter these out if they don't want them? If we set it to debug it won't be logged by default.
  2. Thoughts on implementing via the events response listener, (or onPreResponse lifecycle in the http server?) Preferences for alternatives?
  3. Any other thoughts? What am I missing?

@pgayvallet
Copy link
Contributor

pgayvallet commented Jan 7, 2021

Are we comfortable with dropping the config for this altogether and instead relying on the log level & context to let users filter these out if they don't want them? If we set it to debug it won't be logged by default

That's a hard one.

I agree that having the request/response logs under a specific context seems way more consistent to the new logging philosophy (and log4j fwiw), so I would say that it sounds alright.

In that case, I also agree that using the debug log level seems the only viable option to avoid flooding all users logs.

Also, we will only remove the legacy logging system on 8.0, so using logging.verbose would still work on 7.x releases.

So overall, I would say that I'm personally ok doing that. But I would like to hear what the others think.

Thoughts on implementing via the events response listener, (or onPreResponse lifecycle in the http server?) Preferences for alternatives?

As that's internal code, I would just go with the underlying events API to avoid increasing the API surface when there is no consumer needs for doing so. Using the underlying HAPI API in the http service code seems totally fine to me.

@lukeelmers
Copy link
Member

I agree that having the request/response logs under a specific context seems way more consistent to the new logging philosophy (and log4j fwiw), so I would say that it sounds alright.

Thinking about this more, we have a similar situation with elasticsearch.logQueries. Currently the new ES client still uses the config, but could alternatively be taking the same approach of relying on a user to configure logging for that context accordingly.

I think this is what @restrry was proposing in #57546, if I'm understanding correctly? Re-reading that issue, I'm not 100% clear on that part.

@mshustov
Copy link
Contributor

mshustov commented Jan 9, 2021

Are we comfortable with dropping the config for this altogether and instead relying on the log level & context to let users filter these out if they don't want them? If we set it to debug it won't be logged by default.

yes, imo it should align with the logging config across the stack.

Currently the new ES client still uses the config, but could alternatively be taking the same approach of relying on a user to configure logging for that context accordingly.

Yeah, that how I was thinking of it.
However, the situation with the duplicated request/response log might be confusing for the users and our support. I'd suggest explicitly deprecate these settings

logging:
  events:
    request: "*"
    response: "*"

and disable Hapi network logs even when logging.verbose: true.

Thoughts on implementing via the events response listener, (or onPreResponse lifecycle in the http server?) Preferences for alternatives?

+1 to use events to keep API surface small.

Any other thoughts? What am I missing?

How are you going to implement sensitive data (cookie, authorization header) filtering? We can start with filtering them out until we have a configurable filtering mechanism. @elastic/kibana-security do we want to provide the ability to log cookie, authorization at all?

@azasypkin
Copy link
Member Author

@elastic/kibana-security do we want to provide the ability to log cookie, authorization at all?

I can see the value in being able to log cookies and authorization headers for debug purposes assuming it's not enabled by default and can only be enabled intentionally and explicitly.

@lukeelmers
Copy link
Member

the situation with the duplicated request/response log might be confusing for the users and our support. I'd suggest explicitly deprecate these settings

++ Agreed, I was thinking along these lines too

and disable Hapi network logs even when logging.verbose: true.

Hadn't considered this, but sounds like a good idea.

How are you going to implement sensitive data (cookie, authorization header) filtering? We can start with filtering them out until we have a configurable filtering mechanism

@restrry I added a note to the issue on filtering with some possible approaches to discuss. My thinking was to remove cookie/authorization headers by default here (as is done in legacy platform). And then when #57547 is addressed, handle the filtering of these headers by that new mechanism instead.

An alternative would be introducing the filtering mechanism first, and then revisiting this feature as a follow-up step. I think either approach would work.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
blocker discuss Feature:Logging Feature:New Platform Team:Core Core services & architecture: plugins, logging, config, saved objects, http, ES client, i18n, etc
Projects
None yet
Development

Successfully merging a pull request may close this issue.

9 participants