
fix(kuma-dp) separate tcp access logs with a new line #566

Merged
merged 2 commits on Feb 5, 2020

Conversation

jakubdyszkiewicz
Contributor

Summary

The main use case for TCP access logs is using Logstash to ship them to Elasticsearch.

With the current demo setup, which is:

apiVersion: kuma.io/v1alpha1
kind: Mesh
metadata:
  name: default
spec:
  mtls:
    ca:
      builtin: {}
    enabled: true
  logging:
    backends:
    - name: logstash
      format: |
        {
            "destination": "%KUMA_DESTINATION_SERVICE%",
            "destinationAddress": "%UPSTREAM_HOST%",
            "source": "%KUMA_SOURCE_SERVICE%",
            "sourceAddress": "%KUMA_SOURCE_ADDRESS%",
            "bytesReceived": "%BYTES_RECEIVED%",
            "bytesSent": "%BYTES_SENT%"
        }
      tcp:
        address: logstash.logging:5000

and the Logstash config:

    input {
      tcp {
          port => 5000
          codec => "json"
      }
    }
    output{
      loggly{
        key => "API_KEY"
        tag => "logstash"
      }
    }

We noticed that on Kubernetes, Logstash has trouble parsing some logs (around 5-10% if you generate them quickly):

[2020-02-03T14:49:00,646][ERROR][logstash.codecs.json     ] JSON parse error, original data now in message field {:error=>#<LogStash::Json::ParserError: Unrecognized token 'd': was expecting ('true', 'false' or 'null')
 at [Source: (String)"d.kuma-demo.svc:80","sourceAddress": "172.17.0.13:0","bytesReceived": "326","bytesSent": "854"}
"; line: 1, column: 2]>, :data=>"d.kuma-demo.svc:80\",\"sourceAddress\": \"172.17.0.13:0\",\"bytesReceived\": \"326\",\"bytesSent\": \"854\"}\n"}

I found issues like elastic/logstash#11072 suggesting that switching the codec from json to json_lines may solve the problem.
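For context, the json_lines codec buffers incoming data and treats each newline-terminated line as a single event, so chunk boundaries inside a message no longer matter. Roughly, in Go terms (an illustration of the framing only, not the Logstash implementation):

    package main

    import (
    	"bufio"
    	"encoding/json"
    	"fmt"
    	"strings"
    )

    func main() {
    	// Two newline-terminated access log entries, as they would arrive over TCP.
    	// Service names are placeholders.
    	stream := `{"destination": "backend", "bytesSent": "854"}
    {"destination": "frontend", "bytesSent": "326"}
    `
    	scanner := bufio.NewScanner(strings.NewReader(stream))
    	for scanner.Scan() { // one complete line == one event
    		var event map[string]string
    		if err := json.Unmarshal(scanner.Bytes(), &event); err != nil {
    			fmt.Println("parse error:", err)
    			continue
    		}
    		fmt.Println("parsed event:", event)
    	}
    }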

After switching the codec and modifying the format to a single line like this:

apiVersion: kuma.io/v1alpha1
kind: Mesh
metadata:
  name: default
spec:
  mtls:
    ca:
      builtin: {}
    enabled: true
  logging:
    backends:
    - name: logstash
      format: '{"destination": "%KUMA_DESTINATION_SERVICE%", "destinationAddress": "%UPSTREAM_HOST%", "source": "%KUMA_SOURCE_SERVICE%", "sourceAddress": "%KUMA_SOURCE_ADDRESS%", "bytesReceived": "%BYTES_RECEIVED%","bytesSent": "%BYTES_SENT%"}'
      tcp:
        address: logstash.logging:5000

and sending a \n after every log entry, the issue is gone.
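In other words, kuma-dp now terminates every entry it writes to the TCP backend with \n. A minimal sketch of the idea in Go, with hypothetical type and method names rather than the actual kuma-dp code:

    package main

    import (
    	"bufio"
    	"fmt"
    	"net"
    )

    // tcpLogSender is a hypothetical stand-in for kuma-dp's TCP access log sender.
    type tcpLogSender struct {
    	conn net.Conn // connection to the logging backend, e.g. logstash.logging:5000
    }

    // Send writes one formatted access log entry terminated by '\n', so the
    // receiving side can split the stream into individual events on newlines.
    func (s *tcpLogSender) Send(formattedEntry string) error {
    	_, err := fmt.Fprint(s.conn, formattedEntry+"\n")
    	return err
    }

    func main() {
    	// net.Pipe stands in for the real TCP connection in this self-contained demo.
    	client, server := net.Pipe()
    	go func() {
    		defer client.Close()
    		s := &tcpLogSender{conn: client}
    		s.Send(`{"destination": "backend", "bytesSent": "854"}`)
    		s.Send(`{"destination": "frontend", "bytesSent": "326"}`)
    	}()

    	r := bufio.NewReader(server)
    	for {
    		line, err := r.ReadString('\n')
    		if err != nil {
    			break // EOF once the sender closes its end
    		}
    		fmt.Print("one complete event: ", line)
    	}
    }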

I'm still not sure what the real issue with the json codec was. I could not reproduce it on localhost, and I checked with Wireshark that the TCP packet of a problematic log entry was sent in a single frame.

Shouldn't a \n in the format field work?

      format: '{"destination": "%KUMA_DESTINATION_SERVICE%", "destinationAddress": "%UPSTREAM_HOST%", "source": "%KUMA_SOURCE_SERVICE%", "sourceAddress": "%KUMA_SOURCE_ADDRESS%", "bytesReceived": "%BYTES_RECEIVED%","bytesSent": "%BYTES_SENT%"}\n'

YAML interprets this as two characters at the end, \ and n, not as a single newline character. It can be done with a literal block scalar:

      format: |
        {"destination": "%KUMA_DESTINATION_SERVICE%", "destinationAddress": "%UPSTREAM_HOST%", "source": "%KUMA_SOURCE_SERVICE%", "sourceAddress": "%KUMA_SOURCE_ADDRESS%", "bytesReceived": "%BYTES_RECEIVED%","bytesSent": "%BYTES_SENT%"}
      tcp:
        address: logstash.logging:5000

and without hardcoding an extra \n in kuma-dp, but it's a little bit awkward.
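To double-check the quoting behaviour outside of Kuma, a small standalone Go program (using gopkg.in/yaml.v3; purely illustrative) shows that the single-quoted form ends with the two characters \ and n, while the literal block scalar ends with a real newline:

    package main

    import (
    	"fmt"

    	"gopkg.in/yaml.v3"
    )

    func main() {
    	// Single-quoted YAML scalar: escape sequences are not processed,
    	// so the value ends with the two characters '\' and 'n'.
    	singleQuoted := []byte(`format: '{"bytesSent": "%BYTES_SENT%"}\n'`)

    	// Literal block scalar: the value ends with a real newline character.
    	blockScalar := []byte("format: |\n  {\"bytesSent\": \"%BYTES_SENT%\"}\n")

    	var a, b struct {
    		Format string `yaml:"format"`
    	}
    	if err := yaml.Unmarshal(singleQuoted, &a); err != nil {
    		panic(err)
    	}
    	if err := yaml.Unmarshal(blockScalar, &b); err != nil {
    		panic(err)
    	}

    	fmt.Printf("single-quoted ends with %q\n", a.Format[len(a.Format)-2:]) // "\\n" (two characters)
    	fmt.Printf("block scalar ends with  %q\n", b.Format[len(b.Format)-1:]) // "\n" (newline)
    }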

If hardcoding \n is a problem, we could introduce an option to enable it explicitly, like:

  logging:
    backends:
    - name: logstash
      format: '{"destination": "%KUMA_DESTINATION_SERVICE%", "destinationAddress": "%UPSTREAM_HOST%", "source": "%KUMA_SOURCE_SERVICE%", "sourceAddress": "%KUMA_SOURCE_ADDRESS%", "bytesReceived": "%BYTES_RECEIVED%","bytesSent": "%BYTES_SENT%"}'
      tcp:
        separateLogsWithNewline: true
        address: logstash.logging:5000

but I think we can do that if someone actually reports that this is a problem.

@jakubdyszkiewicz jakubdyszkiewicz requested a review from a team February 4, 2020 14:14