parsers: Support CRI-O and containerd #881

Merged 2 commits on Nov 6, 2018

StevenACoffman (Contributor) commented Nov 2, 2018

See #876 and #873
If a single character is detected, consider this the log tag for the line.
This is a part of the multiline handling for cri-o logs.
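
For context, here is a minimal Python sketch of how a log tag can drive multiline reassembly. It assumes the CRI logging convention that `P` marks a partial fragment and `F` marks the final (or only) fragment of a line. The function name and the `(logtag, message)` tuple shape are hypothetical and are not part of Fluent Bit, and the sketch ignores the stream name, so it would interleave stdout and stderr fragments.

```python
def join_partial_lines(records):
    """Merge CRI fragments: buffer 'P' records until an 'F' record closes the line.

    `records` is an iterable of (logtag, message) tuples (hypothetical shape).
    """
    buffer = []
    for logtag, message in records:
        buffer.append(message)
        if logtag == "F":
            # 'F' means the line is complete, so emit everything buffered so far.
            yield "".join(buffer)
            buffer = []
    if buffer:
        # Trailing fragments that never received a closing 'F' record.
        yield "".join(buffer)
```

A caller would feed it the `logtag` and `message` fields produced by the parser, in file order.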

Example Fluent Bit log lines on disk:

root@fluent-bit-29xjh:/# more /var/log/containers/fluent-bit-29xjh_dev-eeva-logging_fluent-bit-0db128860717a93c7d219425c9d49f4445c2d6c6e5fed3f4dc4298f03c582378.log 
2018-10-31T21:54:45.45487617Z stderr F Fluent-Bit v0.14.4
2018-10-31T21:54:45.454938595Z stderr F Copyright (C) Treasure Data
2018-10-31T21:54:45.454946545Z stderr F 
2018-10-31T21:54:45.455512949Z stderr F [2018/10/31 21:54:45] [ info] [engine] started (pid=1)
2018-10-31T21:54:48.837184585Z stderr F [2018/10/31 21:54:48] [ info] [filter_kube] https=1 host=kubernetes.default.svc.cluster.local port=443
2018-10-31T21:54:48.837218858Z stderr F [2018/10/31 21:54:48] [ info] [filter_kube] local POD info OK
2018-10-31T21:54:48.837226848Z stderr F [2018/10/31 21:54:48] [ info] [filter_kube] testing connectivity with API server...
2018-10-31T21:54:49.036009069Z stderr F [2018/10/31 21:54:49] [ info] [filter_kube] API server connectivity OK
2018-10-31T21:54:49.036783868Z stderr F [2018/10/31 21:54:49] [ info] [http_server] listen iface=0.0.0.0 tcp_port=2020
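
The lines above can be split into fields with the regex this PR introduces. The following Python sketch assumes only that the named groups are `time`, `stream`, `logtag`, and `message`. Python spells named groups `(?P<name>...)` where the Fluent Bit parser file uses `(?<name>...)`, and `parse_cri_line` is a hypothetical helper name.

```python
import re

# Same pattern as the parser's Regex line, with Python-style named groups.
CRI_LINE = re.compile(
    r"^(?P<time>[^ ]+) (?P<stream>stdout|stderr) (?P<logtag>[^ ]*) (?P<message>.*)$"
)


def parse_cri_line(line):
    """Return the four fields as a dict, or None if the line does not match."""
    match = CRI_LINE.match(line)
    return match.groupdict() if match else None


record = parse_cri_line("2018-10-31T21:54:45.45487617Z stderr F Fluent-Bit v0.14.4")
```

Note that the pattern requires a space after the log tag, so a line with an empty message only matches when that trailing space is present.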

Example Elasticsearch JSON document:

{
  "_index": "logstash-2018.11.03",
  "_type": "flb_type",
  "_id": "sdIP22YBrbPWNMRHKBwz",
  "_version": 1,
  "_score": null,
  "_source": {
    "time": "2018-11-03T19:31:10.000Z",
    "stream": "stderr",
    "logtag": "F",
    "message": "\u001b[1mFluent-Bit v0.14.6\u001b[0m",
    "kubernetes": {
      "pod_name": "fluent-bit-wjnfq",
      "namespace_name": "dev-logging",
      "pod_id": "0293ad2d-df9f-11e8-8914-2e491e409191",
      "labels": {
        "controller-revision-hash": "1569803976",
        "k8s-app": "fluent-bit-logging",
        "kubernetes_io/cluster-service": "true",
        "pod-template-generation": "1",
        "version": "v1"
      },
      "annotations": {
        "prometheus_io/path": "/api/v1/metrics/prometheus",
        "prometheus_io/port": "2020",
        "prometheus_io/scrape": "true"
      },
      "host": "10.63.56.245",
      "container_name": "fluent-bit",
      "docker_id": "013362a984722805ede9cf323e34919890ea44ea0004897d3ed93178116676a2"
    }
  },
  "fields": {
    "time": [
      "2018-11-03T19:31:10.000Z"
    ]
  },
  "sort": [
    1541273470000
  ]
}

edsiper (Member) commented Nov 2, 2018

@StevenACoffman are we OK to merge this one?

edsiper self-assigned this Nov 2, 2018
edsiper added the enhancement and waiting-for-user (Waiting for more information, tests or requested changes) labels Nov 2, 2018
StevenACoffman (Contributor, Author) commented Nov 3, 2018

@kskewes If you get a chance to run this through Elasticsearch, I would appreciate a comment with the output for documentation purposes. I have tested it with the Kafka output and it works fine, but I don't have an Elasticsearch cluster set up.

Format Regex
Regex /^(?<time>.+)\b(?<stream>stdout|stderr)\b(?<log>.*)$/
Regex ^(?<time>[^ ]+) (?<stream>stdout|stderr) (?<logtag>[^ ]*) (?<message>.*)$
Time_Key time
Time_Format %Y-%m-%dT%H:%M:%S.%N%:z

karlskewes commented Nov 3, 2018

Is this okay? %N and %:z

Should it be this?
%Y-%m-%dT%H:%M:%S.%L%z

StevenACoffman (Contributor, Author) commented Nov 3, 2018

The colon appears wrong. What is %L? Is it milliseconds as a decimal number [000, 999]? I don't see %L in the strptime documentation. I think I got %N from unix date man page.

The timestamps I see follow this format:

2016-02-17T00:04:05.931087621Z

With both your suggestions I see them in Kafka fine. Perhaps that's why I needed the Time_Keep On before?

Review comment:

Indeed, regarding %L: it's not in that documentation, but it appears to be a custom Fluent Bit option for nanoseconds.
See bottom of this page: https://fluentbit.io/documentation/0.14/parser/
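
To make the fractional-seconds question concrete, here is a Python sketch that parses a timestamp of the shape quoted above. It is not how Fluent Bit implements `%L`. Python's own `strptime` accepts at most six fractional digits with `%f`, so the sketch splits the fraction off by hand. The function name is hypothetical, and only the `Z` (UTC) suffix is handled.

```python
from datetime import datetime, timezone


def parse_cri_timestamp(text):
    """Split an RFC 3339 UTC timestamp into (whole-second datetime, nanoseconds)."""
    if not text.endswith("Z"):
        raise ValueError("this sketch only handles the UTC 'Z' suffix")
    whole, _, fraction = text[:-1].partition(".")
    seconds = datetime.strptime(whole, "%Y-%m-%dT%H:%M:%S").replace(tzinfo=timezone.utc)
    # Right-pad to nine digits so "45487617" means 0.45487617 s, not 45487617 ns.
    nanoseconds = int(fraction.ljust(9, "0")[:9]) if fraction else 0
    return seconds, nanoseconds
```

The padding step matters because the sample log lines show that the runtime does not always write all nine fractional digits.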

# http://rubular.com/r/izM6olvshn
Name crio
# http://rubular.com/r/tjUt3Awgg4
Name cri
Format Regex

Review comment:

We could make this `regex`, although a commit that makes this case-insensitive has already been merged.

Format Regex
Regex /^(?<time>.+)\b(?<stream>stdout|stderr)\b(?<log>.*)$/
Regex ^(?<time>[^ ]+) (?<stream>stdout|stderr) (?<logtag>[^ ]*) (?<message>.*)$
Time_Key time
Time_Format %Y-%m-%dT%H:%M:%S.%N%:z
Time_Keep On

karlskewes commented Nov 3, 2018

Time_Keep On seems to break Elasticsearch for me. Could be my K8s filter or ES output?
Fluent-Bit logs this error.

[2018/11/03 19:26:13] [ warn] [out_es] Elasticsearch error
{"took":150,"errors":true,"items":[{"index":{"_index":"logstash-2018.11.03","_type":"flb_type","_id":"_vEK22YBtlziBPpiPhwA","status":400,"error":{"type":"mapper_parsing_exception","reason":"failed to parse","caused_by":{"type":"i_o_exception","reason":"Duplicate field 'time'\n at [Source: org.elasticsearch.common.bytes.BytesReference$MarkSupportingStreamInputWrapper@2d5483; line: 1, column: 43]"}}}},{"index":{"_index":"logstash-2018.11.03","_type":"flb_type","_id":"__EK22YBtlziBPpiPhwA","status":400,"error":{"type":"mapper_parsing_exception","reason":"failed to parse","caused_by":{"type":"i_o_exception","reason":"Duplicate field 'time'\n at [Source: org.elasticsearch.common.bytes.BytesReference$MarkSupportingStreamInputWrapper@4988de6b; line: 1, column: 43]"}}}},{"index":{"_index":"logstash-2018.11.03","_type":"flb_type","_id":"APEK22YBtlziBPpiPh0A","status":400,"error":{"type":"mapper_parsing_exception","reason":"failed to parse","caused_b

StevenACoffman (Contributor, Author) commented:

Doesn't make a difference for my use case, so I can happily remove that

Review comment:

Thanks. Do you end up with additional time fields in Kafka?
I can also remove it locally. I wonder whether one of the maintainers, or someone with more experience with this field, can advise. I haven't been able to grok the use case.
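
The "Duplicate field 'time'" error above suggests that the same key appeared twice in one JSON object sent to Elasticsearch. Whether `Time_Keep On` is the cause is not confirmed in this thread. The Python sketch below only shows one way to spot such duplicates in a serialized record before it is shipped. The helper name and the sample payload are hypothetical.

```python
import json


def find_duplicate_keys(document):
    """Return keys that occur more than once within the same JSON object."""
    duplicates = []

    def check_pairs(pairs):
        # Called once per JSON object with its (key, value) pairs in order.
        seen = set()
        for key, _ in pairs:
            if key in seen:
                duplicates.append(key)
            seen.add(key)
        return dict(pairs)

    json.loads(document, object_pairs_hook=check_pairs)
    return duplicates


# Made-up payload with 'time' repeated at the top level.
sample = '{"time": "2018-11-03T19:26:13Z", "stream": "stderr", "time": "2018-11-03T19:26:13Z"}'
repeated = find_duplicate_keys(sample)
```

A plain `json.loads` would silently keep the last value, which is why the sketch hooks into the pair list instead.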

karlskewes commented:

With the above changes I get this in Elasticsearch:

{
  "_index": "logstash-2018.11.03",
  "_type": "flb_type",
  "_id": "sdIP22YBrbPWNMRHKBwz",
  "_version": 1,
  "_score": null,
  "_source": {
    "time": "2018-11-03T19:31:10.000Z",
    "stream": "stderr",
    "logtag": "F",
    "message": "\u001b[1mFluent-Bit v0.14.6\u001b[0m",
    "kubernetes": {
      "pod_name": "fluent-bit-wjnfq",
      "namespace_name": "dev-logging",
      "pod_id": "0293ad2d-df9f-11e8-8914-2e491e409191",
      "labels": {
        "controller-revision-hash": "1569803976",
        "k8s-app": "fluent-bit-logging",
        "kubernetes_io/cluster-service": "true",
        "pod-template-generation": "1",
        "version": "v1"
      },
      "annotations": {
        "prometheus_io/path": "/api/v1/metrics/prometheus",
        "prometheus_io/port": "2020",
        "prometheus_io/scrape": "true"
      },
      "host": "10.63.56.245",
      "container_name": "fluent-bit",
      "docker_id": "013362a984722805ede9cf323e34919890ea44ea0004897d3ed93178116676a2"
    }
  },
  "fields": {
    "time": [
      "2018-11-03T19:31:10.000Z"
    ]
  },
  "sort": [
    1541273470000
  ]
}

StevenACoffman (Contributor, Author) commented Nov 4, 2018

Argh. My parser configuration is applied via a ConfigMap rather than baked into the Docker image, so yesterday I tried copy-pasting what I was running in my cluster using my phone. That was ill-advised, and I'm sorry for the confusion. This corrects the mistake. Thank you @kskewes

karlskewes commented:

That would be tricky!
No problem. Was a good opportunity to revisit the docs and double check what we're doing here. Thanks~

StevenACoffman (Contributor, Author) commented:

@edsiper This is okay to merge now.

edsiper merged commit 767cc53 into fluent:master Nov 6, 2018

edsiper (Member) commented Nov 6, 2018

thanks!
