[aws-for-fluent-bit] Log Group retention time setting not working. #436

Open
luisamador opened this issue Jan 27, 2021 · 8 comments · May be fixed by #1168
Labels
bug Something isn't working

Comments

@luisamador

luisamador commented Jan 27, 2021

Describe the bug
The option "cloudWatch.logRetentionDays" doesn't set the log retention days setting of the resulting CloudWatch log group.

Steps to reproduce

cloudWatch:
  region: us-east-1
  logGroupName: /aws/eks/blah/fluentbit-cloudwatch
  logRetentionDays: 3
  logKey: log
firehose:
  enabled: false
kinesis:
  enabled: false
elasticsearch:
  enabled: false

Expected outcome
The resulting log group should have a retention policy of 3 days. However it is set with a "Never expire" retention policy.

Environment

  • Chart name: aws-for-fluent-bit
  • Chart version: 0.1.5
  • Kubernetes version: EKS 1.18
@seansabour

I'm also running into this issue.

@PettitWesley
Contributor

The plugin used to only set the log retention on new log groups. This means if you have run the same config before then the log group might already exist, and the plugin will not update the retention.

We updated this recently and released it in AWS for Fluent Bit 2.10.0 for the cloudwatch plugin: aws/amazon-cloudwatch-logs-for-fluent-bit#121
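For a log group that already exists, the retention can also be applied out-of-band. The following is only a minimal sketch of that idea, not the plugin's implementation: the helper name `ensure_retention` is hypothetical, and `logs_client` is assumed to behave like `boto3.client("logs")`, whose `describe_log_groups` and `put_retention_policy` calls are used here. Pagination is ignored for brevity.

```python
def ensure_retention(logs_client, log_group_name, retention_days):
    """Set the retention policy on an existing log group if it differs.

    `logs_client` is assumed to behave like boto3.client("logs").
    Returns True if put_retention_policy was called, False if the
    group already had the requested retention.
    """
    resp = logs_client.describe_log_groups(logGroupNamePrefix=log_group_name)
    for group in resp.get("logGroups", []):
        if group["logGroupName"] != log_group_name:
            continue  # the API matches by prefix; skip other groups
        # "retentionInDays" is absent when the group is set to "Never expire".
        if group.get("retentionInDays") == retention_days:
            return False
        logs_client.put_retention_policy(
            logGroupName=log_group_name,
            retentionInDays=retention_days,
        )
        return True
    raise LookupError(f"log group not found: {log_group_name}")


# Example usage (requires boto3 and AWS credentials):
#   import boto3
#   client = boto3.client("logs", region_name="us-east-1")
#   ensure_retention(client, "/aws/eks/blah/fluentbit-cloudwatch", 3)
```

The caller's role needs both `logs:DescribeLogGroups` and `logs:PutRetentionPolicy` for this to succeed.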

@JonathanLachapelle

I also have the same issue for both existing and new log groups.

@barantomasz83

In my case the cause was a missing action in the AWS IAM policy used by the Fluent Bit pods. Adding "logs:PutRetentionPolicy" solved the problem.

@illagrenan

I have the same problem in version 2.21.4. Retention for new and existing log groups is always set to Never.

@illagrenan

The problem was indeed the missing logs:PutRetentionPolicy permission. I use eksctl to manage my EKS cluster, and all my node groups have this IAM configuration (ref.: https://eksctl.io/usage/iam-policies/#supported-iam-add-on-policies):

nodeGroups:
  - ...
    iam:
      withAddonPolicies:
        cloudWatch: true

In practice, nodes have this policy: arn:aws:iam::aws:policy/CloudWatchAgentServerPolicy. It contains the following:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "cloudwatch:PutMetricData",
                "ec2:DescribeVolumes",
                "ec2:DescribeTags",
                "logs:PutLogEvents",
                "logs:DescribeLogStreams",
                "logs:DescribeLogGroups",
                "logs:CreateLogStream",
                "logs:CreateLogGroup"
            ],
            "Resource": "*"
        },
        {
            "Effect": "Allow",
            "Action": [
                "ssm:GetParameter"
            ],
            "Resource": "arn:aws:ssm:*:*:parameter/AmazonCloudWatch-*"
        }
    ]
}

And that's the problem: these permissions are insufficient, because logs:PutRetentionPolicy is not among them.
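A quick way to confirm which actions a policy document lacks is to compare it against the list of actions you expect. The sketch below is a simplified check, not an IAM evaluator: the helper `missing_actions` and the `required` list are assumptions for illustration, and it ignores `Resource`, `Condition`, `NotAction`, and explicit `Deny` statements. The policy is the first statement quoted above.

```python
from fnmatch import fnmatchcase


def missing_actions(policy, required):
    """Return the required actions not matched by any Allow statement.

    Simplified: only looks at Effect/Action, compares case-insensitively,
    and treats "*" and "?" in policy actions as wildcards.
    """
    statements = policy.get("Statement", [])
    if isinstance(statements, dict):
        statements = [statements]
    allowed = []
    for stmt in statements:
        if stmt.get("Effect") != "Allow":
            continue
        actions = stmt.get("Action", [])
        if isinstance(actions, str):
            actions = [actions]
        allowed.extend(a.lower() for a in actions)
    return [r for r in required
            if not any(fnmatchcase(r.lower(), pat) for pat in allowed)]


# Log-related part of the policy quoted in this comment.
policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Action": [
            "cloudwatch:PutMetricData", "ec2:DescribeVolumes",
            "ec2:DescribeTags", "logs:PutLogEvents",
            "logs:DescribeLogStreams", "logs:DescribeLogGroups",
            "logs:CreateLogStream", "logs:CreateLogGroup",
        ],
        "Resource": "*",
    }],
}

# Illustrative list of actions (an assumption, not an official requirement list).
required = ["logs:CreateLogGroup", "logs:CreateLogStream",
            "logs:PutLogEvents", "logs:PutRetentionPolicy"]

print(missing_actions(policy, required))  # ['logs:PutRetentionPolicy']
```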

@trallnag

@illagrenan, exactly. The AWS documentation needs to be updated.

@mattduguid

mattduguid commented Jun 7, 2022

We have been seeing a similar issue where "log_retention_days" was not being applied to our "additionalOutputs", and the log groups stayed at "Never expire".

Versions:

  • helm chart aws_for_fluentbit = "0.1.17"
  • image aws_for_fluentbit = "2.9.0" (this worked; with higher versions we hit a new, different issue that we still need to solve)

Extract from aws-for-fluentbit-values.yaml

***etc***
additionalOutputs: |
  [OUTPUT]
      Name                cloudWatch
      Enabled             true
      Match               ebs-csi.*
      Region              ap-southeast-2
      Log_Group_Name      /aws/eks/container-workload/xxxxx-ebs-csi
      Log_Stream_Prefix   fluentbit-
      Log_Retention_Days  14
      Auto_Create_Group   true

  [OUTPUT]
      Name                cloudWatch
      Enabled             true
      Match               xxxxx-sm.*
      Region              ap-southeast-2
      Log_Group_Name      /aws/eks/container-workload/xxxxx-xxxxx-sm
      Log_Stream_Prefix   fluentbit-
      Log_Retention_Days  14
      Auto_Create_Group   true
***etc***

After reading the previous posts, I observed that the permission "logs:PutRetentionPolicy" is not there by default. If I add it manually and then rerun the pipeline, the permission is removed again. It should be added to the permanent list.

Error from the logs when trying to set log_retention_days:

time="2022-06-08T05:27:14Z" level=error msg="AccessDeniedException: User: arn:aws:sts::************:assumed-role/container-workload-aws-for-fluent-bit-sa-irsa/1654666034225831554 is not authorized to perform: logs:PutRetentionPolicy on resource: arn:aws:logs:ap-southeast-2:************:log-group:/aws/eks/container-workload/*****-cert-manager:log-stream: because no identity-based policy allows the logs:PutRetentionPolicy action\n\tstatus code: 400, request id: c2209996-****-4ae8-*****-03d4272f16f6" func="github.com/aws/amazon-cloudwatch-logs-for-fluent-bit/cloudwatch.(*OutputPlugin).AddEvent()" file="cloudwatch.go:340"
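When scanning Fluent Bit pod logs for this kind of failure, the denied action and resource can be pulled out of the message with a regular expression. This is a small sketch that assumes the message wording shown in the log line above; the pattern and the helper `denied_action` are illustrative, not part of any AWS or Fluent Bit tooling.

```python
import re

# Matches the "is not authorized to perform: <action> on resource: <arn>"
# wording seen in the AccessDeniedException message above.
DENIED = re.compile(
    r"is not authorized to perform: (?P<action>\S+) "
    r"on resource: (?P<resource>\S+)"
)


def denied_action(line):
    """Return (action, resource) if the line reports a denied action, else None."""
    match = DENIED.search(line)
    if match is None:
        return None
    return match.group("action"), match.group("resource")


sample = (
    "AccessDeniedException: User: arn:aws:sts::************:assumed-role/"
    "container-workload-aws-for-fluent-bit-sa-irsa/1654666034225831554 "
    "is not authorized to perform: logs:PutRetentionPolicy on resource: "
    "arn:aws:logs:ap-southeast-2:************:log-group:"
    "/aws/eks/container-workload/*****-cert-manager:log-stream: "
    "because no identity-based policy allows the logs:PutRetentionPolicy action"
)

print(denied_action(sample)[0])  # logs:PutRetentionPolicy
```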

I manually added the missing permission back, deleted the log groups so that they would be recreated, and restarted the Fluent Bit DaemonSet. The log groups were recreated and the log retention is now set correctly.

