
Add LogDir #13010

Closed
wants to merge 1 commit into from

Conversation

wulonghui
Contributor

LogDir

Abstract

A proposal for implementing LogDir. A LogDir can preserve and manage a pod's log files.

Several existing issues and PRs were already created regarding that particular subject:

  • Collect logfiles inside the container #12567
  • Add LogDir #13010

Author: WuLonghui (@wulonghui)

Motivation

Some applications write their logs to files, so we need to add LogDir. It can attach useful information that makes it easy for a logging agent to collect the logs, and it also lets us manage the log files (e.g. log rotation and maximum log file sizes).

Implementation

Add LogDir as a policy on the api.VolumeMount struct:

// VolumeMount describes a mounting of a Volume within a container.
type VolumeMount struct {
    // Required: This must match the Name of a Volume [above].
    Name string `json:"name"`
    // Optional: Defaults to false (read-write).
    ReadOnly bool `json:"readOnly,omitempty"`
    // Required.
    MountPath string `json:"mountPath"`
    // Optional: Policies of VolumeMount.
    Policy *VolumeMountPolicy `json:"policy,omitempty"`
}

// VolumeMountPolicy describes policies of a VolumeMount.
type VolumeMountPolicy struct {
    // Optional: LogDir policy.
    LogDir *LogDirPolicy `json:"logDir,omitempty"`
}

// LogDirPolicy describes a policy for a logDir, including log rotation and maximum log file size.
type LogDirPolicy struct {
    // Optional: Glob pattern of log files.
    Glob string `json:"glob,omitempty"`
    // Optional: Log rotation.
    Rotate string `json:"rotate,omitempty"`
    // Optional: Maximum log file size.
    MaxFileSize int `json:"maxFileSize,omitempty"`
}

If a user sets the LogDir policy on a VolumeMount in a container:

apiVersion: v1
kind: Pod
metadata:
  name: my-app
spec:
  containers:
  - name: container1
    image: ubuntu:14.04
    command: ["bash", "-c", "i=\"0\"; while true; do echo \"`hostname`_container1: $i \"; date --rfc-3339 ns >> /varlog/container1.log; sleep 4; i=$[$i+1]; done"]
    volumeMounts:
    - name: log-storage
      mountPath: /varlog
      policy:
        logDir: {}
    securityContext:
        privileged: true
  volumes:
  - name: log-storage
    emptyDir: {}

The kubelet will create a symbolic link to the volume path under /var/log/containers:

/var/log/containers/<pod_full_name>_<container_name> => <volume_path>

A logging agent (e.g. Fluentd) can then watch /var/log/containers on the host to collect the log files and tag them with <pod_full_name>_<container_name>, which can be used as a search term in Elasticsearch or as a label in Cloud Logging.
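
For illustration only, here is a minimal sketch of how the kubelet might create such a link. The function and variable names below are assumptions made for this sketch, not the kubelet's actual code:

// Illustrative sketch only; not the kubelet's actual implementation.
package main

import (
    "fmt"
    "log"
    "os"
    "path/filepath"
)

// linkLogDir creates /var/log/containers/<pod_full_name>_<container_name>
// as a symbolic link pointing at the volume path backing the LogDir mount.
func linkLogDir(podFullName, containerName, volumePath string) error {
    linkDir := "/var/log/containers"
    if err := os.MkdirAll(linkDir, 0755); err != nil {
        return err
    }
    link := filepath.Join(linkDir, fmt.Sprintf("%s_%s", podFullName, containerName))
    // Remove a stale link left over from a previous pod instance, if any.
    if _, err := os.Lstat(link); err == nil {
        if err := os.Remove(link); err != nil {
            return err
        }
    }
    return os.Symlink(volumePath, link)
}

func main() {
    // Example values; in the kubelet these would come from the pod spec and volume manager.
    volumePath := "/var/lib/kubelet/pods/example-pod-uid/volumes/kubernetes.io~empty-dir/log-storage"
    if err := linkLogDir("my-app_default", "container1", volumePath); err != nil {
        log.Fatal(err)
    }
}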

Integrated with Fluentd

We can use Fluentd to collect the log files in a LogDir. Fluentd should be installed on each Kubernetes node and will watch the LogDir root /var/log/containers; the Fluentd configuration is as follows:

<source>
  type tail
  format none
  time_key time
  path /var/log/containers/*/**/*.log
  pos_file /lib/pods.log.pos
  time_format %Y-%m-%dT%H:%M:%S
  tag reform.*
  read_from_head true
</source>

<match reform.**>
  type record_reformer
  enable_ruby true
  tag kubernetes.${hostname}.${tag_suffix[4]}
</match>

<match **>
  type stdout
</match>

Fluentd will tail any files under the LogDir root /var/log/containers and add the tag kubernetes.<node_host_name>.<pod_full_name>_<container_name>.<file_name>. Here we only print the logs to stdout, but they can also be forwarded to a log storage endpoint (e.g. Elasticsearch); an illustrative sketch follows the sample output below.

Then the Fluentd prints the logs to stdout:

2015-09-11 11:00:10 +0000 kubernetes.8f5cd4af528a.my-app_default_container1.container1.log: {"message":"2015-09-11 10:59:53.331748730+00:00"}
2015-09-11 11:00:10 +0000 kubernetes.8f5cd4af528a.my-app_default_container1.container1.log: {"message":"2015-09-11 10:59:57.335719322+00:00"}
2015-09-11 11:00:10 +0000 kubernetes.8f5cd4af528a.my-app_default_container1.container1.log: {"message":"2015-09-11 11:00:01.339536181+00:00"}
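
For illustration only, the final <match **> block in the configuration above could be replaced with an Elasticsearch output (via fluent-plugin-elasticsearch); the host, port, and settings below are placeholders, not part of this proposal:

<match kubernetes.**>
  type elasticsearch
  host elasticsearch.example.com   # placeholder endpoint
  port 9200
  logstash_format true
  flush_interval 10s
</match>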

Future work

  • Be able to limit maximum log file sizes
  • Be able to support log rotation (a hypothetical enforcement sketch follows below)
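
For illustration only, enforcement of these policies could be delegated to a node-level logrotate configuration generated per LogDir; the stanza below is a hypothetical sketch under that assumption, not part of this proposal:

/var/log/containers/<pod_full_name>_<container_name>/*.log {
    daily           # would correspond to LogDirPolicy.Rotate
    size 100M       # would correspond to LogDirPolicy.MaxFileSize
    rotate 5
    copytruncate    # rotate without requiring the app to reopen its file
    missingok
    compress
}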

@k8s-bot

k8s-bot commented Aug 21, 2015

Can one of the admins verify that this patch is reasonable to test? (reply "ok to test", or if you trust the user, reply "add to whitelist")

If this message is too spammy, please complain to ixdy.

@brendandburns
Contributor

Thanks for taking the time to do a PR.

Rather than re-using the EmptyDirVolumeSource I think we're going to need a different LogDirVolumeSource since users will want to customize behaviors like log rotation and maximum log file sizes, etc.

How do you plan on integrating this with the fluentd logger that is onboard each node?

It might be most effective to add a short (1-2 page) design document for the complete solution to this PR. You don't have to implement it all in 1 PR, but having a design document to guide the implementation will help both with the review, as well as with splitting up the work.

Again, many thanks for taking this on.

--brendan

@brendandburns brendandburns added the area/kubelet and area/api labels Aug 21, 2015
@brendandburns
Contributor

Also you need to run

./hack/update-generated-conversions.sh
./hack/update-generated-deep-copies.sh

To regenerate some of the API conversion code that is generated from types.go

@brendandburns brendandburns self-assigned this Aug 21, 2015
@wulonghui
Contributor Author

@brendandburns
Thanks for your advice, I will add a new LogDirVolumeSource.
Also, where should I put the design document?

@pmorie
Member

pmorie commented Aug 21, 2015

@wulonghui agree that this should be a new volume source. The design document can start out in docs/proposals -- it will move to docs/design when accepted.

@wulonghui
Contributor Author

@brendandburns
After running ./hack/update-generated-conversions.sh, the go build fails:

# k8s.io/kubernetes/pkg/api/v1
_output/local/go/src/k8s.io/kubernetes/pkg/api/v1/conversion.go:239: undefined: convert_api_Volume_To_v1_Volume
_output/local/go/src/k8s.io/kubernetes/pkg/api/v1/conversion.go:303: undefined: convert_v1_Volume_To_api_Volume
!!! Error in /home/travis/gopath/src/github.com/kubernetes/kubernetes/hack/lib/golang.sh:387
  'go install "${goflags[@]:+${goflags[@]}}" -ldflags "${goldflags}" "${nonstatics[@]:+${nonstatics[@]}}"' exited with status 2

The newly generated code doesn't include

convert_api_Volume_To_v1_Volume
convert_v1_Volume_To_api_Volume

What should I do?

@brendandburns
Contributor

Can you reset to HEAD all of the *_generated.go files and try again? I'm not sure why the Volumes generated code was removed.

@wulonghui
Contributor Author

@brendandburns
I figured out why the Volumes generated code was removed:

couldn't find a corresponding field LogDir in v1.VolumeSource

Adding LogDir to v1.VolumeSource fixes it.

* LogDir: <kubelet_root_dir>/pods/<pod_id>/volumes/kubernetes.io~log-dir/<pod_full_name>/<volume_name>

As we can see, compared to the `emptyDir` volume this adds a directory level, `pod_full_name`; the purpose is to make it easy for a logging agent (e.g. fluentd) to add a tag by `pod_full_name`.
Member

I'd like to see how we configure fluentd or other collectors to find these logs. I'm concerned that the coupling may be too loose (it's a side-effect of the pod spec) and also end up hardcoding things like the kubelet root dir (or else adding more flags that have to be synchronized across configs).

I wonder if this might make more sense to promote to a full API construct - something like:

pod:
    spec:
        containers:
            - name: foo
              logDir: /var/log/foo
        logPolicy:
            rotate: Daily

I just threw that out, it's probably not at all the form we would want.

@saad-ali
Member

CC

@wulonghui
Contributor Author

@brendandburns @thockin @saad-ali
I added the configuration of Fluentd; see integrated-with-fluentd.
Is this clear?

@wulonghui
Contributor Author

@brendandburns @thockin
Could someone check the design, so that I can update it to address any problems?
^-^

@k8s-github-robot k8s-github-robot added the size/XXL label Aug 27, 2015
@brendandburns
Contributor

Design LGTM at a high level. Go ahead and code it up, please add an e2e test to validate that it works.

Thanks again!
--brendan

@saad-ali
Member

I realize that there is a pain point you are trying to solve here. Namely that logs generated by containers don't end up in a persistent store. But I am not convinced that creating a new volume plugin is the solution. A volume plugin is the abstraction we use to enable new types of block storage. Log collection, log rotation, etc., I believe, are orthogonal to that and irrelevant to the volume layer.

@wulonghui
Contributor Author

@saad-ali
Similar to the EmptyDir volume, LogDir lets users store log files, and it also handles things that other volumes can't do. You could say it's a kind of storage.
What better solution do you have in mind?

format none
time_key time
path /var/lib/kubelet/pods/*/volumes/kubernetes.io~log-dir/**/*.log
pos_file /lib/pods.log.pos
Contributor

I assume you mean /var/lib/... ?

Contributor Author

yes, this is a mistake.

@a-robinson
Contributor

Like @saad-ali, I was kind of surprised to see this as a volume plugin rather than as a separate subfield of container, but can see that it does make some sense. It's somewhat symmetric to the git-puller, which is basically a read-only mount that pulls data from elsewhere, whereas this is effectively a write-only mount that pushes data to elsewhere. However, it's a very reasonable argument to say that the git-puller doesn't really belong as a volume plugin either.

The alternative we've considered in the past is to make this more of a first-class field in the Container struct, where the user would specify the log paths they want collected (and related options), without having to define a volume and a volumeMount. @bgrant0607 would be a good person to run that thought by.
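
For illustration only, a hypothetical sketch of what such a Container-level field might look like; the logDirs field and its sub-fields below are assumptions, not an accepted API:

apiVersion: v1
kind: Pod
metadata:
  name: my-app
spec:
  containers:
  - name: container1
    image: ubuntu:14.04
    # Hypothetical fields, for illustration only.
    logDirs:
    - path: /varlog
      glob: "*.log"
      rotate: Daily
      maxFileSize: 100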

@wulonghui
Contributor Author

@brendandburns @a-robinson @saad-ali
Adding LogDir to the Container struct sounds like a good fit for this issue; I will check it and give it a try.
Thanks for all your ideas.

@wulonghui wulonghui changed the title from "Add LogDir volume plugin" to "Add LogDir" Aug 31, 2015
@wulonghui
Contributor Author

@brendandburns @a-robinson @saad-ali
I added a simple design for the alternative approach, so now we have both designs.

Could all of you check the design?

Thanks

@saad-ali
Member

Adding a LogDir field in the Container struct makes a lot more sense to me.

@wulonghui
Contributor Author

@a-robinson Yes, I want to get this done, but I have been busy recently. Would you take this over? I would like to help you.


## Motivation

Some applications write their logs to files, so we need to add `LogDir`. It can attach useful information that makes it easy for a logging agent to collect the logs, and it also lets us manage the log files (e.g. log rotation and maximum log file sizes).
Member

Do we really need to support apps that want to write to files? Isn't the docker way to write logs to stdout? Docker has a fluentd logging driver - can that obviate this whole proposal?

Contributor

The docker way certainly is to write logs to stdout, but this is a frequently recurring feature request from folks whose applications weren't necessarily built to be run in a container. It's come up as an important issue for a handful of companies I've talked to, but cc @aronchick for better product perspective on the importance of this.

The docker logging drivers are orthogonal to collecting logs from files -- they just let you take the logs from stdout and do something with them other than have them written to a file.

Contributor

As someone who has hacked about 500 different ways trying to get apache to do this, I can see a lot of motivation here - definitely a frequent request. As part of supporting legacy apps, I'd like to pick something like this up.

@thockin
Member

thockin commented Jan 6, 2016

Overall I think the approach is ok, I just want to understand the bounds of it and how much this encompasses.

@k8s-bot

k8s-bot commented Jan 28, 2016

Can one of the admins verify that this patch is reasonable to test? (reply "ok to test", or if you trust the user, reply "add to whitelist")

If this message is too spammy, please complain to ixdy.

2 similar comments

@priyankporwal

@a-robinson / @wulonghui, I am interested in taking this PR forward. We have a lot of existing services that pre-date Docker, and we are hoping to migrate them to a hosted Kubernetes cluster. Having something like this for log-file collection would significantly ease their migration pain. Please let me know if you are OK with me working on it. Thanks!

@fejta fejta assigned thockin and unassigned brendandburns Apr 26, 2016
@fejta
Contributor

fejta commented Apr 26, 2016

This PR has had no meaningful activity for multiple months. If it is still valid please rebase, push a new commit and reopen the PR. Thanks!

@harryge00
Contributor

harryge00 commented Aug 9, 2016

Has there been any progress on this PR? This is a useful feature.

@itthought

Is logDir related support implemented?

@thockin
Member

thockin commented Sep 6, 2016

This did not get implemented. If someone wants to resurrect it, I think we can get it through for 1.5...


@itthought

It would be a great help, as this would solve my logging issues.

@ddysher
Contributor

ddysher commented Sep 6, 2016

@itthought do you expect to work on this? We've seen a lot of requests for this feature. Currently we just use a sidecar (a minimal sketch of that pattern is below), but we are concerned about its performance and potential resource consumption, since we'd have to run a sidecar per pod (possible resource waste, as we give it a 100m/100M resource request and limit).

/cc @harryge00 since you asked this before.
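
For context, a minimal sketch of the sidecar pattern mentioned above; the image names, paths, and resource values are illustrative assumptions only:

apiVersion: v1
kind: Pod
metadata:
  name: my-app
spec:
  containers:
  - name: app
    image: my-app:latest            # illustrative image
    volumeMounts:
    - name: logs
      mountPath: /var/log/app
  - name: log-agent                 # sidecar tailing the shared log volume
    image: fluent/fluentd:v0.12     # illustrative image/tag
    resources:
      requests:
        cpu: 100m
        memory: 100Mi
      limits:
        cpu: 100m
        memory: 100Mi
    volumeMounts:
    - name: logs
      mountPath: /var/log/app
      readOnly: true
  volumes:
  - name: logs
    emptyDir: {}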

@ae6rt
Contributor

ae6rt commented Sep 16, 2016

This issue has been closed, but to where did the energy behind it migrate?

@laura-herrera

I am very interested in this feature, as I think it would solve all of my logging problems (broken stack traces, etc.).
I followed the thread "Kubernetes logging, journalD, fluentD, and Splunk, oh my!" #24677, and I thought everyone had agreed and it was going to be implemented.
Is there any plan to add this?

@thockin
Member

thockin commented Oct 25, 2016

This needs a champion to drive it forward in the near term.


@coffeepac
Contributor

If anyone is interested in this or other logging-in-k8s topics, please join SIG-Instrumentation, where we will cover this and more. If you are going to be at KubeCon in Seattle this November, please consider attending the logging face-to-face meeting. Details for that can be found in the SIG-Instrumentation Google group; it is currently the most recent thread, and it's a fairly low-volume group, so it should still be near the top.

@piosz
Member

piosz commented Nov 28, 2016

cc @crassirostris @fgrzadkowski

@bamb00

bamb00 commented May 5, 2017

Hi,

What is the status of the logDir feature? Will the feature be in v1.1?

Thanks.

@crassirostris

@bamb00 Currently there's no plan to implement this feature as it is in this PR. There will be a proposal covering logging architecture in Kubernetes, where this aspect will be explicitly addressed, whether or not it gets implemented. Stay tuned and look for related proposals in the community repo.

@devasat

devasat commented Jun 15, 2018

I'm interested in this feature too. I just need a pointer on the approach to use.

Below is the question I posted on SO.
https://stackoverflow.com/questions/50861591/kubernetes-how-to-aggregate-application-logs

I have a microservice deployed in a Tomcat container/pod. There are four different files generated in the container: access.log, tomcat.log, catalina.out, and application.log (log4j output). What is the best approach to send these logs to Elasticsearch (or a similar platform)?

I read through the information on this page: Logging Architecture - Kubernetes. Is “Sidecar container with a logging agent” the best option for my use case?

Is it possible to fetch pod labels (e.g. version) and add them to each line? If so, should I use a logging agent like fluentd? (I just want to know the direction I should take.)
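
For reference only, one illustrative direction (not an official recommendation): the fluent-plugin-kubernetes_metadata_filter plugin can enrich each Fluentd record with pod metadata such as labels and namespace:

<filter kubernetes.**>
  @type kubernetes_metadata
</filter>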

@thockin
Member

thockin commented Jul 6, 2018

I didn't hate this proposal, but like so many things we have on our collective plate, it needs someone to own it to completion, and nobody has stepped up yet.

Labels
  • area/api: Indicates an issue on api area.
  • area/kubelet
  • kind/api-change: Categorizes issue or PR as related to adding, removing, or otherwise changing an API.
  • kind/design: Categorizes issue or PR as related to design.
  • needs-rebase: Indicates a PR cannot be merged because it has merge conflicts with HEAD.
  • size/L: Denotes a PR that changes 100-499 lines, ignoring generated files.