Add awscloudwatch Filebeat input #16524

Closed

Conversation

kaiyan-sheng
Contributor

@kaiyan-sheng kaiyan-sheng commented Feb 24, 2020

[Paused; will research more before continuing on this path.]

What does this PR do?

Why is it important?

Checklist

  • My code follows the style guidelines of this project
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • I have made corresponding change to the default configuration files
  • I have added tests that prove my fix is effective or that my feature works

Author's Checklist

  • [ ]

How to test this PR locally

Related issues

Use cases

Screenshots

Logs

@kaiyan-sheng kaiyan-sheng self-assigned this Feb 24, 2020
@andresrc andresrc added the [zube]: Inbox, [zube]: In Progress, and Team:Platforms labels and removed the [zube]: Inbox label Mar 29, 2020
@elasticmachine
Collaborator

Pinging @elastic/integrations-platforms (Team:Platforms)

@exekias
Contributor

exekias commented Mar 31, 2020

One thing I'm wondering about: my understanding is that the GetLogEvents API (https://docs.aws.amazon.com/AmazonCloudWatchLogs/latest/APIReference/API_GetLogEvents.html) will require us to store some checkpoint (in the form of a next token or a last timestamp). I'm not sure we are able to do this today without the new registry in place, so we may need to wait for it. Any thoughts on this?

@kaiyan-sheng kaiyan-sheng self-assigned this Mar 31, 2020
@kaiyan-sheng
Contributor Author

@exekias I imagine this Filebeat input would be pure polling at a given frequency/period. Assume period=5min: every 5 minutes, GetLogEvents will be called using NewGetLogEventsPaginator to paginate through all results. Then, 5 minutes later, the next collection starts with a new StartTime and EndTime. I don't think we need to store any next token or timestamp in this case. But that also assumes all logs for the window are collected successfully within those 5 minutes. If a user has a larger volume of logs, the period should be set to a longer duration. WDYT?

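input := &cloudwatchlogs.GetLogEventsInput{
	StartTime:     <current_timestamp - 5min>, // milliseconds since the Unix epoch
	EndTime:       <current_timestamp>,        // milliseconds since the Unix epoch
	Limit:         <100, 1000, or up to the max of 10,000, depending on the size of the logs>,
	LogGroupName:  <log_group_name>,
	LogStreamName: <log_stream_name>,
}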

@exekias
Contributor

exekias commented Mar 31, 2020

That could be an option for an initial release, but at some point I would expect users to need support for downtime in Filebeat without losing logs. We should be able to go back to where we were reading and start again from there.

If you think about it, this is the behavior you have today with the S3 input, where we guarantee that all events are actually sent to ES: if something fails (i.e. Filebeat dies), we will retry the ones that were caught in flight.
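
As a rough illustration of "going back to where we were reading", the sketch below keeps a small checkpoint per log stream and uses it to pick the StartTime after a restart. Everything here is hypothetical: the checkpoint type, its JSON field names, and resumeStart are made up for this example and do not reflect Filebeat's registry format. It assumes the checkpoint is only advanced after the events of a window have been acknowledged by the output.

```go
package main

import "encoding/json"

// checkpoint is a hypothetical record for this sketch; Filebeat's registry
// format is not shown in this thread.
type checkpoint struct {
	LogGroup  string `json:"log_group"`
	LogStream string `json:"log_stream"`
	// NextStartMs is the first instant (ms since the Unix epoch) whose
	// events have not yet been acknowledged by the output.
	NextStartMs int64 `json:"next_start_ms"`
}

// encodeCheckpoint serializes a checkpoint so it can be persisted.
func encodeCheckpoint(c checkpoint) ([]byte, error) {
	return json.Marshal(c)
}

// decodeCheckpoint restores a checkpoint that was persisted earlier.
func decodeCheckpoint(data []byte) (checkpoint, error) {
	var c checkpoint
	err := json.Unmarshal(data, &c)
	return c, err
}

// resumeStart picks the StartTime for the first poll after a restart.
// saved is nil when no checkpoint has been stored yet, in which case the
// input falls back to reading only the most recent period.
func resumeStart(saved *checkpoint, nowMs, periodMs int64) int64 {
	if saved != nil {
		return saved.NextStartMs
	}
	return nowMs - periodMs
}
```

With something like this in place, downtime longer than one period would mean a larger first window after restart instead of lost logs.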

@kaiyan-sheng
Contributor Author

Yep, I agree: only querying the GetLogEvents API to collect logs periodically would be the initial release. Later on, we need to figure out a way to support downtime in Filebeat without losing logs.
Maybe we can also make startTime and endTime optional config parameters. This way, users can query for lost logs if they know the timestamp range. Or we can later add a config option similar to start_position in https://github.com/lukewaite/logstash-input-cloudwatch-logs#start_position to give users the ability to read from the beginning of the log group, from the end, or from an arbitrary number of seconds in the past.

@exekias
Contributor

exekias commented Apr 2, 2020

start_position seems indeed like a good option to start with!

Labels
Team:Platforms Label for the Integrations - Platforms team
4 participants