[Filebeat] Investigate adding cloudwatch input #17292

Closed
kaiyan-sheng opened this issue Mar 27, 2020 · 3 comments · Fixed by #19025
Assignees
Labels
Team:Platforms Label for the Integrations - Platforms team

Comments

@kaiyan-sheng
Contributor

To collect CloudWatch logs, either the GetLogEvents or the FilterLogEvents CloudWatch API can be used. GetLogEvents lists log events from a specified log stream. FilterLogEvents lists log events from a specified log group.

Limitation 1:
Querying logs with these two CloudWatch APIs does not scale well because of the transactions per second (TPS) limits on GetLogEvents and FilterLogEvents, and on the Describe calls used to discover log groups and streams:

  • GetLogEvents API: 10 requests per second per account per region
  • FilterLogEvents API: 5 transactions per second per account per region
  • DescribeLogGroups API: 5 transactions per second per account per region
  • DescribeLogStreams API: 5 transactions per second per account per region
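
To illustrate what staying under a quota like the ones above could look like on the client side, here is a minimal Python sketch of a throttle that spaces calls at least 1/N seconds apart. This is not how the Filebeat input is implemented (Filebeat is written in Go); the class name `MinIntervalThrottle` and its parameters are hypothetical, chosen only for this example.

```python
import time


class MinIntervalThrottle:
    """Spaces calls at least 1/max_per_second seconds apart (hypothetical helper)."""

    def __init__(self, max_per_second, clock=time.monotonic, sleep=time.sleep):
        self.interval = 1.0 / max_per_second
        self.clock = clock    # injectable so the throttle can be tested without waiting
        self.sleep = sleep
        self.next_allowed = None  # earliest time the next call may start

    def wait(self):
        """Block until the next call is allowed, then reserve the following slot."""
        now = self.clock()
        if self.next_allowed is not None and now < self.next_allowed:
            self.sleep(self.next_allowed - now)
            now = self.next_allowed
        self.next_allowed = now + self.interval
```

Calling `throttle.wait()` before every FilterLogEvents request with `MinIntervalThrottle(5)` would keep a single process at or below roughly 5 requests per second. The quota is shared per account per region, so several collectors running in parallel would still need to coordinate.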

Limitation 2:
By default, GetLogEvents and FilterLogEvents operations return as many log events as can fit in 1 MB (up to 10,000 log events), or all the events found within the time range that you specify. If the results include a token, then there are more log events available, and you can get additional results by specifying the token in a subsequent call.
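
The token-following loop described above can be sketched in Python as follows. The keyword names (`logGroupName`, `startTime`, `endTime`, `nextToken`) and the `events`/`nextToken` response keys follow the FilterLogEvents request and response shape as exposed by boto3's `filter_log_events`; `iter_log_events` and `FakeLogsClient` are hypothetical names, and the fake client exists only so the example runs without AWS credentials.

```python
def iter_log_events(client, log_group, start_ms, end_ms):
    """Yield every event in the time range, following nextToken across pages."""
    kwargs = {"logGroupName": log_group, "startTime": start_ms, "endTime": end_ms}
    while True:
        page = client.filter_log_events(**kwargs)
        yield from page.get("events", [])
        token = page.get("nextToken")
        if not token:
            break  # no token means there are no more pages
        kwargs["nextToken"] = token


class FakeLogsClient:
    """Hypothetical stand-in for a CloudWatch Logs client; serves canned pages."""

    def __init__(self, pages):
        self.pages = pages
        self.calls = 0

    def filter_log_events(self, **kwargs):
        # The fake uses the page index as its token; real tokens are opaque strings.
        index = int(kwargs.get("nextToken", 0))
        self.calls += 1
        page = {"events": self.pages[index]}
        if index + 1 < len(self.pages):
            page["nextToken"] = str(index + 1)
        return page
```

Each extra page costs one more request against the TPS quota, so a busy log group consumes the quota faster than the one-call-per-group minimum suggests.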

Advantage:
Using the CloudWatch API to get logs is much cheaper than using Lambda functions.

@kaiyan-sheng kaiyan-sheng self-assigned this Mar 27, 2020
@kaiyan-sheng kaiyan-sheng added the Filebeat and Team:Platforms labels Mar 27, 2020
@elasticmachine
Collaborator

Pinging @elastic/integrations-platforms (Team:Platforms)

@kaiyan-sheng
Contributor Author

Just for the record, here is the original draft PR: #16524

@kaiyan-sheng
Contributor Author

After comparing the FilterLogEvents API and the GetLogEvents API, we decided to use FilterLogEvents in the awscloudwatch Filebeat input.

Several main reasons:

  1. The FilterLogEvents API makes requests per log group, while the GetLogEvents API makes requests per log stream. When there are multiple log streams under the same log group, using FilterLogEvents means fewer API calls.
  2. The GetLogEvents API requires both the log group name and the log stream name. This means either the user has to list all log stream names, or Filebeat has to make an extra API call to list all log stream names under a given log group.
  3. The FilterLogEvents API has a filterPattern input parameter that can be used to collect only the logs you are interested in. (This is not in the first version of the aws cloudwatch input.)
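
As a rough worked example of reasons 1 and 2, the Python sketch below counts the minimum number of API calls for one polling pass. It assumes that every response fits in a single page and that, on the GetLogEvents path, Filebeat discovers stream names itself with one DescribeLogStreams call per group. The function name `min_calls_per_poll` is hypothetical.

```python
def min_calls_per_poll(streams_per_group):
    """Lower bound on API calls for one polling pass, ignoring pagination.

    streams_per_group maps a log group name to its number of log streams.
    """
    # FilterLogEvents: one request per log group.
    filter_calls = len(streams_per_group)
    # GetLogEvents: one DescribeLogStreams per group to discover stream names,
    # then one GetLogEvents request per stream.
    get_calls = sum(1 + n for n in streams_per_group.values())
    return {"FilterLogEvents": filter_calls, "GetLogEvents": get_calls}
```

For two log groups with 3 and 10 streams, `min_calls_per_poll({"a": 3, "b": 10})` gives 2 calls on the FilterLogEvents path and 15 on the GetLogEvents path.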
