Add awscloudwatch filebeat input #19025

Merged
merged 27 commits into from
Jul 1, 2020
Changes from 24 commits
Commits
6fd5f17
Add awscloudwatch filebeat input
kaiyan-sheng Jun 6, 2020
473558d
run mage fmt update
kaiyan-sheng Jun 6, 2020
16c7031
Call FilterLogEventsRequest instead of GetLogEventsRequest
kaiyan-sheng Jun 17, 2020
81c778d
Add unit test for getStartPosition function
kaiyan-sheng Jun 18, 2020
4bba82b
add function for getLogEventsFromCloudWatch
kaiyan-sheng Jun 19, 2020
e99cd66
Remove debug print lines
kaiyan-sheng Jun 20, 2020
9226147
add doc and unit test
kaiyan-sheng Jun 22, 2020
4d6e574
Merge remote-tracking branch 'upstream/master' into aws_cloudwatch_input
kaiyan-sheng Jun 23, 2020
107157c
Use log group ARN instead of log group name and region name
kaiyan-sheng Jun 23, 2020
863ea55
run mage vendor
kaiyan-sheng Jun 24, 2020
df16a03
Fix endTime in getStartPosition when start_position set to end
kaiyan-sheng Jun 24, 2020
9abed73
update doc for beginning/end start_position
kaiyan-sheng Jun 24, 2020
33b2635
add api_sleep, log_group_name and region_name config
kaiyan-sheng Jun 26, 2020
84db179
Merge remote-tracking branch 'upstream/master' into aws_cloudwatch_input
kaiyan-sheng Jun 26, 2020
b8cdc41
remove cwContext
kaiyan-sheng Jun 26, 2020
59506ea
Add documentation on new parameters
kaiyan-sheng Jun 26, 2020
7e681b5
add _meta/fields.yml
kaiyan-sheng Jun 29, 2020
38d8394
Merge remote-tracking branch 'upstream/master' into aws_cloudwatch_input
kaiyan-sheng Jun 29, 2020
a407ab6
remove ecs field from awscloudwatch fields.yml
kaiyan-sheng Jun 29, 2020
88a6649
Fix unit tests
kaiyan-sheng Jun 29, 2020
76b67ee
Remove vendor
kaiyan-sheng Jun 29, 2020
e5f92fd
Merge remote-tracking branch 'upstream/master' into aws_cloudwatch_input
kaiyan-sheng Jun 30, 2020
7e429fc
Remove awscloudwatch.timestamp and change to use event.id
kaiyan-sheng Jun 30, 2020
2df45cd
Merge remote-tracking branch 'upstream/master' into aws_cloudwatch_input
kaiyan-sheng Jun 30, 2020
cc1ad4a
fix unit test
kaiyan-sheng Jul 1, 2020
9f6a457
Merge remote-tracking branch 'upstream/master' into aws_cloudwatch_input
kaiyan-sheng Jul 1, 2020
d3ef76b
add default_field: false into fields.yml
kaiyan-sheng Jul 1, 2020
1 change: 1 addition & 0 deletions CHANGELOG.next.asciidoc
@@ -417,6 +417,7 @@ https://github.com/elastic/beats/compare/v7.0.0-alpha2...master[Check the HEAD d
- Explicitly set ECS version in all Filebeat modules. {pull}19198[19198]
- Add new mode to multiline reader to aggregate constant number of lines {pull}18352[18352]
- Add automatic retries and exponential backoff to httpjson input. {pull}18956[18956]
- Add awscloudwatch input. {pull}19025[19025]
- Changed the panw module to pass through (rather than drop) message types other than threat and traffic. {issue}16815[16815] {pull}19375[19375]
- Add support for timezone offsets and `Z` to decode_cef timestamp parser. {pull}19346[19346]
- Improve ECS categorization field mappings in traefik module. {issue}16183[16183] {pull}19379[19379]
42 changes: 42 additions & 0 deletions filebeat/docs/fields.asciidoc
@@ -16,6 +16,7 @@ grouped in the following categories:
* <<exported-fields-apache>>
* <<exported-fields-auditd>>
* <<exported-fields-aws>>
* <<exported-fields-awscloudwatch>>
* <<exported-fields-azure>>
* <<exported-fields-beat-common>>
* <<exported-fields-cef>>
@@ -2040,6 +2041,47 @@ type: keyword
The type of traffic: IPv4, IPv6, or EFA.


type: keyword

--

[[exported-fields-awscloudwatch]]
== awscloudwatch fields

Fields from AWS CloudWatch logs.



[float]
=== awscloudwatch

Fields from AWS CloudWatch logs.



*`awscloudwatch.log_group`*::
+
--
The name of the log group to which this event belongs.

type: keyword

--

*`awscloudwatch.log_stream`*::
+
--
The name of the log stream to which this event belongs.

type: keyword

--

*`awscloudwatch.ingestion_time`*::
+
--
The time the event was ingested in AWS CloudWatch.

type: keyword

--
119 changes: 119 additions & 0 deletions x-pack/filebeat/docs/inputs/input-awscloudwatch.asciidoc
@@ -0,0 +1,119 @@
[role="xpack"]

:libbeat-xpack-dir: ../../../../x-pack/libbeat

:type: awscloudwatch

[id="{beatname_lc}-input-{type}"]
=== awscloudwatch input

++++
<titleabbrev>awscloudwatch</titleabbrev>
++++

beta[]

The `awscloudwatch` input retrieves all logs from all log streams in a
specific log group. It uses the AWS `filterLogEvents` API to list log events from
the specified log group. Amazon CloudWatch Logs can store log files
from Amazon Elastic Compute Cloud (EC2), AWS CloudTrail, Route53, and other sources.

A log group is a group of log streams that share the same retention, monitoring,
and access control settings. You can define log groups and specify which streams
to put into each group. There is no limit on the number of log streams that can
belong to one log group.

A log stream is a sequence of log events that share the same source. Each
separate source of logs in CloudWatch Logs makes up a separate log stream.

["source","yaml",subs="attributes"]
----
{beatname_lc}.inputs:
- type: awscloudwatch
  log_group_arn: arn:aws:logs:us-east-1:428152502467:log-group:test:*
  scan_frequency: 1m
  credential_profile_name: elastic-beats
  start_position: beginning
----

The `awscloudwatch` input supports the following configuration options plus the
<<{beatname_lc}-input-{type}-common-options>> described later.

[float]
==== `log_group_arn`
ARN of the log group to collect logs from.

[float]
==== `log_group_name`
Name of the log group to collect logs from. Note: `region_name` is required when
`log_group_name` is given.

[float]
==== `region_name`
Region that the specified log group belongs to.
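
A log group ARN already carries both the region and the log group name, which is why `log_group_name` needs `region_name` alongside it. The sketch below shows one way to split an ARN into those two values. `parseLogGroupARN` is a hypothetical helper written for illustration, not a function from this PR, and it assumes the ARN has the shape `arn:<partition>:logs:<region>:<account-id>:log-group:<name>[:*]`.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// parseLogGroupARN is a hypothetical helper (not taken from the PR) that
// extracts the region and log group name from a CloudWatch Logs ARN.
// Assumed shape: arn:<partition>:logs:<region>:<account-id>:log-group:<name>[:*]
func parseLogGroupARN(arn string) (region, name string, err error) {
	// Split into at most 7 parts so a trailing ":*" stays attached to the name.
	parts := strings.SplitN(arn, ":", 7)
	if len(parts) < 7 || parts[0] != "arn" || parts[2] != "logs" || parts[5] != "log-group" {
		return "", "", errors.New("not a CloudWatch Logs log group ARN: " + arn)
	}
	name = strings.TrimSuffix(parts[6], ":*")
	if parts[3] == "" || name == "" {
		return "", "", errors.New("ARN is missing region or log group name: " + arn)
	}
	return parts[3], name, nil
}

func main() {
	region, name, err := parseLogGroupARN("arn:aws:logs:us-east-1:428152502467:log-group:test:*")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("region_name=%s log_group_name=%s\n", region, name)
}
```

Under these assumptions, the ARN from the example configuration corresponds to setting `log_group_name` and `region_name` separately.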

[float]
==== `log_streams`
A list of log stream names that Filebeat collects log events from.

[float]
==== `log_stream_prefix`
A string to filter the results to include only log events from log streams
that have names starting with this prefix.

[float]
==== `start_position`
`start_position` specifies whether this input reads log events from the
`beginning` or from the `end` of the log group.

* `beginning`: reads from the beginning of the log group (default).
* `end`: reads only new messages, starting from the current time minus `scan_frequency`.

For example, with `scan_frequency` equal to `30s` and a current timestamp of
`2020-06-24 12:00:00`:

* with `start_position = beginning`:
** first iteration: startTime=0, endTime=2020-06-24 12:00:00
** second iteration: startTime=2020-06-24 12:00:00, endTime=2020-06-24 12:00:30

* with `start_position = end`:
** first iteration: startTime=2020-06-24 11:59:30, endTime=2020-06-24 12:00:00
** second iteration: startTime=2020-06-24 12:00:00, endTime=2020-06-24 12:00:30

[float]
==== `scan_frequency`
This config parameter sets how often Filebeat checks for new log events from the
specified log group. The default `scan_frequency` is 10 seconds, which means
Filebeat sleeps for 10 seconds before querying for new logs again.

[float]
==== `api_timeout`
The maximum duration an AWS API call can take. If a call exceeds the timeout, it
is interrupted. The default AWS API timeout for a message is 120 seconds.
The minimum is 0 seconds. The maximum is half of the visibility timeout value.

[float]
==== `api_sleep`
The time to sleep between AWS `FilterLogEvents` API calls within the same
collection period. The `FilterLogEvents` API has a quota of 5 transactions per
second (TPS) per account per Region. By default, `api_sleep` is 200 ms. Adjust
this value only when multiple Filebeat instances or multiple Filebeat inputs
collect logs from the same Region and AWS account.

[float]
==== `aws credentials`
To make AWS API calls, the `awscloudwatch` input requires AWS credentials.
See <<aws-credentials-config,AWS credentials options>> for more details.

[float]
=== AWS Permissions
The IAM user needs the following AWS permission to use the `awscloudwatch` input:
----
logs:FilterLogEvents
----

[id="{beatname_lc}-input-{type}-common-options"]
include::../../../../filebeat/docs/inputs/input-common-options.asciidoc[]

[id="aws-credentials-config"]
include::{libbeat-xpack-dir}/docs/aws-credentials-config.asciidoc[]

:type!:
1 change: 1 addition & 0 deletions x-pack/filebeat/include/list.go


19 changes: 19 additions & 0 deletions x-pack/filebeat/input/awscloudwatch/_meta/fields.yml
@@ -0,0 +1,19 @@
- key: awscloudwatch
  title: "awscloudwatch"
  description: >
    Fields from AWS CloudWatch logs.
  fields:
    - name: awscloudwatch
      type: group
      description: >
        Fields from AWS CloudWatch logs.
      fields:
        - name: log_group
          type: keyword
          description: The name of the log group to which this event belongs.
        - name: log_stream
          type: keyword
          description: The name of the log stream to which this event belongs.
        - name: ingestion_time
          type: keyword
          description: The time the event was ingested in AWS CloudWatch.
57 changes: 57 additions & 0 deletions x-pack/filebeat/input/awscloudwatch/config.go
@@ -0,0 +1,57 @@
// Copyright Elasticsearch B.V. and/or licensed to Elasticsearch B.V. under one
// or more contributor license agreements. Licensed under the Elastic License;
// you may not use this file except in compliance with the Elastic License.

package awscloudwatch

import (
	"errors"
	"time"

	"github.com/elastic/beats/v7/filebeat/harvester"
	awscommon "github.com/elastic/beats/v7/x-pack/libbeat/common/aws"
)

type config struct {
	harvester.ForwarderConfig `config:",inline"`
	LogGroupARN               string        `config:"log_group_arn"`
	LogGroupName              string        `config:"log_group_name"`
	RegionName                string        `config:"region_name"`
	LogStreams                []string      `config:"log_streams"`
	LogStreamPrefix           string        `config:"log_stream_prefix"`
	StartPosition             string        `config:"start_position" default:"beginning"`
	ScanFrequency             time.Duration `config:"scan_frequency" validate:"min=0,nonzero"`
	APITimeout                time.Duration `config:"api_timeout" validate:"min=0,nonzero"`
	APISleep                  time.Duration `config:"api_sleep" validate:"min=0,nonzero"`
Comment on lines +23 to +25

Contributor

api_sleep and scan_frequency are very similar concepts with very different names here. Would it make sense to unify these a little bit?

Contributor Author

scan_frequency defines the sleep time between this Filebeat input's rechecks for new logs.
api_sleep defines the sleep time between each FilterLogEvents API call in the same Filebeat collection cycle.

How about scan_frequency and api_frequency?

Contributor

I see your point, let's keep api_sleep and make sure this is well documented


	AwsConfig                 awscommon.ConfigAWS `config:",inline"`
}

func defaultConfig() config {
	return config{
		ForwarderConfig: harvester.ForwarderConfig{
			Type: "awscloudwatch",
		},
		StartPosition: "beginning",
		ScanFrequency: 10 * time.Second,
		APITimeout:    120 * time.Second,
		APISleep:      200 * time.Millisecond, // FilterLogEvents has a limit of 5 transactions per second (TPS)/account/Region: 1s / 5 = 200 ms
	}
}

func (c *config) Validate() error {
	if c.StartPosition != "beginning" && c.StartPosition != "end" {
		return errors.New("start_position config parameter can only be " +
			"either 'beginning' or 'end'")
	}

	if c.LogGroupARN == "" && c.LogGroupName == "" {
		// Trailing space added so the concatenated message does not read
		// "parametercannot".
		return errors.New("log_group_arn and log_group_name config parameter " +
			"cannot be both empty")
	}

	if c.LogGroupName != "" && c.RegionName == "" {
		return errors.New("region_name is required when log_group_name " +
			"config parameter is given")
	}
	return nil
}
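
To see which configurations the three checks in `Validate` accept, here is a minimal, self-contained sketch. The `config` type below is a trimmed copy made for illustration: it keeps only the fields `Validate` inspects and drops the embedded Beats types, so it is not the struct from this PR.

```go
package main

import (
	"errors"
	"fmt"
)

// config is a trimmed, illustrative copy of the input config that keeps only
// the fields inspected by Validate.
type config struct {
	LogGroupARN   string
	LogGroupName  string
	RegionName    string
	StartPosition string
}

// Validate mirrors the three checks from the diff above.
func (c *config) Validate() error {
	if c.StartPosition != "beginning" && c.StartPosition != "end" {
		return errors.New("start_position config parameter can only be either 'beginning' or 'end'")
	}
	if c.LogGroupARN == "" && c.LogGroupName == "" {
		return errors.New("log_group_arn and log_group_name config parameter cannot be both empty")
	}
	if c.LogGroupName != "" && c.RegionName == "" {
		return errors.New("region_name is required when log_group_name config parameter is given")
	}
	return nil
}

func main() {
	cases := []config{
		{LogGroupARN: "arn:aws:logs:us-east-1:428152502467:log-group:test:*", StartPosition: "beginning"},
		{LogGroupName: "test", RegionName: "us-east-1", StartPosition: "end"},
		{LogGroupName: "test", StartPosition: "end"}, // missing region_name
		{StartPosition: "beginning"},                 // no log group given
		{LogGroupARN: "arn:aws:logs:us-east-1:428152502467:log-group:test:*", StartPosition: "middle"},
	}
	for _, c := range cases {
		fmt.Printf("%+v -> %v\n", c, c.Validate())
	}
}
```

The first two cases pass; the remaining three each trip one of the checks.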
23 changes: 23 additions & 0 deletions x-pack/filebeat/input/awscloudwatch/fields.go
