Make AWS credential management available in data-prepper-config.yaml #2570

Closed
3 of 4 tasks
dlvenable opened this issue Apr 21, 2023 · 3 comments
Labels
enhancement New feature or request

@dlvenable
Member

dlvenable commented Apr 21, 2023

Problem

Presently, Data Prepper pipeline definitions must include AWS IAM credential configurations for most AWS authentication.

This presents a few problems:

  • Different pipeline components have copied-and-pasted configurations, resulting in duplication.
  • Pipeline authors may have to change multiple locations to perform updates.
  • AWS STS roles may need to be assumed in multiple locations even though they can be shared.
  • The pipeline configuration can become somewhat cluttered with these configurations.

Solution

I'd like to have three options available for configuring AWS IAM credentials in pipeline configurations.

  1. Use a default AWS configuration configured in data-prepper-config.yaml.
  2. Specify a named AWS configuration which is configured in data-prepper-config.yaml.
  3. Configure the AWS configuration in the pipeline configuration as Data Prepper already supports.

Default AWS configuration

In data-prepper-config.yaml, I'd like to have something like the following.

aws:
  default:
    region: us-west-2
    sts_role_arn: "arn:aws:iam::123456789012:role/MyRole"

Now, I can configure my opensearch sink with just:

- opensearch:
    hosts: [ "https://search-my-amazon-opensearch-domain.us-west-2.es.amazonaws.com" ]
    index: my_index

It will use that sts_role_arn and region as specified above.

Named AWS configurations

In data-prepper-config.yaml, I'd like to have something like the following.

aws:
  configurations:
    my_configuration:
      region: us-west-2
      sts_role_arn: "arn:aws:iam::123456789012:role/MyRole"

Now, I can configure my opensearch sink with just:

- opensearch:
    hosts: [ "https://search-my-amazon-opensearch-domain.us-west-2.es.amazonaws.com" ]
    index: my_index
    aws:
      configuration: my_configuration

It will use that sts_role_arn and region as defined in my_configuration.
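
As a rough sketch of how a plugin might choose between the three options, the lookup could work like the Python below. The function name, the AwsSettings type, and the precedence order (named reference first, then inline settings, then the default) are assumptions made for illustration. They are not part of this proposal or of Data Prepper's API.

```python
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class AwsSettings:
    """Hypothetical holder for the two settings used in the examples above."""
    region: Optional[str] = None
    sts_role_arn: Optional[str] = None


def _to_settings(raw: Mapping) -> AwsSettings:
    # Only the two keys shown in this issue are read; other keys are ignored.
    return AwsSettings(region=raw.get("region"), sts_role_arn=raw.get("sts_role_arn"))


def resolve_aws_settings(global_aws: Mapping,
                         plugin_aws: Optional[Mapping]) -> Optional[AwsSettings]:
    """Pick the AWS settings for one pipeline component.

    global_aws is the `aws` section of data-prepper-config.yaml.
    plugin_aws is the component's own `aws` block from the pipeline, if any.
    Assumed precedence: named reference, then inline settings, then default.
    """
    plugin_aws = plugin_aws or {}

    # Option 2: the component names a shared configuration.
    name = plugin_aws.get("configuration")
    if name is not None:
        named = global_aws.get("configurations", {})
        if name not in named:
            raise KeyError(f"unknown AWS configuration: {name}")
        return _to_settings(named[name])

    # Option 3: the component carries its own inline settings.
    if plugin_aws:
        return _to_settings(plugin_aws)

    # Option 1: fall back to the default configuration, if one exists.
    default = global_aws.get("default")
    return _to_settings(default) if default else None
```

With the two data-prepper-config.yaml examples above loaded into a dictionary, calling resolve_aws_settings(global_aws, {"configuration": "my_configuration"}) would return the named settings, and passing None for the component block would return the default.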

Additional configurations

Additionally, a few other options could be provided to customize how Data Prepper authenticates.

  • role_session_name_prefix - Now that credentials can be shared, a default STS session name would be DataPrepper-${random}. Instead, the role prefix can be configured. Thus, the session name can be ${role_session_name_prefix}-${random}.
  • role_session_name - Provide the full name for role sessions.
  • endpoint - Configure a specific endpoint for STS requests.

aws:
  default:
    region: us-west-2
    sts_role_arn: "arn:aws:iam::123456789012:role/MyRole"
    role_session_name_prefix: dp1
    endpoint: https://mysts.example.org
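
To illustrate how role_session_name and role_session_name_prefix might interact, here is a minimal Python sketch. The function name, the rule that a full role_session_name wins over the prefix, and the 12-character hex suffix are assumptions for illustration only. The length and character check reflects the STS limits on RoleSessionName (2 to 64 characters from a restricted set).

```python
import re
import uuid
from typing import Mapping

DEFAULT_PREFIX = "DataPrepper"  # default prefix named in this proposal

# STS accepts 2-64 characters drawn from letters, digits, and _+=,.@-
_SESSION_NAME_PATTERN = re.compile(r"^[\w+=,.@-]{2,64}$", re.ASCII)


def build_role_session_name(aws_config: Mapping) -> str:
    """Build an STS role session name from one `aws` configuration block.

    Assumption: an explicit role_session_name takes priority over
    role_session_name_prefix, which in turn replaces the default prefix.
    """
    full_name = aws_config.get("role_session_name")
    if full_name:
        name = full_name
    else:
        prefix = aws_config.get("role_session_name_prefix") or DEFAULT_PREFIX
        # The random part; its exact format is an assumption.
        suffix = uuid.uuid4().hex[:12]
        name = f"{prefix}-{suffix}"

    if not _SESSION_NAME_PATTERN.match(name):
        raise ValueError(f"invalid STS role session name: {name!r}")
    return name
```

For the configuration above, build_role_session_name({"role_session_name_prefix": "dp1"}) would yield a name that starts with dp1- followed by the random suffix.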

Alternative considered

An alternative is to make use of AWS profiles and the default provider chain. However, this can be confusing because it relies on changes to user paths or environment variables. It also requires making configuration changes in other files beyond Data Prepper, which may be challenging in certain environments.

Also, since Data Prepper supports these role configurations, it makes sense to support this with Data Prepper itself.

Plugin support

While I'd like to have this available in data-prepper-config.yaml, I do not think Data Prepper core should have this AWS functionality. Instead, I'd like to have the ability to create plugins which are not pipeline components, but are instead able to extend Data Prepper's core functionality.

That is, I want this AWS feature to be a plugin which adds these configurations to data-prepper-config.yaml. And it will provide classes to other plugins that need AWS support.
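
The shape of such an extension could look roughly like the Python sketch below. Every name here (AwsExtension, credentials_for, the assume_role callable) is hypothetical and only meant to show the idea: the extension owns the aws section of data-prepper-config.yaml and hands credentials to other plugins, assuming each role once and sharing the result. Data Prepper itself is written in Java, so this is not its actual API.

```python
from typing import Callable, Dict, Mapping, Optional, Tuple

Credentials = Dict[str, str]
# Hypothetical callable that performs the STS call: (role_arn, region) -> credentials.
RoleAssumer = Callable[[str, str], Credentials]


class AwsExtension:
    """Hypothetical extension that owns the `aws` section and serves other plugins."""

    def __init__(self, aws_section: Mapping, assume_role: RoleAssumer):
        self._default = aws_section.get("default", {})
        self._named = aws_section.get("configurations", {})
        self._assume_role = assume_role
        # One cache entry per (role ARN, region) so a role is assumed only once.
        # A real implementation would also refresh credentials before they expire.
        self._cache: Dict[Tuple[str, str], Credentials] = {}

    def credentials_for(self, configuration: Optional[str] = None) -> Credentials:
        """Return shared credentials for the default or a named configuration."""
        settings = self._named[configuration] if configuration else self._default
        key = (settings["sts_role_arn"], settings["region"])
        if key not in self._cache:
            self._cache[key] = self._assume_role(*key)
        return self._cache[key]
```

A sink or source plugin would then ask the extension for credentials instead of parsing role settings itself, which is what removes the duplicated configuration described in the Problem section.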

Tasks

@sharraj

sharraj commented Apr 30, 2023

Can we also add hosts: [ "https://search-my-amazon-opensearch-domain.us-west-2.es.amazonaws.com" ] and all other common settings to the common configuration? It gives additional flexibility: if a user wants to write to one domain, they can specify every configuration related to that type of sink in one place. All we then need in the per-sink configuration is anything specific to that sink. Users can also override all of this with a per-sink configuration on top of the common configuration, and that per-sink configuration should be given priority.

@sharraj

sharraj commented Apr 30, 2023

Also, how will we handle this when there are sinks of multiple types in the same pipeline (OpenSearch, OpenSearch Serverless, S3, Kafka, HTTP, etc.)? And how will the source side of the AWS configuration be handled?

@dlvenable
Member Author

> Can we also add hosts: [ "https://search-my-amazon-opensearch-domain.us-west-2.es.amazonaws.com" ] and all other common settings to the common configuration? It gives additional flexibility: if a user wants to write to one domain, they can specify every configuration related to that type of sink in one place. All we then need in the per-sink configuration is anything specific to that sink. Users can also override all of this with a per-sink configuration on top of the common configuration, and that per-sink configuration should be given priority.

@sharraj,

Please see #2590 for that proposal. I believe both will utilize some common solutions as provided by #2588. But aside from that, it is a distinct feature.

> Also, how will we handle this when there are sinks of multiple types in the same pipeline (OpenSearch, OpenSearch Serverless, S3, Kafka, HTTP, etc.)? And how will the source side of the AWS configuration be handled?

Not all configurations will be shared. For example, the opensearch sink now has a serverless flag for Amazon OpenSearch Serverless. This is not relevant for other AWS-based sinks/sources. However, the authentication configurations, including the STS roles, can be common.
