WIP: support for opsworks for chef automate #7817

Closed
wants to merge 16 commits into from
3 changes: 3 additions & 0 deletions aws/config.go
@@ -89,6 +89,7 @@ import (
"github.com/aws/aws-sdk-go/service/mq"
"github.com/aws/aws-sdk-go/service/neptune"
"github.com/aws/aws-sdk-go/service/opsworks"
"github.com/aws/aws-sdk-go/service/opsworkscm"
"github.com/aws/aws-sdk-go/service/organizations"
"github.com/aws/aws-sdk-go/service/pinpoint"
"github.com/aws/aws-sdk-go/service/pricing"
@@ -264,6 +265,7 @@ type AWSClient struct {
mqconn *mq.MQ
neptuneconn *neptune.Neptune
opsworksconn *opsworks.OpsWorks
opsworkscmconn *opsworkscm.OpsWorksCM
organizationsconn *organizations.Organizations
partition string
pinpointconn *pinpoint.Pinpoint
@@ -434,6 +436,7 @@ func (c *Config) Client() (interface{}, error) {
mqconn: mq.New(sess),
neptuneconn: neptune.New(sess),
opsworksconn: opsworks.New(sess),
opsworkscmconn: opsworkscm.New(sess),
organizationsconn: organizations.New(sess),
partition: partition,
pinpointconn: pinpoint.New(sess),
1 change: 1 addition & 0 deletions aws/provider.go
@@ -577,6 +577,7 @@ func Provider() terraform.ResourceProvider {
"aws_opsworks_user_profile": resourceAwsOpsworksUserProfile(),
"aws_opsworks_permission": resourceAwsOpsworksPermission(),
"aws_opsworks_rds_db_instance": resourceAwsOpsworksRdsDbInstance(),
"aws_opsworks_chef": resourceAwsOpsworksChef(),
"aws_organizations_organization": resourceAwsOrganizationsOrganization(),
"aws_organizations_account": resourceAwsOrganizationsAccount(),
"aws_organizations_policy": resourceAwsOrganizationsPolicy(),
343 changes: 343 additions & 0 deletions aws/resource_aws_opsworks_chef.go
@@ -0,0 +1,343 @@
package aws

import (
"log"
"strings"
"time"

"github.com/aws/aws-sdk-go/aws"
"github.com/aws/aws-sdk-go/aws/awserr"
"github.com/aws/aws-sdk-go/service/opsworkscm"
"github.com/hashicorp/terraform/helper/resource"
"github.com/hashicorp/terraform/helper/schema"
"github.com/hashicorp/terraform/helper/validation"
)

const EngineModel = "Single"
const EngineVersion = "12"

func resourceAwsOpsworksChef() *schema.Resource {
return &schema.Resource{
Create: resourceAwsOpsworksChefCreate,
Read: resourceAwsOpsworksChefRead,
Update: resourceAwsOpsworksChefUpdate,
Delete: resourceAwsOpsworksChefDelete,
Importer: &schema.ResourceImporter{
State: schema.ImportStatePassthrough,
},

Schema: map[string]*schema.Schema{
// TODO: do we want this? I think this is another "magic" thing that happens
// I think we can still associate a public IP later if we want? :shrug:
Author:

I'm curious to know what y'all think about this. I'm on the side of not having magic happen that the user might not be expecting, meaning this attribute would go away and we'd just hardcode it to no behind the scenes.

"associate_public_ip_address": {
Type: schema.TypeBool,
Optional: true,
Default: false,
},

"backup_retention_count": {
Type: schema.TypeInt,
Optional: true,
Default: 1,
// https://docs.aws.amazon.com/opsworks/latest/userguide/opscm-chef-backup.html
// I could swear I found this 1-30 limitation elsewhere, but I can't seem to find it now other than there
Contributor:

Found docs here

A maximum of 30 backups are kept; when the maximum is reached, AWS OpsWorks for Chef Automate deletes the oldest backups to make room for new ones.
It's also on the console wizard

ValidateFunc: validation.IntBetween(1, 30),
},

"disable_automated_backup": {
Type: schema.TypeBool,
Optional: true,
Default: false,
},

// TODO:
// I kinda want to make this required because how else are we going to get at that info?
// Also, since these are write-only and not returned by the API, we'll probably need a custom diff function that ignores them?
Author:

Also interested in y'all's thoughts on this:

What's happening here is these are optional things to the API, and amazon will generate values for you and return them in the CreateServerResponse, but you'll never see it again.

I don't know if there's a way then to take the output of that (data provider-like?) and shove it into another resource like a vault password or something. And I'm fairly certain the whole thing would be pretty useless if you threw away these values. So I'm leaning on the side of making these required up front.

"these" being the chef_pivotal_key and chef_delivery_admin_password attributes, both of which behave this way.
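Since the generated values appear only in the CreateServerResponse, capturing them there is the one chance to keep them. A minimal sketch of pulling a generated attribute out of that response (hypothetical helper; the local `EngineAttribute` struct stands in for `opsworkscm.EngineAttribute`, which has the same `Name`/`Value` shape):

```go
package main

// EngineAttribute mirrors the shape of opsworkscm.EngineAttribute for illustration.
type EngineAttribute struct {
	Name  *string
	Value *string
}

// lookupEngineAttribute pulls a generated value (e.g. CHEF_PIVOTAL_KEY) out of
// the CreateServer response attributes; the API never returns it anywhere else.
func lookupEngineAttribute(attrs []*EngineAttribute, name string) (string, bool) {
	for _, a := range attrs {
		if a != nil && a.Name != nil && *a.Name == name && a.Value != nil {
			return *a.Value, true
		}
	}
	return "", false
}
```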

Contributor:

Note, this is just the pubkey component. Because sensitive values leak into state, the private key should ideally be kept out of Terraform entirely, which is another reason to make this argument required.

It also needs to be base64-encoded, which differs from what is provided in your test case (which would include the raw -----BEGIN... PEM header).

"chef_pivotal_key": {
Type: schema.TypeString,
Required: true,
// TODO: should we?
// ValidateFunc: validateRsaKey,
},

"chef_delivery_admin_password": {
Type: schema.TypeString,
Required: true,
Sensitive: true,
// TODO: should we?
// ValidateFunc: validatePasswordOk,
},

// TODO: document the instance role required
// https://s3.amazonaws.com/opsworks-cm-us-east-1-prod-default-assets/misc/opsworks-cm-roles.yaml
"instance_profile_arn": {
Type: schema.TypeString,
Required: true,
ForceNew: true,
ValidateFunc: validateArn,
},

"instance_type": {
Type: schema.TypeString,
Required: true,
ForceNew: true,
},

"name": {
Type: schema.TypeString,
Required: true,
ForceNew: true,
},

"ssh_key_pair": {
Type: schema.TypeString,
Optional: true,
ForceNew: true,
},

"preferred_backup_window": {
Type: schema.TypeString,
Required: true,
ValidateFunc: validateStartTimeFormat,
},

"preferred_maintenance_window": {
Type: schema.TypeString,
Required: true,
ValidateFunc: validateStartTimeFormat,
},

// TODO:
// if you don't specify this, it'll create one. I'm of the opinion that users should
// create it themselves. Or maybe some sort of wrapper module? I dunno.
// if it's optional, will they be computed? if so, is that something we can express here?
Author:

Here's another "magic happens" thing. If you don't specify security groups, it'll make one for you. Is this something we want to allow or should we require folks to specify the security groups explicitly? I'm leaning on the side of explicitly.

"security_group_ids": {
Type: schema.TypeSet,
Required: true,
ForceNew: true,
Elem: &schema.Schema{Type: schema.TypeString},
Set: schema.HashString,
MinItems: 1,
},

// TODO: document this role creation
// https://s3.amazonaws.com/opsworks-cm-us-east-1-prod-default-assets/misc/opsworks-cm-roles.yaml
"service_role_arn": {
Type: schema.TypeString,
Required: true,
ForceNew: true,
ValidateFunc: validateArn,
},

// TODO: huh?
// The IDs of subnets in which to launch the server EC2 instance.
//
// Amazon EC2-Classic customers: This field is required. All servers must run
// within a VPC. The VPC must have "Auto Assign Public IP" enabled.
//
// EC2-VPC customers: This field is optional. If you do not specify subnet IDs,
// your EC2 instances are created in a default subnet that is selected by Amazon
// EC2. If you specify subnet IDs, the VPC must have "Auto Assign Public IP"
// enabled.
// if it's optional, will they be computed? if so, is that something we can express here?
Author:

All sorts of :wat: here.

I'm thinking we should force people to specify subnet IDs. But that's just me.

Contributor:

Make subnets a required argument.

The subnet only needs "Auto Assign Public IP" if associate_public_ip_address == true, otherwise it won't be internet accessible. AWS docs may have drifted, you can now make chef servers with private ips.

This, together with EC2-Classic requirements should just be covered in documentation.

"subnet_ids": {
Type: schema.TypeSet,
Required: true,
ForceNew: true,
Elem: &schema.Schema{Type: schema.TypeString},
Set: schema.HashString,
MinItems: 1,
},

"endpoint": {
Type: schema.TypeString,
Computed: true,
},

"arn": {
Type: schema.TypeString,
Computed: true,
},
},
}
}

func resourceAwsOpsworksChefValidate(d *schema.ResourceData) error {
// TODO: validation function
Author:

For validation, how much should I replicate the validation that amazon is going to do on the backend here?

The maintenance and backup windows have a simple format that could be validated, and then there's the chef admin password. Oh, and the chef_pivotal_key. How much should I validate before just leaving it to the API to explode when the input is wrong?

Author:

ok so window format is already a thing that is validated elsewhere so I'll validate it too. I wanted to wrap the existing validators but wasn't really sure how to do it so I made another that did the needful. Definitely open to any suggestions for improvement :)

Author:

annnnd nevermind, that's in a newer version of terraform than this package is pulling in (I think; I'm not super familiar with golang development tbh). I'll use the one in this provider: https://github.com/terraform-providers/terraform-provider-aws/blob/85e58f8cf3c79621d32db94f657e8a258ee44fd3/aws/validators.go#L27-L42

// chefPivotalKey validate rsa? or is that too much?
// chefDeliveryAdminPassword validate the password meets the API's requirements or is that too much?
// backupRetentionCount validate that it's positive
// but also maybe validate that it's in the valid range? or is that too much?
// I also wonder if there are constants in the API connector that maybe we can leverage here
return nil
}
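The window-format check the author refers to could be sketched like this (hypothetical name and regex, assuming the API accepts "HH:MM" for daily windows and "DDD:HH:MM" for weekly ones; the provider's actual validator may differ):

```go
package main

import (
	"fmt"
	"regexp"
)

// windowRe matches "HH:MM" (daily) or "DDD:HH:MM" (weekly) start times.
var windowRe = regexp.MustCompile(`^((Mon|Tue|Wed|Thu|Fri|Sat|Sun):)?([01][0-9]|2[0-3]):[0-5][0-9]$`)

// validateWindowFormat is a hypothetical ValidateFunc-style sketch of the
// backup/maintenance window start-time check discussed above.
func validateWindowFormat(v interface{}, k string) (ws []string, errors []error) {
	if s, ok := v.(string); !ok || !windowRe.MatchString(s) {
		errors = append(errors, fmt.Errorf("%q must be in HH:MM or DDD:HH:MM format", k))
	}
	return
}
```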

func resourceAwsOpsworksChefRead(d *schema.ResourceData, meta interface{}) error {
client := meta.(*AWSClient).opsworkscmconn

req := &opsworkscm.DescribeServersInput{
ServerName: aws.String(d.Id()),
}

log.Printf("[DEBUG] Reading OpsWorks Chef server: %s", d.Id())

resp, err := client.DescribeServers(req)
if err != nil {
// Treat a missing server as deleted rather than returning an error.
if awsErr, ok := err.(awserr.Error); ok && awsErr.Code() == opsworkscm.ErrCodeResourceNotFoundException {
log.Printf("[DEBUG] OpsWorks Chef (%s) not found", d.Id())
d.SetId("")
return nil
}
return err
}

// Guard against an empty result before indexing into Servers.
if len(resp.Servers) == 0 {
d.SetId("")
return nil
}

server := resp.Servers[0]
d.Set("associate_public_ip_address", server.AssociatePublicIpAddress)
d.Set("backup_retention_count", server.BackupRetentionCount)
d.Set("disable_automated_backup", server.DisableAutomatedBackup)
d.Set("instance_profile_arn", server.InstanceProfileArn)
d.Set("instance_type", server.InstanceType)
d.SetId(*server.ServerName)
d.Set("name", server.ServerName)
if server.KeyPair != nil {
d.Set("ssh_key_pair", server.KeyPair)
}
d.Set("preferred_backup_window", server.PreferredBackupWindow)
d.Set("preferred_maintenance_window", server.PreferredMaintenanceWindow)
d.Set("security_group_ids", flattenStringList(server.SecurityGroupIds))
d.Set("subnet_ids", flattenStringList(server.SubnetIds))
d.Set("endpoint", server.Endpoint)
d.Set("arn", server.ServerArn)

return nil
}

func resourceAwsOpsworksChefCreate(d *schema.ResourceData, meta interface{}) error {
client := meta.(*AWSClient).opsworkscmconn

err := resourceAwsOpsworksChefValidate(d)
if err != nil {
return err
}

req := &opsworkscm.CreateServerInput{
AssociatePublicIpAddress: aws.Bool(d.Get("associate_public_ip_address").(bool)),
BackupRetentionCount: aws.Int64(int64(d.Get("backup_retention_count").(int))),
DisableAutomatedBackup: aws.Bool(d.Get("disable_automated_backup").(bool)),
Engine: aws.String("Chef"),
EngineModel: aws.String(EngineModel),
EngineVersion: aws.String(EngineVersion),
InstanceProfileArn: aws.String(d.Get("instance_profile_arn").(string)),
InstanceType: aws.String(d.Get("instance_type").(string)),
ServerName: aws.String(d.Get("name").(string)),
PreferredBackupWindow: aws.String(d.Get("preferred_backup_window").(string)),
PreferredMaintenanceWindow: aws.String(d.Get("preferred_maintenance_window").(string)),
SecurityGroupIds: expandStringSet(d.Get("security_group_ids").(*schema.Set)),
ServiceRoleArn: aws.String(d.Get("service_role_arn").(string)),
SubnetIds: expandStringSet(d.Get("subnet_ids").(*schema.Set)),
}

chefPivotalKeyAttribute := opsworkscm.EngineAttribute{
Name: aws.String("CHEF_PIVOTAL_KEY"),
Value: aws.String(d.Get("chef_pivotal_key").(string)),
}

chefDeliveryAdminPassword := opsworkscm.EngineAttribute{
Name: aws.String("CHEF_DELIVERY_ADMIN_PASSWORD"),
Value: aws.String(d.Get("chef_delivery_admin_password").(string)),
}

req.SetEngineAttributes([]*opsworkscm.EngineAttribute{&chefPivotalKeyAttribute, &chefDeliveryAdminPassword})

// TODO: possibly make these optional
// TODO: security_group_ids, optional
// TODO: subnet_ids, optional

if sshKeyPair, ok := d.GetOk("ssh_key_pair"); ok {
req.KeyPair = aws.String(sshKeyPair.(string))
}

log.Printf("[DEBUG] Creating OpsWorks Chef server: %s", req)

var resp *opsworkscm.CreateServerOutput
err = resource.Retry(20*time.Minute, func() *resource.RetryError {
var cerr error
resp, cerr = client.CreateServer(req)
if cerr != nil {
if opserr, ok := cerr.(awserr.Error); ok {
// TODO: does this also happen with this resource?
Author:

Should I err on the side of "we can add this back later if it turns out it's a problem" or should I try to recreate the circumstances described to see if it needs to stay?

// If Terraform is also managing the service IAM role,
// it may have just been created and not yet be
// propagated.
// AWS doesn't provide a machine-readable code for this
// specific error, so we're forced to do fragile message
// matching.
// The full error we're looking for looks something like
// the following:
// Service Role Arn: [...] is not yet propagated, please try again in a couple of minutes
propErr := "not yet propagated"
trustErr := "not the necessary trust relationship"
validateErr := "validate IAM role permission"
if opserr.Code() == "ValidationException" && (strings.Contains(opserr.Message(), trustErr) || strings.Contains(opserr.Message(), propErr) || strings.Contains(opserr.Message(), validateErr)) {
log.Printf("[INFO] Waiting for service IAM role to propagate")
return resource.RetryableError(cerr)
}
}
return resource.NonRetryableError(cerr)
}
return nil
})
if err != nil {
return err
}

serverName := *resp.Server.ServerName
d.SetId(serverName)

return resourceAwsOpsworksChefUpdate(d, meta)
}

func resourceAwsOpsworksChefUpdate(d *schema.ResourceData, meta interface{}) error {
client := meta.(*AWSClient).opsworkscmconn

err := resourceAwsOpsworksChefValidate(d)
if err != nil {
return err
}

// TODO: these are the only values that are allowed to be changed
// according to the API, is there a way to express this in the provider somehow?
Author:

question here: only some of the values are able to be updated. The others are write-once. How to handle?

There are actually 2 more that can be updated: the chef_pivotal_key and chef_delivery_admin_password, but those are actually set using a different API call (easy enough to handle, though) but the question there is how do we know if those values need to change. Is there a way we can store the output of the createserver in a hashed form somehow so we can compare the inputs to what we last knew the values were on the server and update them? Or something else?
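The hashing idea floated above could be sketched as follows (illustrative only; storing a digest in state and comparing on update to decide whether the attribute-setting API call is needed):

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
)

// passwordDigest returns a hex SHA-256 of a write-only value. State would keep
// only this digest; on update, a changed digest signals the value needs pushing.
func passwordDigest(s string) string {
	sum := sha256.Sum256([]byte(s))
	return hex.EncodeToString(sum[:])
}
```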

Author:

Seems like ForceNew on the attribute in the Schema is what I'm after for the first part of that question.

Author:

followup question on the ForceNew: will that propagate to any node associations that exist? Also, I should probably create a resource for those.

https://docs.aws.amazon.com/sdk-for-go/api/service/opsworkscm/#OpsWorksCM.AssociateNode

Author:

hmm. Now that I think of it, the AssociateNode is probably going to be done outside of terraform scope, which means terraform wouldn't have the info needed to be able to recreate the associations. So a ForceNew would be a pretty destructive thing.

I'm curious how this is even handled on AWS side. If instance type is one of the forcenew params, then I'm assuming they're not able to roll the backend chef machine(s) to new sizes, which makes me wonder if they even have multiples or if it's HA or whatever, what happens when one dies? I'm probably reading too much into that part of things, but I might contact amazon support and ask for more details.

Author:

Ok, I filed a support request with amazon to see what they have to say about this. For now, I'll leave ForceNew on, but I actually think forcing new here would be a really bad idea, so I'm hoping there are better ways to handle that situation :) I'm open to any suggestions.

req := &opsworkscm.UpdateServerInput{
BackupRetentionCount: aws.Int64(int64(d.Get("backup_retention_count").(int))),
DisableAutomatedBackup: aws.Bool(d.Get("disable_automated_backup").(bool)),
PreferredBackupWindow: aws.String(d.Get("preferred_backup_window").(string)),
PreferredMaintenanceWindow: aws.String(d.Get("preferred_maintenance_window").(string)),
ServerName: aws.String(d.Get("name").(string)),
}

log.Printf("[DEBUG] Updating OpsWorks Chef server: %s", req)

_, err = client.UpdateServer(req)
if err != nil {
return err
}

return resourceAwsOpsworksChefRead(d, meta)
}

func resourceAwsOpsworksChefDelete(d *schema.ResourceData, meta interface{}) error {
client := meta.(*AWSClient).opsworkscmconn

req := &opsworkscm.DeleteServerInput{
ServerName: aws.String(d.Id()),
}

log.Printf("[DEBUG] Deleting OpsWorks Chef server: %s", d.Id())

_, err := client.DeleteServer(req)
if err != nil {
return err
}

return nil
}