aws_stepfunctions: StateMachine construct doesn't generate a valid policy for default StateMachineRole #31714

Closed
1 task
JamieClayton7 opened this issue Oct 10, 2024 · 3 comments · Fixed by #31801
Assignees
Labels
@aws-cdk/aws-stepfunctions Related to AWS StepFunctions bug This issue is a bug. effort/small Small work item – less than a day of effort p1

Comments

@JamieClayton7

Describe the bug

When using aws_stepfunctions.StateMachine, the default IAM policy for the state machine role does not generate the correct statement for the action ecs:RunTask.

The difference is that we must now specify the revision number (or match all revisions by omitting the number and appending :* instead) at the end of the task definition ARN.

From 15th October 2024, the statement generated will result in an AccessDeniedException when the state machine attempts to RunTask on the non-tagged task definition ARN.

Regression Issue

  • Select this option if this issue appears to be a regression.

Last Known Working CDK Version

N/A

Expected Behavior

The valid statement that should be generated:

{
    "Action": "ecs:RunTask",
    "Resource": "arn:aws:ecs:eu-west-1:12345:task-definition/TaskDefinitionABC1234:1",
    "Effect": "Allow"
}

Current Behavior

The statement generated:

{
    "Action": "ecs:RunTask",
    "Resource": "arn:aws:ecs:eu-west-1:12345:task-definition/TaskDefinitionABC1234",
    "Effect": "Allow"
}
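
To make the difference between the two statements concrete, here is a minimal sketch of IAM-style wildcard matching. It is a simplified model, not the actual IAM evaluation logic: only `*` and `?` are handled, and the helper name `resourceMatches` is made up for illustration. It also assumes the RunTask request is authorized against the ARN that includes the revision, as described in this report.

```typescript
// Simplified model of IAM resource matching: `*` matches any run of
// characters and `?` matches exactly one. Hypothetical helper, not an AWS API.
function resourceMatches(pattern: string, arn: string): boolean {
  // Escape regex metacharacters, except the two wildcards we translate below.
  const escaped = pattern.replace(/[.+^${}()|[\]\\]/g, "\\$&");
  const regex = new RegExp("^" + escaped.replace(/\*/g, ".*").replace(/\?/g, ".") + "$");
  return regex.test(arn);
}

// ARN taken from the expected statement in this issue (revision 1).
const requested = "arn:aws:ecs:eu-west-1:12345:task-definition/TaskDefinitionABC1234:1";
const base = "arn:aws:ecs:eu-west-1:12345:task-definition/TaskDefinitionABC1234";

const withoutRevision = resourceMatches(base, requested);     // current behavior
const withRevision = resourceMatches(`${base}:1`, requested); // expected behavior
const allRevisions = resourceMatches(`${base}:*`, requested); // wildcard revision

console.log({ withoutRevision, withRevision, allRevisions });
```

Under this model, only the pattern without a revision fails to match the requested ARN, which is consistent with the AccessDeniedException described above.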

Reproduction Steps


// executionRole

const taskDefinition = new ecs.FargateTaskDefinition(this, 'TaskDefinition', {
  cpu: 256,
  executionRole,
  memoryLimitMiB: 512,
});

// container definitions...

const stateMachineDefinition = new tasks.EcsRunTask(this, 'Run Traffic DB maintenance jobs', {
  cluster,
  launchTarget: new tasks.EcsFargateLaunchTarget(),
  taskDefinition,
  integrationPattern: sfn.IntegrationPattern.RUN_JOB,
});

const stateMachine = new sfn.StateMachine(this, 'StateMachine', {
  definition: stateMachineDefinition,
  stateMachineName: 'StateMachine',
  stateMachineType: sfn.StateMachineType.STANDARD,
  timeout: Duration.hours(2),
  tracingEnabled: true,
});

Possible Solution

CDK synth should generate the correct IAM statement for the state machine's ecs:RunTask action by using the task definition ARN with the revision number attached.

Workaround for the time being:


// executionRole

const taskDefinition = new ecs.FargateTaskDefinition(this, 'TaskDefinition', {
  cpu: 256,
  executionRole,
  memoryLimitMiB: 512,
});

// container definitions...

const stateMachineDefinition = new tasks.EcsRunTask(this, 'Run Traffic DB maintenance jobs', {
  cluster,
  launchTarget: new tasks.EcsFargateLaunchTarget(),
  taskDefinition,
  integrationPattern: sfn.IntegrationPattern.RUN_JOB,
});

const stateMachine = new sfn.StateMachine(this, 'StateMachine', {
  definition: stateMachineDefinition,
  stateMachineName: 'StateMachine',
  stateMachineType: sfn.StateMachineType.STANDARD,
  timeout: Duration.hours(2),
  tracingEnabled: true,
});

// WORKAROUND
// Create a new policy that grants ecs:RunTask on the task definition ARN.
// taskDefinitionArn is a Ref to the task definition, so it keeps the revision number.
const policy = new iam.Policy(this, 'RunTaskPolicy', {
  statements: [
    new iam.PolicyStatement({
      actions: ['ecs:RunTask'],
      resources: [taskDefinition.taskDefinitionArn],
    }),
  ],
});

// Attach the new policy to the state machine role
policy.attachToRole(stateMachine.role);

Additional Information/Context

No response

CDK CLI Version

2.161.1 (build 0a606c9)

Framework Version

No response

Node.js Version

v22.9.0

OS

MacOS

Language

TypeScript

Language Version

No response

Other information

No response

@JamieClayton7 JamieClayton7 added bug This issue is a bug. needs-triage This issue or PR still needs to be triaged. labels Oct 10, 2024
@github-actions github-actions bot added the @aws-cdk/aws-stepfunctions Related to AWS StepFunctions label Oct 10, 2024
@khushail khushail added investigating This issue is being investigated and/or work is in progress to resolve the issue. and removed needs-triage This issue or PR still needs to be triaged. labels Oct 10, 2024
@khushail khushail self-assigned this Oct 10, 2024
@khushail khushail added the p2 label Oct 10, 2024
@khushail
Contributor

khushail commented Oct 10, 2024

Hi @JamieClayton7 , thanks for reaching out.

I tried to deploy the source code -

    const cluster = new ecs.Cluster(this, 'Cluster', {
      vpc: new cdk.aws_ec2.Vpc(this, 'Vpc', { maxAzs: 1 }),
    });

    cluster.addCapacity('DefaultAutoScalingGroup', {
      instanceType: new cdk.aws_ec2.InstanceType('t2.micro'),
    });

    const taskDefinition = new ecs.FargateTaskDefinition(this, 'TaskDefinition', {
      cpu: 256,
      memoryLimitMiB: 512,
      runtimePlatform: {
        operatingSystemFamily: ecs.OperatingSystemFamily.LINUX,
        cpuArchitecture: ecs.CpuArchitecture.X86_64
      },
    });

    taskDefinition.addContainer('Container', {
      image: ecs.ContainerImage.fromRegistry('amazonlinux'),
      memoryLimitMiB: 512,
      cpu: 256,
    });
    
    const stateMachineDefinition = new tasks.EcsRunTask(this, 'Run Traffic DB maintenance jobs', {
      cluster,
      launchTarget: new tasks.EcsFargateLaunchTarget(),
      taskDefinition,
      integrationPattern: sfn.IntegrationPattern.RUN_JOB,
    });

    const stateMachine = new sfn.StateMachine(this, 'StateMachine', {
      definition: stateMachineDefinition,
      stateMachineName: 'StateMachine',
      stateMachineType: sfn.StateMachineType.STANDARD,
      timeout: cdk.Duration.hours(2),
      tracingEnabled: true,
    });

    new cdk.CfnOutput(this, 'StateMachineArn', {
      value: stateMachine.stateMachineArn,
    });

and see this ecs:RunTask policy added to the role as -

{
	"Version": "2012-10-17",
	"Statement": [
		{
			"Action": "ecs:RunTask",
			"Resource": "arn:aws:ecs:us-west-1:123456789012:task-definition/StepfunctionIssueStackTaskDefinition64A4E983:*",
			"Effect": "Allow"
		},

which seems to be exactly as it's supposed to be.

This PR introduced the change to correct the ARN returned (Original issue).

@khushail khushail added the response-requested Waiting on additional info and feedback. Will move to "closing-soon" in 7 days. label Oct 10, 2024
@khushail
Contributor

khushail commented Oct 10, 2024

I found an article on this task definition with revision number -

https://docs.aws.amazon.com/service-authorization/latest/reference/list_amazonelasticcontainerservice.html#amazonelasticcontainerservice-task-definition

and PR #31615, which addresses the same issue for event-targets, seems to have been merged 3 days ago.

Marking this issue as P1, as the task definition ARN in the IAM policy resource should include the revision number.

Reaching out to on-call for input on whether this is already on their radar, or to share insights if possible.
Thanks
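
For anyone who wants to check their own rendered policies against this requirement, the following is a rough sketch of such a check. The helper name `findUnrevisionedRunTaskResources` and the two interfaces are hypothetical, made up for this example. It assumes the policy document has plain string resources (as in the JSON pasted in this thread) rather than unresolved CloudFormation intrinsics, and it only looks for an exact `ecs:RunTask` action, not wildcards such as `ecs:*`.

```typescript
// Shape of the parts of an IAM policy document that this check reads.
interface Statement {
  Action: string | string[];
  Resource: string | string[];
  Effect: string;
}

interface PolicyDocument {
  Version?: string;
  Statement: Statement[];
}

// Matches a task definition ARN that ends in ":<number>" or ":*".
const REVISIONED = /:task-definition\/[^:]+:(\d+|\*)$/;

// Hypothetical helper: returns task definition resources that are granted
// ecs:RunTask but carry neither a revision number nor a ":*" wildcard.
function findUnrevisionedRunTaskResources(doc: PolicyDocument): string[] {
  const toArray = (v: string | string[]): string[] => (Array.isArray(v) ? v : [v]);
  const flagged: string[] = [];
  for (const statement of doc.Statement) {
    const grantsRunTask =
      statement.Effect === "Allow" && toArray(statement.Action).indexOf("ecs:RunTask") !== -1;
    if (!grantsRunTask) continue;
    for (const resource of toArray(statement.Resource)) {
      const isTaskDefinition = resource.indexOf(":task-definition/") !== -1;
      if (isTaskDefinition && !REVISIONED.test(resource)) flagged.push(resource);
    }
  }
  return flagged;
}

// Statement as reported in the issue (no revision).
const reported: PolicyDocument = {
  Version: "2012-10-17",
  Statement: [
    {
      Action: "ecs:RunTask",
      Resource: "arn:aws:ecs:eu-west-1:12345:task-definition/TaskDefinitionABC1234",
      Effect: "Allow",
    },
  ],
};

// Statement as observed in the reproduction attempt (wildcard revision).
const observed: PolicyDocument = {
  Version: "2012-10-17",
  Statement: [
    {
      Action: "ecs:RunTask",
      Resource:
        "arn:aws:ecs:us-west-1:123456789012:task-definition/StepfunctionIssueStackTaskDefinition64A4E983:*",
      Effect: "Allow",
    },
  ],
};

const reportedFlagged = findUnrevisionedRunTaskResources(reported);
const observedFlagged = findUnrevisionedRunTaskResources(observed);
console.log(reportedFlagged, observedFlagged);
```

With the two documents above, only the statement from the original report is flagged; the wildcard-revision statement passes the check.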

@khushail khushail added p1 effort/small Small work item – less than a day of effort and removed response-requested Waiting on additional info and feedback. Will move to "closing-soon" in 7 days. p2 investigating This issue is being investigated and/or work is in progress to resolve the issue. labels Oct 10, 2024
@khushail khushail removed their assignment Oct 10, 2024
@xazhao xazhao self-assigned this Oct 10, 2024
@mergify mergify bot closed this as completed in #31801 Oct 18, 2024
mergify bot pushed a commit that referenced this issue Oct 18, 2024
…alid policy for default StateMachineRole (#31801)

### Issue # (if applicable)

Closes #31714.

### Reason for this change

Currently, the step functions `runEcsTask()` will create an IAM policy. The `Resource` section is an ARN constructed by CDK with a wildcard `*` appended at the end. However, CDK should `Ref` the resource directly instead of constructing the ARN, while keeping the revision number.

### Description of changes

The same solution as #31615. However, this change needs to be behind a feature flag because it could be a breaking change.

### Description of how you validated changes

Integration test. Also checked the synth template.

### Checklist
- [ ] My code adheres to the [CONTRIBUTING GUIDE](https://github.com/aws/aws-cdk/blob/main/CONTRIBUTING.md) and [DESIGN GUIDELINES](https://github.com/aws/aws-cdk/blob/main/docs/DESIGN_GUIDELINES.md)

----

*By submitting this pull request, I confirm that my contribution is made under the terms of the Apache-2.0 license*

Comments on closed issues and PRs are hard for our team to see.
If you need help, please open a new issue that references this one.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Oct 18, 2024