(aws-glue-alpha): Unable to create PythonShell job for Glue version 2.0 using cdk in Python #26599

Closed
dokeita opened this issue Aug 2, 2023 · 4 comments
Labels: @aws-cdk/aws-glue, bug, duplicate, effort/small, p1

dokeita commented Aug 2, 2023

Describe the bug

When I try to create a PythonShell job for Glue version 2.0 using glue_alpha.Job, I get the following error:

RuntimeError: maxCapacity cannot be used when GlueVersion 2.0 or later

Expected Behavior

A PythonShell job for Glue version 2.0 should be created normally.
A PythonShell job requires the maxCapacity parameter.

Current Behavior

The following error occurred:

$ cdk ls
jsii.errors.JavaScriptError: 
  Error: maxCapacity cannot be used when GlueVersion 2.0 or later
      at new Job (/tmp/jsii-kernel-6NowWl/node_modules/@aws-cdk/aws-glue-alpha/lib/job.js:336:19)
      at Kernel._Kernel_create (/tmp/tmpxqi7nkmp/lib/program.js:10002:25)
      at Kernel.create (/tmp/tmpxqi7nkmp/lib/program.js:9673:93)
      at KernelHost.processRequest (/tmp/tmpxqi7nkmp/lib/program.js:11602:36)
      at KernelHost.run (/tmp/tmpxqi7nkmp/lib/program.js:11562:22)
      at Immediate._onImmediate (/tmp/tmpxqi7nkmp/lib/program.js:11563:46)
      at process.processImmediate (node:internal/timers:476:21)

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "app.py", line 9, in <module>
    GlueAlphaPythonStack(app, "glue-alpha-python")
  File "/abc/glue_alpha_python/.venv/lib/python3.8/site-packages/jsii/_runtime.py", line 118, in __call__
    inst = super(JSIIMeta, cast(JSIIMeta, cls)).__call__(*args, **kwargs)
  File "/abc/glue_alpha_python/glue_alpha_python/glue_alpha_python_stack.py", line 14, in __init__
    job = glue_alpha.Job(
  File "/abc/glue_alpha_python/.venv/lib/python3.8/site-packages/jsii/_runtime.py", line 118, in __call__
    inst = super(JSIIMeta, cast(JSIIMeta, cls)).__call__(*args, **kwargs)
  File "/abc/glue_alpha_python/.venv/lib/python3.8/site-packages/aws_cdk/aws_glue_alpha/__init__.py", line 3487, in __init__
    jsii.create(self.__class__, self, [scope, id, props])
  File "/abc/glue_alpha_python/.venv/lib/python3.8/site-packages/jsii/_kernel/__init__.py", line 334, in create
    response = self.provider.create(
  File "/abc/glue_alpha_python/.venv/lib/python3.8/site-packages/jsii/_kernel/providers/process.py", line 365, in create
    return self._process.send(request, CreateResponse)
  File "/abc/glue_alpha_python/.venv/lib/python3.8/site-packages/jsii/_kernel/providers/process.py", line 342, in send
    raise RuntimeError(resp.error) from JavaScriptError(resp.stack)
RuntimeError: maxCapacity cannot be used when GlueVersion 2.0 or later
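
The two tracebacks above describe a single failure: the error thrown by the Node.js construct is re-raised on the Python side as a RuntimeError chained to a JavaScriptError (the `raise ... from ...` line in process.py). Below is a minimal sketch of that chaining. `JavaScriptError`, `create_job`, and `describe_failure` are stand-ins written for illustration, not the real jsii classes or functions.

```python
from typing import Tuple

MESSAGE = "maxCapacity cannot be used when GlueVersion 2.0 or later"


class JavaScriptError(Exception):
    """Stand-in for jsii.errors.JavaScriptError; holds the Node.js stack text."""


def create_job() -> None:
    # Simulates the kernel reporting a failure raised inside the Node.js process.
    js_stack = "Error: " + MESSAGE
    raise RuntimeError(MESSAGE) from JavaScriptError(js_stack)


def describe_failure() -> Tuple[str, str]:
    """Return the Python-side message and the type name of the chained cause."""
    try:
        create_job()
    except RuntimeError as err:
        # `raise X from Y` stores Y on X.__cause__, which is why the traceback
        # prints "The above exception was the direct cause of the following exception".
        return str(err), type(err.__cause__).__name__
    raise AssertionError("create_job() was expected to fail")
```

Catching `RuntimeError` and reading `__cause__` is therefore enough to reach the JavaScript stack when debugging a construct error from Python.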

Reproduction Steps

# glue_alpha_python_stack.py
from constructs import Construct
from aws_cdk import (
    Stack,
    aws_glue_alpha as glue_alpha
)


class GlueAlphaPythonStack(Stack):

    def __init__(self, scope: Construct, construct_id: str, **kwargs) -> None:
        super().__init__(scope, construct_id, **kwargs)

        job = glue_alpha.Job(
            scope=self,
            id="sample-job",
            executable=glue_alpha.JobExecutable.python_shell(
                glue_version=glue_alpha.GlueVersion.V2_0,
                python_version=glue_alpha.PythonVersion.THREE_NINE,
                script=glue_alpha.Code.from_asset('script/hello_world.py'),
            ),
            description='an example Python Shell job',
            max_capacity=0.0625
        )

# app.py
#!/usr/bin/env python3

import aws_cdk as cdk

from glue_alpha_python.glue_alpha_python_stack import GlueAlphaPythonStack


app = cdk.App()
GlueAlphaPythonStack(app, "glue-alpha-python")

app.synth()

Possible Solution

When the job type is Python shell, the following if statement should not be executed.

https://github.com/aws/aws-cdk/blob/main/packages/%40aws-cdk/aws-glue-alpha/lib/job.ts#L725

    if (props.maxCapacity !== undefined && ![GlueVersion.V0_9, GlueVersion.V1_0].includes(executable.glueVersion)) {
      throw new Error('maxCapacity cannot be used when GlueVersion 2.0 or later');
    }

For example:

      if (executable.type !== JobType.PYTHON_SHELL) {
        if (props.maxCapacity !== undefined && ![GlueVersion.V0_9, GlueVersion.V1_0].includes(executable.glueVersion)) {
          throw new Error('maxCapacity cannot be used when GlueVersion 2.0 or later');
        }
      }
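
To make the intended behavior concrete, here is a rough Python model of the proposed guard. It is only a sketch of the logic, not the construct's code: `Executable`, `check_max_capacity`, and the string constants are hypothetical stand-ins for the real aws_glue_alpha types.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical stand-ins; the real construct uses GlueVersion and JobType objects.
LEGACY_GLUE_VERSIONS = {"0.9", "1.0"}
PYTHON_SHELL = "pythonshell"
ETL = "glueetl"


@dataclass(frozen=True)
class Executable:
    job_type: str
    glue_version: str


def check_max_capacity(executable: Executable, max_capacity: Optional[float]) -> None:
    """Apply the Glue version check only to jobs that are not Python shell jobs."""
    if executable.job_type == PYTHON_SHELL:
        # Python shell jobs are sized with maxCapacity, so the check is skipped.
        return
    if max_capacity is not None and executable.glue_version not in LEGACY_GLUE_VERSIONS:
        raise ValueError("maxCapacity cannot be used when GlueVersion 2.0 or later")
```

With this guard, `check_max_capacity(Executable(PYTHON_SHELL, "2.0"), 0.0625)` passes, while an ETL job on Glue 2.0 with `max_capacity` set is still rejected.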

Additional Information/Context

Using CDK in TypeScript, the job is created normally.
What is happening?

import { Stack, StackProps } from 'aws-cdk-lib';
import * as glue from '@aws-cdk/aws-glue-alpha'
import { Construct } from 'constructs';

export class GlueAlphaTestStack extends Stack {
  constructor(scope: Construct, id: string, props?: StackProps) {
    super(scope, id, props);

    new glue.Job(this, 'PythonShellJob', {
      executable: glue.JobExecutable.pythonShell({
        glueVersion: glue.GlueVersion.of('2.0'),
        pythonVersion: glue.PythonVersion.THREE_NINE,
        script: glue.Code.fromAsset('script/hello_world.py'),
      }),
      description: 'an example Python Shell job',
      maxCapacity: 0.0625
    });
  }
}

CDK CLI Version

2.89.0 (build 2ad6683)

Framework Version

2.89.0a0

Node.js Version

v18.16.1

OS

Ubuntu 20.04 on Windows 10 (WSL2)

Language

Python

Language Version

Python (3.8.10)

Other information

No response

@dokeita dokeita added bug This issue is a bug. needs-triage This issue or PR still needs to be triaged. labels Aug 2, 2023
@github-actions github-actions bot added the @aws-cdk/aws-glue Related to AWS Glue label Aug 2, 2023
Contributor

pahud commented Aug 3, 2023

Did you mean this only happens in CDK with Python but works in TypeScript?

@pahud pahud added response-requested Waiting on additional info and feedback. Will move to "closing-soon" in 7 days. p2 effort/medium Medium work item – several days of effort and removed needs-triage This issue or PR still needs to be triaged. labels Aug 3, 2023
Contributor

pahud commented Aug 3, 2023

I can reproduce this in CDK with both Python and TypeScript, so it's not a Python-specific issue.

According to this, the python-shell job type does require maxCapacity. I agree we should fix this. As there's no workaround, I am making it a p1 bug, and we welcome pull requests from the community as well.

@pahud pahud added p1 effort/small Small work item – less than a day of effort and removed p2 response-requested Waiting on additional info and feedback. Will move to "closing-soon" in 7 days. effort/medium Medium work item – several days of effort labels Aug 3, 2023
@peterwoodworth peterwoodworth added the duplicate This issue is a duplicate. label Aug 3, 2023
@peterwoodworth
Contributor

Closing in favor of #26620.
