Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

bug(sdk) Custom tasks misgenerate code when all inputs are passed literals #880

Closed
Udiknedormin opened this issue Mar 17, 2022 · 0 comments · Fixed by #881
Closed

bug(sdk) Custom tasks misgenerate code when all inputs are passed literals #880

Udiknedormin opened this issue Mar 17, 2022 · 0 comments · Fixed by #881
Labels

Comments

@Udiknedormin
Copy link
Contributor

/kind bug

What steps did you take and what happened:
Generate a custom task which accepts literals as all of its parameters and at least one of these parameters matches the name of some pipeline parameter.

In the following example code:

import itertools

from kfp import dsl, components
from kfp_tekton.tekton import TEKTON_CUSTOM_TASK_IMAGES
from kfp_tekton.compiler import TektonCompiler
import yaml


ARTIFACT_FETCHER_IMAGE_NAME = "fetcher/image:latest"
TEKTON_CUSTOM_TASK_IMAGES = TEKTON_CUSTOM_TASK_IMAGES.append(ARTIFACT_FETCHER_IMAGE_NAME)

_artifact_fetcher_no = 0

def artifact_fetcher(**artifact_paths: str):
  '''A containerOp template resolving some artifacts, given their paths.'''
  global _artifact_fetcher_no
  template_yaml = {
    'name': f'artifact-fetcher-{_artifact_fetcher_no}',
    'description': 'Artifact Fetch',
    'inputs': [
      {'name': name, 'type': 'String'}
      for name in artifact_paths.keys()
    ],
    'outputs': [
      {'name': name, 'type': 'Artifact'}
      for name in artifact_paths.keys()
    ],
    'implementation': {
      'container': {
        'image': ARTIFACT_FETCHER_IMAGE_NAME,
        'command': ['sh', '-c'], # irrelevant
        'args': [
          '--apiVersion', 'fetcher.tekton.dev/v1alpha1',
          '--kind', 'FETCHER',
          '--name', f'artifact_fetcher_{_artifact_fetcher_no}',
          *itertools.chain(*[
            (f'--{name}', {'inputValue': name})
            for name in artifact_paths.keys()
          ])
        ]
      }
    }
  }
  _artifact_fetcher_no += 1
  template_str = yaml.dump(template_yaml, Dumper=yaml.SafeDumper)
  template = components.load_component_from_text(template_str)
  op = template(**artifact_paths)
  op.add_pod_annotation("valid_container", "false")
  return op


@dsl.pipeline("all-literals")
def all_literals(foo: str):
  global _artifact_fetcher_no

  # no literals
  _artifact_fetcher_no = 0
  op00 = artifact_fetcher(bar=foo)
  op01 = artifact_fetcher(foo=foo)

  # not all inputs are literals
  _artifact_fetcher_no = 10
  op10 = artifact_fetcher(foo="foo", bar=foo)
  op11 = artifact_fetcher(foo=foo, bar="bar")

  # all inputs are literals but none of them
  #   matches the name of any pipeline param
  _artifact_fetcher_no = 20
  op20 = artifact_fetcher(bar="bar")
  op21 = artifact_fetcher(bar="bar", buzz="buzz")

  # all inputs are literals and at least one of them
  #   matches the name of a pipeline param
  _artifact_fetcher_no = 30
  op30 = artifact_fetcher(foo="foo")
  op31 = artifact_fetcher(foo="bar") # doesn't matter what the literal is
  op32 = artifact_fetcher(foo="foo", bar="bar")
  op33 = artifact_fetcher(foo="bar", bar="foo")


if __name__ == '__main__':
  TektonCompiler().compile(all_literals, __file__.replace('.py', '.yaml'))

Produces the following yaml:

apiVersion: tekton.dev/v1beta1
kind: PipelineRun
metadata:
  name: all-literals
  annotations:
    tekton.dev/output_artifacts: '{}'
    tekton.dev/input_artifacts: '{}'
    tekton.dev/artifact_bucket: mlpipeline
    tekton.dev/artifact_endpoint: minio-service.kubeflow:9000
    tekton.dev/artifact_endpoint_scheme: http://
    tekton.dev/artifact_items: '{}'
    sidecar.istio.io/inject: "false"
    pipelines.kubeflow.org/big_data_passing_format: $(workspaces.$TASK_NAME.path)/artifacts/$ORIG_PR_NAME/$TASKRUN_NAME/$TASK_PARAM_NAME
    pipelines.kubeflow.org/pipeline_spec: '{"inputs": [{"name": "foo", "type": "String"}],
      "name": "all-literals"}'
spec:
  params:
  - name: foo
    value: ''
  pipelineSpec:
    params:
    - name: foo
    tasks:
    - name: artifact-fetcher-0
      params:
      - name: bar
        value: $(params.foo)
      taskRef:
        name: artifact_fetcher
        apiVersion: fetcher.tekton.dev/v1alpha1
        kind: FETCHER
      timeout: 525600m
    - name: artifact-fetcher-1
      params:
      - name: foo
        value: $(params.foo)
      taskRef:
        name: artifact_fetcher
        apiVersion: fetcher.tekton.dev/v1alpha1
        kind: FETCHER
      timeout: 525600m
    - name: artifact-fetcher-10
      params:
      - name: foo
        value: foo
      - name: bar
        value: $(params.foo)
      taskRef:
        name: artifact_fetcher
        apiVersion: fetcher.tekton.dev/v1alpha1
        kind: FETCHER
      timeout: 525600m
    - name: artifact-fetcher-11
      params:
      - name: foo
        value: $(params.foo)
      - name: bar
        value: bar
      taskRef:
        name: artifact_fetcher
        apiVersion: fetcher.tekton.dev/v1alpha1
        kind: FETCHER
      timeout: 525600m
    - name: artifact-fetcher-20
      params:
      - name: bar
        value: bar
      taskRef:
        name: artifact_fetcher
        apiVersion: fetcher.tekton.dev/v1alpha1
        kind: FETCHER
      timeout: 525600m
    - name: artifact-fetcher-21
      params:
      - name: bar
        value: bar
      - name: buzz
        value: buzz
      taskRef:
        name: artifact_fetcher
        apiVersion: fetcher.tekton.dev/v1alpha1
        kind: FETCHER
      timeout: 525600m
    - name: artifact-fetcher-30
      params:
      - name: foo
        value: $(params.foo)
      taskRef:
        name: artifact_fetcher
        apiVersion: fetcher.tekton.dev/v1alpha1
        kind: FETCHER
      timeout: 525600m
    - name: artifact-fetcher-31
      params:
      - name: foo
        value: $(params.foo)
      taskRef:
        name: artifact_fetcher
        apiVersion: fetcher.tekton.dev/v1alpha1
        kind: FETCHER
      timeout: 525600m
    - name: artifact-fetcher-32
      params:
      - name: foo
        value: $(params.foo)
      - name: bar
        value: bar
      taskRef:
        name: artifact_fetcher
        apiVersion: fetcher.tekton.dev/v1alpha1
        kind: FETCHER
      timeout: 525600m
    - name: artifact-fetcher-33
      params:
      - name: foo
        value: $(params.foo)
      - name: bar
        value: foo
      taskRef:
        name: artifact_fetcher
        apiVersion: fetcher.tekton.dev/v1alpha1
        kind: FETCHER
      timeout: 525600m
  timeout: 525600m

Note how cases 0x, 1x and 2x are all correct, but for 3x, "foo" parameter is assigned $(params.foo) despite being passed a some literal in the DSL code. The value of literal seems to be irrelevant.

What did you expect to happen:
The 3x cases compiling just fine, to:

    - name: artifact-fetcher-30
      params:
      - name: foo
        value: foo
      taskRef:
        name: artifact_fetcher
        apiVersion: fetcher.tekton.dev/v1alpha1
        kind: FETCHER
      timeout: 525600m
    - name: artifact-fetcher-31
      params:
      - name: foo
        value: bar
      taskRef:
        name: artifact_fetcher
        apiVersion: fetcher.tekton.dev/v1alpha1
        kind: FETCHER
      timeout: 525600m
    - name: artifact-fetcher-32
      params:
      - name: foo
        value: foo
      - name: bar
        value: bar
      taskRef:
        name: artifact_fetcher
        apiVersion: fetcher.tekton.dev/v1alpha1
        kind: FETCHER
      timeout: 525600m
    - name: artifact-fetcher-33
      params:
      - name: foo
        value: bar
      - name: bar
        value: foo
      taskRef:
        name: artifact_fetcher
        apiVersion: fetcher.tekton.dev/v1alpha1
        kind: FETCHER
      timeout: 525600m

Additional information:
In compiler.py, I can see the following check:

if task.get('orig_params', []):  # custom task

...but no orig-param is generated for literals, so that list is empty in such cases. In such a case, it's handled as if it was a "normal" task --- and kfp-tekton seems to utilize the fact that the name matching a pipeline or loop parameter means that parameter actually origins from that pipeline or loop parameter:

if task.get('orig_params', []):  # custom task
  # (...)
else:  # regular task
  op = pipeline.ops.get(task['name'])
  for tp in task.get('params', []):
    if tp['name'] in pipeline_param_names + loop_args:
      tp['value'] = '$(params.%s)' % tp['name']

This is probably because in custom-task, the name is directly controlled by the user, while for "normal" tasks --- the names are controlled by kubeflow-pipelines and the compiler can (but should it?) reason about parameters based on their names.

It seems to me it should be enough to change:

if task.get('orig_params', []):  # custom task

to:

if 'orig_params' in task:  # custom task

...as "normal" Tekton tasks don't have 'orig_params' key at all --- now at the moment at least. An additional parameter (e.g. 'custom_task': true), removed at the end of this if could be added too.

Environment:

  • Python Version (use python --version): 3.9.0
  • SDK Version: master's HEAD, 3ab7a40
  • Tekton Version (use tkn version): irrelevant
  • Kubernetes Version (use kubectl version): irrelevant
  • OS (e.g. from /etc/os-release): irrelevant
google-oss-prow bot pushed a commit that referenced this issue Mar 17, 2022
* add test for custom-tasks with literals

* fix orig_params check for custom tasks

* fix: style: min 2 spaces before inline comment

* fix: style: 2 empty line before function decl
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.

1 participant