Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Correctly capture all params and validate against correct ParamSpec with substitutions considered #4879

Closed
ywluogg opened this issue May 16, 2022 · 14 comments
Labels
kind/feature Categorizes issue or PR as related to a new feature. lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale.

Comments

@ywluogg
Copy link
Contributor

ywluogg commented May 16, 2022

Feature request

Recently, we have ArrayOrString implementing UnmarshalJSON for json.Unmarshaller interface. It will set the ArrayOrString.Type to be string when the data starts with a symbol other than {, " or [. It means that any substituted params will be considered as string type.

When implementing '$(params.paramvalue[*])' for both arrays and objects for params, this becomes problematic, as current code will fail at checking parameter type at wrongTypeParamsNames in validate_resources.go, as the param will be considered as string.

An example can be:

params:
  - name: gitrepo
    value: $(params.gitrepo[*])

If the params being passed is arrays (with the assumption that now the object in the array is strings), and the substituted type is also arrays of strings, it can also be using the following way for substitution:

params:
  - name: gitrepo
    value: '$(params.gitrepo[*])'

We need to be able to pass the check when param is a substitution of another param or results that's either a array or object in wrongTypeParamsNames and be able to capture all the param and paramSpec pairs by tracing back the variable reference back to its actual type in validateParams(), when we substitute params with objects and arrays as a whole.

Use case

@ywluogg ywluogg added the kind/feature Categorizes issue or PR as related to a new feature. label May 16, 2022
@ywluogg
Copy link
Contributor Author

ywluogg commented May 16, 2022

Pinging @Yongxuanzhang @chuangw6

@Yongxuanzhang
Copy link
Member

For array params and results, maybe we can let user follow these syntax, this can distinguish the usage of different types.

params:
  - name: gitrepo
    value: 
      -  $(params.gitrepo[*])

or

params:
  - name: gitrepo
    value: [$(params.gitrepo[*])]

@chuangw6
Copy link
Member

chuangw6 commented May 17, 2022

Discussed with @Yongxuanzhang offline.

Similarly when providing object type param with the object type param variable (from pipeline spec's params) or the object type result variable from a task, it would make more sense to let users follow yaml's dictionary syntax as well (surrounding the variable with {}). For example assume the param gitrepo is intended to be of object type, when providing value for it, users can

  • use a param variable of object type
params:
  - name: gitrepo
    value: {$(params.myObject[*])}
  • use a result variable of object type
params:
  - name: gitrepo
    value: {$(tasks.TASK_NAME.results.OBJECT_RESULT_NAME[*])}
  • BUT NOT something that will be treated as string type (suggested in this tep75).
params:
  - name: gitrepo
    value: $(params.myObject[*])

Caveat

When providing value with {$(params.myObject[*])} in the first example, UnmarshalJSON will be able to set the param gitrepo type to be object type, but the objectVal of ArrayOrString for param gitrepo will be unmarshalled to be a map with a single key that has no value i.e.

map[$(params.myObject[*]):]

This will lead us to add additional checks for the special case when implementing validation and param apply/replacement. Because we need to distinguish between explicit definition (like the following example) and using variables.

params:
  - name: gitrepo
    value: 
       url: abc.com
       commit: xxxx

The unmarshal result for objectVal in this example will be

map[url:"abc.com", commit: "xxxx"]

Thoughts

  • Since normal object param key names will be not be in the format of $(params.myObject[*]) or $(tasks.TASK_NAME.results.OBJECT_RESULT_NAME[*]) (if we want, we can enforce that the key names must follow same rules as param names), the additional check mentioned above can simply check if the object only has one key and its key name looks like something containing variables i.e. $(params.myObject[*]) or $(tasks.TASK_NAME.results.OBJECT_RESULT_NAME[*]).
  • Note: explicit definition can also has variable in it i.e.
params:
  - name: gitrepo
    value: 
       url: abc.com
       commit: "$(param.aStringTypeParam)"

@dibyom
Copy link
Member

dibyom commented May 18, 2022

@ywluogg @chuangw6 I'm trying to understand why all substituted params are considered string type....do you have an example/link to your branch where I can see the code?

Using the [] and {} for values seems like the technically correct thing to do though combining that with [*] makes the syntax a bit hard to follow.

@ywluogg
Copy link
Contributor Author

ywluogg commented May 18, 2022

@ywluogg @chuangw6 I'm trying to understand why all substituted params are considered string type....do you have an example/link to your branch where I can see the code?

Using the [] and {} for values seems like the technically correct thing to do though combining that with [*] makes the syntax a bit hard to follow.

None of them are implemented yet, but based on the current code in
UnmarshalJSON, first one and second one will be regarded as strings

Regarding to what Chuang suggested, I also have the concern that it would be making syntax a bit hard to follow.
Because the value itself needs to follow json schema rules, so I think using [] and {} outside of the actual values would be like:
Suppose the json value for a result should be:

{... properties: {"a": "b"}}

Then what's in params that would be:

value: {{... properties: {"a": "b"}}}

@chuangw6
Copy link
Member

chuangw6 commented May 18, 2022

Because the value itself needs to follow json schema rules, so I think using [] and {} outside of the actual values would be like: Suppose the json value for a result should be:

{... properties: {"a": "b"}}

Then what's in params that would be:

value: {{... properties: {"a": "b"}}}

I don't quite understand this part. Can you explain more?
To clarify, when providing value for object param, no properties will be specified again. Like the explicit def example I showed in previous thread, if users explicitly provide value for each key, it would be like the following, (no {} needed)

params:
  - name: gitrepo
    value: 
       url: abc.com
       commit: xxxx

And my previous post only suggests surrounding with {} specifically when using something containing variables for the value.

@chuangw6
Copy link
Member

@ywluogg @chuangw6 I'm trying to understand why all substituted params are considered string type....do you have an example/link to your branch where I can see the code?

Using the [] and {} for values seems like the technically correct thing to do though combining that with [*] makes the syntax a bit hard to follow.

Yes, UnmarshalJSON will set type to be string if the first char of raw json doesn't start with [ or {.

@Yongxuanzhang
Copy link
Member

@ywluogg @chuangw6 I'm trying to understand why all substituted params are considered string type....do you have an example/link to your branch where I can see the code?
Using the [] and {} for values seems like the technically correct thing to do though combining that with [*] makes the syntax a bit hard to follow.

We have several options to do substitution:

params:
  - name: gitrepo
    value: $(params.gitrepo[*])
params:
  - name: gitrepo
    value: "$(params.gitrepo[*])"

And the new ones that @chuangw6 suggested:

params:
  - name: gitrepo
    value: {$(params.myObject[*])}

None of them are implemented yet, but based on the current code in UnmarshalJSON, first one and second one will be regarded as strings

Regarding to what Chuang suggested, I also have the concern that it would be making syntax a bit hard to follow. Because the value itself needs to follow json schema rules, so I think using [] and {} outside of the actual values would be like: Suppose the json value for a result should be:

{... properties: {"a": "b"}}

Then what's in params that would be:

value: {{... properties: {"a": "b"}}}

I discussed with @chuangw6 yesterday and I lean to only use

params:
  - name: gitrepo
    value: {$(params.myObject[*])}

not

params:
  - name: gitrepo
    value: "$(params.gitrepo[*])"

Same for array type param and results.
Supporting "$(params.gitrepo[*])" may seem convenient for users, but it is bit weird that we are expecting array but the input is string type. And we also need to change the type of ArrayOrString

@ywluogg
Copy link
Contributor Author

ywluogg commented May 18, 2022

Because the value itself needs to follow json schema rules, so I think using [] and {} outside of the actual values would be like: Suppose the json value for a result should be:

{... properties: {"a": "b"}}

Then what's in params that would be:

value: {{... properties: {"a": "b"}}}

I don't quite understand this part. Can you explain more? To clarify, when providing value for object param, no properties will be specified again.

I meant to say the value of an object already starts with the json schema that looks like (example in tep75):

{\"url\": \"gcr.io\/somerepo\/someimage\", \"digest\": \"a61ed0bca213081b64be94c5e1b402ea58bc549f457c2682a86704dd55231e09\"}

In params, if you add another bracket as you suggested in your example:

params:
  - name: gitrepo
    value: {$(params.myObject[*])}

It actually is the following after substitution:

params:
  - name: gitrepo
    value: {{\"url\": \"gcr.io\/somerepo\/someimage\", \"digest\": \"a61ed0bca213081b64be94c5e1b402ea58bc549f457c2682a86704dd55231e09\"}}

I think this is the complexity that Dibyo and I think would be confused to users. Or is this not what you are suggesting?

Like the explicit def example I showed in previous thread, if users explicitly provide value for each key, it would be like the following, (no {} needed)

params:
  - name: gitrepo
    value: 
       url: abc.com
       commit: xxxx

And my previous post only suggests surrounding with {} specifically when using something containing variables for the value.

AFAIU, this works because this format

value: 
   url: abc.com
   commit: xxxx

in OpenAPI yaml to json transition, it is referring to a object schema, but what you are suggesting:

params:
  - name: gitrepo
    value: {$(params.myObject[*])}

is not recognized as an object. I think the example in your mind is this:

params:
  - name: gitrepo
    value: {
      url: abc
      commit: xxx
    }

I don't think

params:
  - name: gitrepo
    value: {$(params.myObject[*])}

is referring to the above example. In short, I believe what you are suggesting is adding an extra {} to the syntax

@dibyom
Copy link
Member

dibyom commented May 19, 2022

Ok, just to make sure I'm understanding this properly, if we have the following param inside a pipelineTask:

params:
  - name: gitrepo
    value: $(params.gitrepo[*])

and the value of gitrepo from the pipelinerun or somewhere else is:

params:
- name: gitrepo
   value:
      url: https://github.com/tektoncd/pipeline
      commit: "6f633b2"

if we run through the current param substitution code, (I think) the value of gitrepo becomes:

params:
- name: gitrepo
   value: "{\"url\": \"https://github.com/tektoncd/pipeline\", \"commit\": \"6f633b2\"}"

AKA the value is the JSONified string of the object and not the JSON object itself? So, the type of the value is string which is different from the type that the task actually expects.

* The value being a string is problematic because this will fail the param type/schema validation?

* One option that might fix this is to wrap the value within the YAML dictionary {} or array syntax []. Though from @ywluogg 's last comment it seems like that might just wrap the JSON string in another set of {} or [] and not solve the actual problem

  • What we want the output to be sounds like:
params:
- name: gitrepo
  value: 
      url: abc.com
      commit: xxxx

* Q: Would special-casing the substitution when [*] is involved be a solution?

@dibyom
Copy link
Member

dibyom commented May 20, 2022

Chatted with @ywluogg @chuangw6 @Yongxuanzhang the issue was that the value: $(params.gitrepo[*]) fails validation since $(params.gitrepo[*]) is a string while the validation expects an actual map/dictionary. The plan is to update the validation to allow this usage

@tekton-robot
Copy link
Collaborator

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale with a justification.
Stale issues rot after an additional 30d of inactivity and eventually close.
If this issue is safe to close now please do so with /close with a justification.
If this issue should be exempted, mark the issue as frozen with /lifecycle frozen with a justification.

/lifecycle stale

Send feedback to tektoncd/plumbing.

@tekton-robot tekton-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Aug 18, 2022
@Yongxuanzhang
Copy link
Member

I think we address this one.
Maybe close it?
/close

@tekton-robot
Copy link
Collaborator

@Yongxuanzhang: Closing this issue.

In response to this:

I think we address this one.
Maybe close it?
/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind/feature Categorizes issue or PR as related to a new feature. lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale.
Projects
None yet
Development

No branches or pull requests

5 participants