Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Generate CEL validation rules not to enforce required fields when ObserveOnly #166

Merged
merged 3 commits into from
Apr 11, 2023
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
17 changes: 17 additions & 0 deletions pkg/config/externalname.go
Original file line number Diff line number Diff line change
Expand Up @@ -10,6 +10,7 @@ import (
"regexp"
"strings"
"text/template"
"text/template/parse"

"github.com/crossplane/crossplane-runtime/pkg/errors"
"github.com/crossplane/crossplane-runtime/pkg/fieldpath"
Expand Down Expand Up @@ -47,6 +48,8 @@ var (
GetIDFn: ExternalNameAsID,
DisableNameInitializer: true,
}

parameterPattern = regexp.MustCompile(`{{\s*\.parameters\.([^\s}]+)\s*}}`)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Edge cases may arise when we use arbitrary templating expressions apart from {{ .parameters.<argument name> }} as a parameter to config.TemplatedStringAsIdentifier. This is not common but is theoretically possible. Please also see the comment for config.ExternalName.IdentifierFields, which is also related to the discussion here.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure I am getting this. Could you elaborate with some examples?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have already discussed these offline but just taking a note here for future reference. Some examples we've found with @sergenyalcin are:

(We have already agreed these are not blockers. Just taking a note here for future reference).

)

// ParameterAsIdentifier uses the given field name in the arguments as the
Expand All @@ -60,6 +63,7 @@ func ParameterAsIdentifier(param string) ExternalName {
param,
param + "_prefix",
}
e.IdentifierFields = []string{param}
return e
}

Expand Down Expand Up @@ -90,6 +94,18 @@ func TemplatedStringAsIdentifier(nameFieldPath, tmpl string) ExternalName {
if err != nil {
panic(errors.Wrap(err, "cannot parse template"))
}

// Note(turkenh): If a parameter is used in the external name template,
// it is an identifier field.
var identifierFields []string
for _, node := range t.Root.Nodes {
if node.Type() == parse.NodeAction {
match := parameterPattern.FindStringSubmatch(node.String())
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As far as I understand, we check the {{ .parameters.<field_name> }} for catching the root level required fields. This will cover some external name configurations if we use this convention. However, as you know, we also have another external name configurations, for example IdentifierFromProvider. For this case, we rely to the API schema of resource for the required fields while constructing the ID by terraform. What do you think about this type of cases?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the case of IdentifierFromProvider, provider assigns an id to the resource and we don't need (or at least know) parameters that will be included in that identifier. Noticed some helper methods in provider-aws like FormattedIdentifierFromProvider which should be updated to set relevant parameters as identifier fields.

In upjet, noticed that we need to mark the parameter as identifier in ParameterAsIdentifier method and updated accordingly.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We may not also be able to extract all the argument names from an arbitrary templating expression here (although the logic implemented should cover the most common case of config.TemplatedStringAsIdentifier, i.e., {{ .parameters.<field_name> }}).

And I also agree we cannot cover all the config.ExternalName instances as @sergenyalcin mentioned above (there will even be anonymous ones, which will necessitate manual configuration of the newly introduced config.ExternalName.IdentifierFields as discussed below).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Responded in the other thread: #166 (comment)

if len(match) == 2 {
identifierFields = append(identifierFields, match[1])
}
}
}
return ExternalName{
SetIdentifierArgumentFn: func(base map[string]any, externalName string) {
if nameFieldPath == "" {
Expand Down Expand Up @@ -126,6 +142,7 @@ func TemplatedStringAsIdentifier(nameFieldPath, tmpl string) ExternalName {
}
return GetExternalNameFromTemplated(tmpl, id.(string))
},
IdentifierFields: identifierFields,
}
}

Expand Down
6 changes: 6 additions & 0 deletions pkg/config/resource.go
Original file line number Diff line number Diff line change
Expand Up @@ -104,6 +104,12 @@ type ExternalName struct {
// assigned by the provider, like AWS VPC where it gets vpc-21kn123 identifier
// and not let you name it.
DisableNameInitializer bool

// IdentifierFields are the fields that are used to construct external
// resource identifier. We need to know these fields no matter what the
// management policy is including the Observe Only, different from other
// (required) fields.
IdentifierFields []string
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm a little bit worried that we introduce a new external name configuration field without a robust defaulting for it so that we will have to do manual configurations to get proper required checks in place. We have covered some defaulting for this in config.TemplatedStringAsIdentifier but there are even anonymous config.ExternalName instances which will not be covered.

If this is not properly configured (i.e., if an identifier parameter is not correctly marked as so), we would expect the Terraform validation to catch it but this would still be a compromise in our API quality. Is there a way to extract this information regarding which Terraform configuration arguments are part of the Terraform ID from the native schema? What do you think?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

but there are even anonymous config.ExternalName instances which will not be covered.

Could you point to some examples? Can we configure them as part of rolling the feature out to the providers?

If this is not properly configured (i.e., if an identifier parameter is not correctly marked as so), we would expect the Terraform validation to catch it, but this would still be a compromise in our API quality.

To be clear, if an identifier parameter is not marked correctly, we will replace required annotation with CEL validation rule which marks it as required. So, the impact is:

Default (not Observe Only):

  • 1.25+ => No difference, the user will get warned during resource creation.
  • 1.24- => Resource will be created and we get an error after the first reconciliation

Observe Only:

  • User may create the resource without providing an identifier field but should get an error after the first reconciliation.

I'm a little bit worried that we introduce a new external name configuration field without a robust defaulting for it.

Is there a way to extract this information regarding which Terraform configuration arguments are part of the Terraform ID from the native schema? What do you think?

I agree with your concerns and have already investigated ways to figure this out before introducing a new configuration; however, unfortunately, couldn't find one. I believe this is a similar problem to external name configuration. The information could be inferred from the import documentation, but there is no robust way of automating this without human intervention. So, the best I can find is to leverage that existing configuration to figure the individual parameters out. Happy to try if you have some other ideas.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What I mean with anonymous config.ExternalName instances are those which we anonymously define instead of a reusable definition (such as functions like config.TemplatedStringAsIdentifier or variables like the config.IdentifierFromProvider). An example is here.

As we have discussed offline, we may start getting reports on cases where we cannot enforce required fields at the Kubernetes API level as you mentioned above and although the remedy is to supply a value for the required field at runtime, we will also want to do a manual configuration on config.ExternalName.IdentifierFields as a permanent fix. This looks like to be the best option unless we extend the discussion to XRM or related stuff, so I believe we are good to go.

}

// References represents reference resolver configurations for the fields of a
Expand Down
1 change: 1 addition & 0 deletions pkg/pipeline/crd.go
Original file line number Diff line number Diff line change
Expand Up @@ -90,6 +90,7 @@ func (cg *CRDGenerator) Generate(cfg *config.Resource) (string, error) {
"Kind": cfg.Kind,
"ForProviderType": gen.ForProviderType.Obj().Name(),
"AtProviderType": gen.AtProviderType.Obj().Name(),
"ValidationRules": gen.ValidationRules,
"Path": cfg.Path,
},
"Provider": map[string]string{
Expand Down
1 change: 1 addition & 0 deletions pkg/pipeline/templates/crd_types.go.tmpl
Original file line number Diff line number Diff line change
Expand Up @@ -37,6 +37,7 @@ type {{ .CRD.Kind }}Status struct {
type {{ .CRD.Kind }} struct {
metav1.TypeMeta `json:",inline"`
metav1.ObjectMeta `json:"metadata,omitempty"`
{{- .CRD.ValidationRules }}
Spec {{ .CRD.Kind }}Spec `json:"spec"`
Status {{ .CRD.Kind }}Status `json:"status,omitempty"`
}
Expand Down
33 changes: 28 additions & 5 deletions pkg/types/builder.go
Original file line number Diff line number Diff line change
Expand Up @@ -33,14 +33,17 @@ type Generated struct {

ForProviderType *types.Named
AtProviderType *types.Named

ValidationRules string
}

// Builder is used to generate Go type equivalence of given Terraform schema.
type Builder struct {
Package *types.Package

genTypes []*types.Named
comments twtypes.Comments
genTypes []*types.Named
comments twtypes.Comments
validationRules string
}

// NewBuilder returns a new Builder.
Expand All @@ -59,6 +62,7 @@ func (g *Builder) Build(cfg *config.Resource) (Generated, error) {
Comments: g.comments,
ForProviderType: fp,
AtProviderType: ap,
ValidationRules: g.validationRules,
}, errors.Wrapf(err, "cannot build the Types")
}

Expand Down Expand Up @@ -127,6 +131,11 @@ func (g *Builder) AddToBuilder(typeNames *TypeNames, r *resource) (*types.Named,
obsType := types.NewNamed(typeNames.ObservationTypeName, types.NewStruct(r.obsFields, r.obsTags), nil)
g.genTypes = append(g.genTypes, obsType)

for _, p := range r.topLevelRequiredParams {
g.validationRules += "\n"
g.validationRules += fmt.Sprintf(`// +kubebuilder:validation:XValidation:rule="self.managementPolicy == 'ObserveOnly' || has(self.forProvider.%s)",message="%s is a required parameter"`, p, p)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We will need to bump the crossplane-runtime dependency so that the spec.managementPolicy is available for the MRs. One question is what happens if we have this rule generated for the MRs but the spec.managementPolicy field is missing? I assume it will be safe because self.managementPolicy == 'ObserveOnly' condition will evaluate to false and has(self.forProvider.<argument>) will act just like the previous required marker, thus keeping backward compatibility. I assume we do not error in that case because of an invalid path. Is this really the case?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just tried this out and we get the following error during CRD creation:

The CustomResourceDefinition "instances.rds.aws.upbound.io" is invalid: spec.validation.openAPIV3Schema.properties[spec].x-kubernetes-validations[0].rule: Invalid value: apiextensions.ValidationRule{Rule:"self.managementPolicy == 'ObserveOnly' || has(self.forProvider.instanceClass)", Message:"instanceClass is a required parameter"}: compilation failed: ERROR: <input>:1:5: undefined field 'managementPolicy'
 | self.managementPolicy == 'ObserveOnly' || has(self.forProvider.instanceClass)
 | ....^

We will need to ensure that we are updating crossplane-runtime and crossplane-tools dependency with management policy, as part of rolling out this feature to the providers.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for giving this a try @turkenh.

turkenh marked this conversation as resolved.
Show resolved Hide resolved
}

return paramType, obsType
}

Expand Down Expand Up @@ -260,19 +269,33 @@ func NewTypeNames(fieldPaths []string, pkg *types.Package) (*TypeNames, error) {
type resource struct {
paramFields, obsFields []*types.Var
paramTags, obsTags []string
topLevelRequiredParams []string
}

func (r *resource) addParameterField(f *Field, field *types.Var) {
if f.Schema.Optional {
req := !f.Schema.Optional
// Note(turkenh): We are collecting the top level required parameters that
// are not identifier fields. This is for generating CEL validation rules for
// those parameters and not to require them if the management policy is set
// Observe Only. In other words, if we are not creating or managing the
// resource, we don't need to provide those parameters which are:
// - req => required
// - !f.Identifier => not identifiers - i.e. region, zone, etc.
// - len(f.CanonicalPaths) == 1 => top level, i.e. not a nested field
if req && !f.Identifier && len(f.CanonicalPaths) == 1 {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Let's assume a.b is a required argument that also happens to be a component in the Terraform ID. Let's also assume that a.c is a required argument that's not a component of the resource's Terraform ID. In this case, although we should be generating a CRD validation rule and not the unconditional +kubebuilder:validation:Required marker for a.c, the check on the length of the canonical path will prevent this. Not sure how much a problem this is because I don't recall any examples where a nested field is an identifier field, e.g., a composite type's field happens to be a path component. Please also see the relevant discussion below for types.NewField.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As you noticed, we don't support nested identifier fields right now and I added a todo for the future.

For now, I believe this should be ok due to the following two reasons:

  1. There is no known real case for this.
  2. Even if there were, users would have to specify a.c when they want to observe (or import) that specific resource which is more like a degraded UX than a blocker. There will be no difference for default operation (i.e. FullControl), they have to specify a.c.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed. As discussed offline, our plan is to make manual external-name configurations via config.ExternalName.IdentifierFields as needed.

req = false
r.topLevelRequiredParams = append(r.topLevelRequiredParams, f.TransformedName)
}

f.Comment.Required = &req
if !req {
r.paramTags = append(r.paramTags, fmt.Sprintf(`json:"%s" tf:"%s"`, f.JSONTag, f.TFTag))
} else {
// Required fields should not have omitempty tag in json tag.
// TODO(muvaf): This overrides user intent if they provided custom
// JSON tag.
r.paramTags = append(r.paramTags, fmt.Sprintf(`json:"%s" tf:"%s"`, strings.TrimSuffix(f.JSONTag, ",omitempty"), f.TFTag))
}
req := !f.Schema.Optional
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My understanding is that CRD validation rules have graduated to beta with Kubernetes 1.25. We may not be enforcing required fields at the Kubernetes API level for the previous Kubernetes versions if the corresponding alpha feature is not explicitly enabled.

Also, though practically less important because we will making previously required fields optional, we will also be introducing breaking API changes for such clusters.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My understanding is that CRD validation rules have graduated to beta with Kubernetes 1.25. We may not be enforcing required fields at the Kubernetes API level for the previous Kubernetes versions if the corresponding alpha feature is not explicitly enabled.

Correct. This was already discussed during the design here, and there was an agreement that this is reasonable given the reconciler will report the error from cloud API indicating that a required field is missing after the first reconciliation.

f.Comment.Required = &req
r.paramFields = append(r.paramFields, field)
}

Expand Down
6 changes: 3 additions & 3 deletions pkg/types/builder_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -249,7 +249,7 @@ func TestBuild(t *testing.T) {
},
},
want: want{
forProvider: `type example.Parameters struct{Enable *bool "json:\"enable,omitempty\" tf:\"enable,omitempty\""; ID *int64 "json:\"id\" tf:\"id,omitempty\""; Name *string "json:\"name\" tf:\"name,omitempty\""}`,
forProvider: `type example.Parameters struct{Enable *bool "json:\"enable,omitempty\" tf:\"enable,omitempty\""; ID *int64 "json:\"id,omitempty\" tf:\"id,omitempty\""; Name *string "json:\"name,omitempty\" tf:\"name,omitempty\""}`,
atProvider: `type example.Observation struct{Config *string "json:\"config,omitempty\" tf:\"config,omitempty\""; Enable *bool "json:\"enable,omitempty\" tf:\"enable,omitempty\""; ID *int64 "json:\"id,omitempty\" tf:\"id,omitempty\""; Name *string "json:\"name,omitempty\" tf:\"name,omitempty\""; Value *float64 "json:\"value,omitempty\" tf:\"value,omitempty\""}`,
},
},
Expand Down Expand Up @@ -282,7 +282,7 @@ func TestBuild(t *testing.T) {
},
},
want: want{
forProvider: `type example.Parameters struct{List []*string "json:\"list\" tf:\"list,omitempty\""; ResourceIn map[string]example.ResourceInParameters "json:\"resourceIn\" tf:\"resource_in,omitempty\""}`,
forProvider: `type example.Parameters struct{List []*string "json:\"list,omitempty\" tf:\"list,omitempty\""; ResourceIn map[string]example.ResourceInParameters "json:\"resourceIn,omitempty\" tf:\"resource_in,omitempty\""}`,
atProvider: `type example.Observation struct{List []*string "json:\"list,omitempty\" tf:\"list,omitempty\""; ResourceIn map[string]example.ResourceInParameters "json:\"resourceIn,omitempty\" tf:\"resource_in,omitempty\""; ResourceOut map[string]example.ResourceOutObservation "json:\"resourceOut,omitempty\" tf:\"resource_out,omitempty\""}`,
},
},
Expand Down Expand Up @@ -354,7 +354,7 @@ func TestBuild(t *testing.T) {
},
},
want: want{
forProvider: `type example.Parameters struct{Name *string "json:\"name\" tf:\"name,omitempty\""; ReferenceID *string "json:\"referenceId,omitempty\" tf:\"reference_id,omitempty\""; ExternalResourceID *github.com/crossplane/crossplane-runtime/apis/common/v1.Reference "json:\"externalResourceId,omitempty\" tf:\"-\""; ReferenceIDSelector *github.com/crossplane/crossplane-runtime/apis/common/v1.Selector "json:\"referenceIdSelector,omitempty\" tf:\"-\""}`,
forProvider: `type example.Parameters struct{Name *string "json:\"name,omitempty\" tf:\"name,omitempty\""; ReferenceID *string "json:\"referenceId,omitempty\" tf:\"reference_id,omitempty\""; ExternalResourceID *github.com/crossplane/crossplane-runtime/apis/common/v1.Reference "json:\"externalResourceId,omitempty\" tf:\"-\""; ReferenceIDSelector *github.com/crossplane/crossplane-runtime/apis/common/v1.Selector "json:\"referenceIdSelector,omitempty\" tf:\"-\""}`,
atProvider: `type example.Observation struct{Name *string "json:\"name,omitempty\" tf:\"name,omitempty\""; ReferenceID *string "json:\"referenceId,omitempty\" tf:\"reference_id,omitempty\""}`,
},
},
Expand Down
11 changes: 11 additions & 0 deletions pkg/types/field.go
Original file line number Diff line number Diff line change
Expand Up @@ -32,6 +32,7 @@ type Field struct {
Reference *config.Reference
TransformedName string
SelectorName string
Identifier bool
}

// getDocString tries to extract the documentation string for the specified
Expand Down Expand Up @@ -94,6 +95,16 @@ func NewField(g *Builder, cfg *config.Resource, r *resource, sch *schema.Schema,
AsBlocksMode: asBlocksMode,
}

for _, ident := range cfg.ExternalName.IdentifierFields {
// TODO(turkenh): Could there be a nested identifier field? No, known
// cases so far but we would need to handle that if/once there is one,
// which is missing here.
if ident == snakeFieldName {
turkenh marked this conversation as resolved.
Show resolved Hide resolved
f.Identifier = true
break
}
}

var commentText string
docString := getDocString(cfg, f, tfPath)
if len(docString) > 0 {
Expand Down