Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add example manifest generation pipeline #4

Merged
merged 7 commits into from
Jun 9, 2022

Conversation

ulucinar
Copy link
Collaborator

@ulucinar ulucinar commented Jun 1, 2022

Description of your changes

This PR is a copy of crossplane/terrajet#173

Part of #5.

This PR adds an example manifest generation pipeline. The example manifests are scraped from Terraform registry.

I have:

  • Read and followed Crossplane's contribution process.
  • Run make reviewable to ensure this PR is ready for review.
  • Added backport release-x.y labels to auto-backport this PR if necessary.

How has this code been tested

Please see examples-generated folders in crossplane-contrib/provider-jet-azure#111, crossplane-contrib/provider-jet-aws#138, and crossplane-contrib/provider-jet-gcp#19 for generated manifests.

An example manifest generated for provider-jet-azure:

apiVersion: devices.azure.upbound.io/v1alpha2
kind: IOTHub
metadata:
  name: example
spec:
  forProvider:
    cloudToDevice:
    - defaultTtl: PT1H
      feedback:
      - lockDuration: PT30S
        maxDeliveryCount: 15
        timeToLive: PT1H10M
      maxDeliveryCount: 30
    endpoint:
    - batchFrequencyInSeconds: 60
      connectionStringSecretRef:
        key: attribute.primary_blob_connection_string
        name: example-storage-account
        namespace: crossplane-system
      containerNameRef:
        name: example
      encoding: Avro
      fileNameFormat: '{iothub}/{partition}_{YYYY}_{MM}_{DD}_{HH}_{mm}'
      maxChunkSizeInBytes: 10485760
      name: export
      type: AzureIotHub.StorageContainer
    - connectionStringSecretRef:
        key: attribute.primary_connection_string
        name: example-eventhub-authorization-rule
        namespace: crossplane-system
      name: export2
      type: AzureIotHub.EventHub
    enrichment:
    - endpointNames:
      - export
      - export2
      key: tenant
      value: $twin.tags.Tenant
    location: West Europe
    resourceGroupNameRef:
      name: example
    route:
    - condition: "true"
      enabled: true
      endpointNames:
      - export
      name: export
      source: DeviceMessages
    - condition: "true"
      enabled: true
      endpointNames:
      - export2
      name: export2
      source: DeviceMessages
    sku:
    - capacity: "1"
      name: S1
    tags:
      purpose: testing
  providerConfigRef:
    name: example

Corresponding Terraform resource from registry:

resource "azurerm_iothub" "example" {
  name                = "Example-IoTHub"
  resource_group_name = azurerm_resource_group.example.name
  location            = azurerm_resource_group.example.location

  sku {
    name     = "S1"
    capacity = "1"
  }

  endpoint {
    type                       = "AzureIotHub.StorageContainer"
    connection_string          = azurerm_storage_account.example.primary_blob_connection_string
    name                       = "export"
    batch_frequency_in_seconds = 60
    max_chunk_size_in_bytes    = 10485760
    container_name             = azurerm_storage_container.example.name
    encoding                   = "Avro"
    file_name_format           = "{iothub}/{partition}_{YYYY}_{MM}_{DD}_{HH}_{mm}"
  }

  endpoint {
    type              = "AzureIotHub.EventHub"
    connection_string = azurerm_eventhub_authorization_rule.example.primary_connection_string
    name              = "export2"
  }

  route {
    name           = "export"
    source         = "DeviceMessages"
    condition      = "true"
    endpoint_names = ["export"]
    enabled        = true
  }

  route {
    name           = "export2"
    source         = "DeviceMessages"
    condition      = "true"
    endpoint_names = ["export2"]
    enabled        = true
  }

  enrichment {
    key            = "tenant"
    value          = "$twin.tags.Tenant"
    endpoint_names = ["export", "export2"]
  }

  cloud_to_device {
    max_delivery_count = 30
    default_ttl        = "PT1H"
    feedback {
      time_to_live       = "PT1H10M"
      max_delivery_count = 15
      lock_duration      = "PT30S"
    }
  }

  tags = {
    purpose = "testing"
  }
}

Signed-off-by: Alper Rifat Ulucinar <[email protected]>
(cherry picked from commit af0dc440500da20e163c666f6d5333866d8bb573)
Signed-off-by: Alper Rifat Ulucinar <[email protected]>
(cherry picked from commit 0ad041887b062973f041c7d5cf16939000af4418)
Signed-off-by: Alper Rifat Ulucinar <[email protected]>
(cherry picked from commit 0f9468d385e58b1fb2a0772aece828842f878a43)
@ulucinar ulucinar force-pushed the expipeline branch 2 times, most recently from 1837220 to d55cf71 Compare June 1, 2022 22:43
Copy link
Member

@muvaf muvaf left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @ulucinar ! There are a couple of things discussed here but we'll deal with those in separate PRs as I understand. I tried to touch only the parts where I think could change the API of Upjet, hence affect official provider implementations. Though it's fine compared to Terrajet since it's closed source.

pkg/config/provider.go Outdated Show resolved Hide resolved
// Generated is a struct that holds generated types
type Generated struct {
Types []*types.Named
Comments twtypes.Comments

ForProviderType *types.Named
AtProviderType *types.Named

FieldTransformations map[string]Transformation
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From what I understand, this is used to transfer the information of reference fields using field paths from CRD generator to example generator. And we have a similar information in Reference which is actually the source of truth of this information for the CRD generator as well. And sensitive also comes from schema, which is part of the configuration. Can we somehow use that configuration field in the example generator as well? This would keep example generator independent of CRD generator.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unfortunately config.Reference is not used consistently for generating secret references and the transformed name we are interested in is computed on the fly here and not stored on its own except for the json tag. I ended up storing (caching) the transformed name in a new types.Field attribute named TransformedName.

I think, with a future refactoring, we may consider unifying secret references and other cross-resource references. This would also benefit what we are trying to achieve here. I did not want to do refactorings with this change.

Another alternative I've considered was to use one of the existing fields to store (cache) the transformed name for the fields. One such candidate, for instance, was to use Field.Name.LowerCamelComputed (which already has the desired value for non-reference fields). But I did not want to change the existing assumptions on this field (such as for referencer fields, the target field's lower camel does not have the ref-suffix).

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Opened this issue to track this & improvements for future pipelines.

@@ -146,6 +166,14 @@ func WithDefaultResourceFn(f DefaultResourceFn) ProviderOption {
}
}

// WithProviderMetadata configures the Terraform metadata file scraped
// from the Terraform registry
func WithProviderMetadata(metadataPath string) ProviderOption {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In my mind, I'm trying to see schema and metadata very similar to have a conceptual convergence about how things work. Schema is required since there is no TF provider without a schema. I wonder whether that's true for metadata as well given that we literally use TF registry to find TF providers. So, I'm not sure there is an outstanding scenario that we'd use Upjet for a TF provider that does not have metadata. I think it's worth considering to make metadata as a direct argument to NewProviderWithSchema similar to schema and distribute the metadata to individual resources, ready to be used.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed config.WithProviderMetadata and made the metadata document path a parameter to config.NewProviderWithSchema. However, we still allow the path to be zero-valued because we may not have the metadata available for all of the official providers at a given time (even if we want to handcraft metadata docs for small providers).

pkg/config/provider.go Outdated Show resolved Hide resolved
pkg/pipeline/example.go Outdated Show resolved Hide resolved
}

// Generate generates an example manifest for the specified Terraform resource.
func (eg *ExampleGenerator) Generate(group, version string, r *config.Resource, fieldTransformations map[string]tjtypes.Transformation) error {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just to confirm, we have Generate and StoreExamples separate because of the global catalog approach, right? Once we move to scoping references of an example to its own HCL block, we could be done with a single Generate function like others?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, currently we resolve any references globally so we store example manifests with their references resolved after we have generated CRDs for all resources.

For test automation, when we switch to an implementation in which we use example manifests for the dependencies specified together with the target resource, we may start resolving references locally ("local resolution"). But this may not hold for all resources, i.e., for some resources, Terraform registry may not contain example manifests for the dependencies locally (specified together with the target resource's manifest). Thus, first doing the resolution locally and then globally (prioritizing local resolution over global resolution but also keeping global resolution) would provide better coverage potentially requiring less manual configuration. Let's think about this in a future PR implementing local resolution.

return errors.Wrap(eg.resolveReferences(pm.paved.UnstructuredContent()), "failed to resolve references of paved")
}

func (eg *ExampleGenerator) resolveReferences(params map[string]interface{}) error { // nolint:gocyclo
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are two classes of references in the examples from what I've seen:

  • Referencing of a static value, i.e. location from ResourceGroup
  • Referencing the actual object, i.e. ID of ResourceGroup

We're resolving references only for the static values here, right? There isn't any code here that will infer actual object reference that could potentially populate References configuration field?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, the context here is to generate the example manifests and the sort of reference-resolving we do here result in segments like:

   - connectionStringSecretRef:
        key: attribute.primary_connection_string
        name: example-eventhub-authorization-rule
        namespace: crossplane-system
...
    masterPasswordSecretRef:
      key: example-key
      name: example-secret
      namespace: crossplane-system
    resourceGroupNameRef:
      name: example
    serverNameRef:
      name: example

where in the last fragment, serverNameRef is a reference to a PostgreSQL server resource.

For automatically populating referencer fields from the provided example HCL configurations, the proposed metadata document format contains the examples[].references array.

In this implementation, we do not use this metadata extracted from example HCL configurations to generate cross-resource referencer fields. But as an example we have something like:

             ...
             references:
                backup_policy_id: azurerm_data_protection_backup_policy_postgresql.id
                database_credential_key_vault_secret_id: azurerm_key_vault_secret.versionless_id
                database_id: azurerm_postgresql_database.id
                location: azurerm_resource_group.location
                vault_id: azurerm_data_protection_backup_vault.id

for the azurerm_data_protection_backup_instance_postgresql resource extracted as metadata.

pkg/pipeline/example.go Outdated Show resolved Hide resolved
- Instead, metadata document path is a direct parameter of config.NewProviderWithSchema

Signed-off-by: Alper Rifat Ulucinar <[email protected]>
Copy link
Member

@muvaf muvaf left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @ulucinar ! As discussed offline, inferring the ref/sensitive/fieldname information (Transformation struct) from the schema is possible but we decided to go forward with the current iteration and address it later.

@ulucinar
Copy link
Collaborator Author

ulucinar commented Jun 9, 2022

Thank you @muvaf.

@ulucinar ulucinar merged commit 1a4a079 into crossplane:main Jun 9, 2022
@ulucinar ulucinar deleted the expipeline branch June 9, 2022 14:47
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants