One-click GCP stack deployments #2833

Merged
merged 20 commits into from
Jul 11, 2024

@stefannica stefannica commented Jul 8, 2024

Describe changes

Adding GCP to the list of cloud providers for which ZenML supports "one-click" full stack deployments - ZenML stacks with attached infrastructure provisioned through automated in-browser cloud-specific mechanisms, in this case GCP Cloud Shell and GCP Deployment Manager.

This allows users to easily deploy a full GCP ZenML stack with all the associated infrastructure and credentials:

  • from the CLI with the zenml stack deploy -p gcp command
  • from the dashboard, using the "Smart stack setup" workflows

The GCP ZenML stack includes:

  • GCS bucket as artifact store
  • Artifact Registry as container registry
  • Vertex AI as orchestrator
  • GCP Cloud Build as image builder

Example CLI output:

$ zenml stack deploy -p gcp --set --name gcp-stefan --location europe-west2

GCP ZenML Cloud Stack Deployment                                                                                                                                                                                   
================================

Provision and register a basic GCP ZenML stack authenticated and connected to all the necessary cloud infrastructure resources required to run pipelines in GCP.                                                   

Instructions                                                                                                                                                                                                       

You will be redirected to a GCP Cloud Shell console in your browser where you'll be asked to log into your GCP project and then create a Deployment Manager deployment to provision the necessary cloud resources  
for ZenML.                                                                                                                                                                                                         

NOTE: The Deployment Manager deployment will create the following new resources in your GCP project. Please ensure you have the necessary permissions and are aware of any potential costs:                        

 • A GCS bucket registered as a ZenML artifact store.                                                                                                                                                              
 • A Google Artifact Registry registered as a ZenML container registry.                                                                                                                                            
 • Vertex AI registered as a ZenML orchestrator.                                                                                                                                                                   
 • GCP Cloud Build registered as a ZenML image builder.                                                                                                                                                            
 • A GCP Service Account with the minimum necessary permissions to access the above resources.                                                                                                                     
 • An GCP Service Account access key used to give access to ZenML to connect to the above resources through a ZenML service connector.                                                                             

The Deployment Manager deployment will automatically create a GCP Service Account secret key and will share it with ZenML to give it permission to access the resources created by the stack. You can revoke these 
permissions at any time by deleting the Deployment Manager deployment in the GCP Cloud Console.                                                                                                                    

Estimated costs                                                                                                                                                                                                    

A small training job would cost around: $0.60                                                                                                                                                                      

These are rough estimates and actual costs may vary based on your usage and specific GCP pricing. Some services may be eligible for the GCP Free Tier. Use the GCP Pricing Calculator for a detailed estimate based
on your usage.                                                                                                                                                                                                     

⚠️ The Cloud Shell session will warn you that the ZenML GitHub repository is untrusted. We recommend that you review the contents of the repository and then check the Trust repo checkbox to proceed with the      
deployment, otherwise the Cloud Shell session will not be authenticated to access your GCP projects.                                                                                                               

💡 After the Deployment Manager deployment is complete, you can close the Cloud Shell session and return to the CLI to view details about the associated ZenML stack automatically registered with ZenML.          

Configuration                                                                                                                                                                                                      

You will be asked to provide the following configuration values during the deployment process:                                                                                                                     

 
### BEGIN CONFIGURATION ###
ZENML_STACK_NAME=my-stack
ZENML_STACK_REGION=europe-west2
ZENML_SERVER_URL=https://...-zenml.cloudinfra.zenml.io
ZENML_SERVER_API_TOKEN=....
### END CONFIGURATION ###


Proceed to continue with the deployment. You will be automatically redirected to GCP in your browser. [y/n]: y
If your browser did not open automatically, please open the following URL into your browser to deploy the stack to GCP: GCP Cloud Shell Console.                                                                   


Waiting for the deployment to complete and the stack to be registered. Press CTRL+C to abort...
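
The final "Waiting for the deployment to complete" step amounts to polling until the stack shows up as registered. A minimal sketch of such a loop follows; `wait_for_stack`, `fetch_stack`, and the default timeout and interval are hypothetical names and values chosen for illustration, not the CLI's actual implementation.

```python
import time
from typing import Callable, Optional


def wait_for_stack(
    fetch_stack: Callable[[], Optional[str]],
    timeout: float = 3600.0,
    interval: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll until fetch_stack() returns a stack name or the timeout expires.

    fetch_stack is a hypothetical callable that returns None while the
    stack has not been registered yet.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        stack = fetch_stack()
        if stack is not None:
            return stack
        sleep(interval)
    raise TimeoutError("Stack was not registered before the timeout expired.")
```

Passing `sleep` as a parameter keeps the loop easy to test without real delays.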

Implementation Details

This PR mainly uses the existing code structure already put in place with the introduction of AWS full stack deployments in the previous release, with the following minor modifications:

  • the stack deployment API endpoint is modified to also return an optional configuration string in addition to the deployment URL. Users are responsible for copy-pasting this configuration from the CLI/dashboard into the cloud provider console when the provider offers no way to pass this information automatically through URL query parameters (GCP is one of those cases).
  • the stack deployment API endpoint also returns the list of integrations required by the stack
  • the GCP service connector had to be adjusted to allow credentials to be passed in base64-encoded format
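
The first point can be sketched roughly as follows. This is an illustration only, not the PR's actual model: the class name `DeploymentConfigSketch`, its field names, and `render_config_block` are hypothetical. Only the `### BEGIN CONFIGURATION ###` layout and the `ZENML_*` keys come from the CLI output above.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class DeploymentConfigSketch:
    """Hypothetical stand-in for what the deployment endpoint returns."""

    deployment_url: str
    # Only set for providers that cannot receive parameters through the URL.
    configuration: Optional[str] = None


def render_config_block(values: Dict[str, str]) -> str:
    """Render key/value pairs in the copy-paste format shown by the CLI."""
    lines = ["### BEGIN CONFIGURATION ###"]
    lines += [f"{key}={value}" for key, value in values.items()]
    lines.append("### END CONFIGURATION ###")
    return "\n".join(lines)


config = DeploymentConfigSketch(
    deployment_url="https://example.invalid/cloud-shell",  # placeholder URL
    configuration=render_config_block(
        {
            "ZENML_STACK_NAME": "my-stack",
            "ZENML_STACK_REGION": "europe-west2",
        }
    ),
)
```

A provider that can pass everything through the URL would simply leave `configuration` unset.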

The GCP stack is provisioned through a GCP Cloud Shell session with an included Markdown tutorial and Bash script that the user needs to update and execute manually to deploy a GCP Deployment Manager template. This complication was required due to the following factors:

  • no way to trigger a GCP Deployment Manager template creation directly through a URL
  • no way to include custom parameters in the GCP Cloud Shell URL

Side-changes

  • fix the "publish AWS stack deployment templates" GitHub action
  • introduce a similar GitHub action step for the GCP-related artifacts
  • add a local image builder to the full AWS ZenML stack registered by the AWS CloudFormation template (for completeness' sake)
  • include the AWS account name in the name of the S3 bucket provisioned by the AWS CloudFormation template. This is done to avoid S3 namespace clashes (note: all S3 bucket names share the same global namespace regardless of AWS account, region or owner).
  • include more instructions in the zenml stack deploy CLI after the full stack is deployed: how to activate the stack, which integrations to install
  • include cost estimation in the zenml stack deploy CLI
  • add support to allow GCP service connector credentials to be provided as base64 encoded JSON strings. This is necessary because this is the ONLY format supported by the GCP Deployment Manager service account key resource.
  • increase the lifetime of the ZenML API token generated for cloud stack deployments from 1 hour to 6 hours, to give users more time to go through the stack deployment process.
  • fix a minor bug in the zenml DB upgrade logic that caused DB upgrade failures in development
  • validate that the stack component flavors mentioned in the "full stack registration" endpoint are valid

Pre-requisites

Please ensure you have done the following:

  • I have read the CONTRIBUTING.md document.
  • If my change requires a change to docs, I have updated the documentation accordingly.
  • I have added tests to cover my changes.
  • I have based my new branch on develop and the open PR is targeting develop. If your branch wasn't based on develop read Contribution guide on rebasing branch to develop.
  • If my changes require changes to the dashboard, these changes are communicated/requested.

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Other (add details above)

coderabbitai bot commented Jul 8, 2024

Important

Review skipped

Auto reviews are disabled on this repository.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Walkthrough

This update significantly enhances the ZenML stack deployment process across AWS and GCP, introducing detailed configuration steps, new deployment scripts, and updated documentation for streamlined cloud deployments. It also incorporates error handling improvements, deployment support for GCP, and enhancements to integration and validation mechanisms.

Changes

| File Path | Change Summary |
| --- | --- |
| .github/workflows/publish_stack_templates.yml | Added permissions for contents to read and id-token to write in the workflow job. |
| docs/book/.../image-builders/gcp.md | Added a hint block to streamline the deployment process using cloud deployment and registration wizards. |
| docs/book/.../deploy-a-cloud-stack.md | Enhanced deployment instructions for ZenML stacks on GCP, detailing steps, warnings, and provisioning resources via Deployment Manager. |
| infra/README.md | Added detailed configuration steps and parameterized templates for AWS and GCP deployment scripts. |
| infra/aws/aws-ecr-s3-sagemaker.yaml | Updated resource naming conventions and added new configurations, including an "image_builder" configuration with a "local" flavor attribute. |
| infra/gcp/... | Introduced a deployment tutorial, script updates, and new YAML configurations for GCP. |
| src/zenml/cli/stack.py | Improved error handling for invalid deployment locations, updated messaging, and streamlined stack deployment instructions. |
| src/zenml/constants.py | Added new constants for configuration paths and token expiration. |
| src/zenml/enums.py | Added GCP to the StackDeploymentProvider enum. |
| src/zenml/.../gcp_service_connector.py | Updated credential classes to support base64 encoding for JSON fields, enhancing validation functions accordingly. |
| src/zenml/models/__init__.py | Added imports and exported entities for StackDeploymentConfig. |
| src/zenml/models/.../stack_deployment.py | Added new fields to StackDeploymentInfo and StackDeploymentConfig classes for better deployment details and configuration. |
| src/zenml/.../aws_stack_deployment.py | Updated descriptions, instructions, and method signatures for AWS stack deployment, including integration requirements and cost estimates. |
| src/zenml/.../gcp_stack_deployment.py | Introduced functionality for deploying ZenML stacks to GCP with methods for managing deployment processes and configurations. |
| src/zenml/.../stack_deployment.py | Enhanced ZenMLCloudStackDeployment class with new variables and methods for deployment configuration, integration requirements, and stack retrieval. |
| src/zenml/.../utils.py | Included GCPZenMLCloudStackDeployment in stack deployment providers. |
| src/zenml/.../stack_deployment_endpoints.py | Refactored endpoint logic for stack deployment config retrieval and token expiration management. |
| src/zenml/.../rest_zen_store.py | Added StackDeploymentConfig to return types and updated methods for improved stack deployment handling and error management. |
| src/zenml/.../sql_zen_store.py | Modified database migration logic, error handling, and configuration management in stack deployment methods. |
| src/zenml/.../zen_store_interface.py | Updated get_stack_deployment_config method signature to return StackDeploymentConfig. |

Sequence Diagram(s)

sequenceDiagram
    participant User
    participant CLI
    participant ZenMLServer
    participant CloudProvider

    User ->> CLI: Initiates Deployment
    CLI ->> ZenMLServer: Request Deployment Configuration
    ZenMLServer ->> CLI: Returns DeploymentConfig
    CLI ->> CloudProvider: Provision Resources
    CloudProvider -->> CLI: Resources Provisioned
    CLI ->> ZenMLServer: Register Deployment
    ZenMLServer -->> CLI: Deployment Registered
    CLI ->> User: Deployment Complete

Poem

🐰 In the cloud where data thrives,
With scripts and stacks, ZenML arrives.
AWS and GCP, both in stride,
A seamless journey, a smoother ride.
Errors handled, configs enhanced,
To new deployments, let’s advance! 🚀


@github-actions github-actions bot added internal To filter out internal PRs and issues enhancement New feature or request labels Jul 8, 2024
@stefannica stefannica force-pushed the feature/prd-482-gcp-stack-deployment branch from 279f679 to a15cf02 Compare July 9, 2024 14:47
@stefannica

@coderabbitai review

coderabbitai bot commented Jul 10, 2024

Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 10

Outside diff range, codebase verification and nitpick comments (24)
infra/gcp/main.py (1)

16-16: Update the docstring to match the new return type.

The function docstring should be updated to reflect the new return type Tuple[Any, int].

     Returns:
-        Dict[str, str]: A dictionary containing the status of the script and a
+        Tuple[Any, int]: A tuple containing the status of the script and a
             message.
infra/README.md (5)

7-7: Add a comma before "and".

A comma is needed before "and" to connect two independent clauses.

- A Cloud Formation template is used to provision the infrastructure in AWS. The template is parameterized and the user is prompted to provide the necessary values during the CLI / dashboard deployment process.
+ A Cloud Formation template is used to provision the infrastructure in AWS. The template is parameterized, and the user is prompted to provide the necessary values during the CLI / dashboard deployment process.
Tools
LanguageTool

[uncategorized] ~7-~7: Use a comma before “and” if it connects two independent clauses (unless they are closely connected and short).
Context: ...re in AWS. The template is parameterized and the user is prompted to provide the nec...

(COMMA_COMPOUND_SENTENCE_2)


13-13: Avoid bare URL.

Use a markdown link instead of a bare URL.

- The Cloud Formation template is uploaded to AWS S3 using a GitHub action during the release process at the following location: https://zenml-cf-templates.s3.eu-central-1.amazonaws.com/aws-ecr-s3-sagemaker.yaml
+ The Cloud Formation template is uploaded to AWS S3 using a GitHub action during the release process at the following location: [AWS S3](https://zenml-cf-templates.s3.eu-central-1.amazonaws.com/aws-ecr-s3-sagemaker.yaml)
Tools
Markdownlint

13-13: null
Bare URL used

(MD034, no-bare-urls)


17-17: Add a comma before "and".

A comma is needed before "and" to connect two independent clauses.

- A Deployment Manager template is used to provision the infrastructure in GCP. The template is parameterized and the user is prompted to provide the necessary values during the CLI / dashboard deployment process.
+ A Deployment Manager template is used to provision the infrastructure in GCP. The template is parameterized, and the user is prompted to provide the necessary values during the CLI / dashboard deployment process.
Tools
LanguageTool

[uncategorized] ~17-~17: Use a comma before “and” if it connects two independent clauses (unless they are closely connected and short).
Context: ...re in GCP. The template is parameterized and the user is prompted to provide the nec...

(COMMA_COMPOUND_SENTENCE_2)


[uncategorized] ~17-~17: Use a comma before “and” if it connects two independent clauses (unless they are closely connected and short).
Context: ...CP Cloud Shell session is opened instead and the user is provided with a set of conf...

(COMMA_COMPOUND_SENTENCE_2)


17-17: Add a comma before "and".

A comma is needed before "and" to connect two independent clauses.

- Given that there is no way to trigger a Deployment Manager template creation directly using a URL, a GCP Cloud Shell session is opened instead and the user is provided with a set of configuration values that they have to manually copy and paste into the deployment script.
+ Given that there is no way to trigger a Deployment Manager template creation directly using a URL, a GCP Cloud Shell session is opened instead, and the user is provided with a set of configuration values that they have to manually copy and paste into the deployment script.


23-23: Consider using a shorter alternative.

The phrase "In order to be able to" can be shortened to "To".

- `Note:` In order to be able to install the ZenML stack successfully, you need to have billing enabled for your project.
+ `Note:` To install the ZenML stack successfully, you need to have billing enabled for your project.
infra/gcp/gcp-gar-gcs-vertex.md (3)

48-48: Specify language for fenced code blocks.

Specify the language for the fenced code block.

- ```
+ ```plain
Tools
Markdownlint

48-48: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)


58-58: Add a comma before "as".

A comma is needed before "as" to clarify the sentence.

- ⚠️ Please make sure the `ZENML_SERVER_API_TOKEN` value is not broken into multiple lines as this will lead to errors !
+ ⚠️ Please make sure the `ZENML_SERVER_API_TOKEN` value is not broken into multiple lines, as this will lead to errors!
Tools
LanguageTool

[uncategorized] ~58-~58: Possible missing comma found.
Context: ...OKEN` value is not broken into multiple lines as this will lead to errors ! ## Deplo...

(AI_HYDRA_LEO_MISSING_COMMA)


81-81: Add a comma before "and".

A comma is needed before "and" to connect two independent clauses.

- The ZenML stack has also been automatically registered with your ZenML server and you may now close the Cloud Shell session and switch back to the ZenML dashboard or the ZenML CLI to continue your workflow.
+ The ZenML stack has also been automatically registered with your ZenML server, and you may now close the Cloud Shell session and switch back to the ZenML dashboard or the ZenML CLI to continue your workflow.
Tools
LanguageTool

[uncategorized] ~81-~81: Use a comma before ‘and’ if it connects two independent clauses (unless they are closely connected and short).
Context: ...ically registered with your ZenML server and you may now close the Cloud Shell sessi...

(COMMA_COMPOUND_SENTENCE)

src/zenml/models/v2/misc/stack_deployment.py (2)

43-47: Typo in description field.

The description field is split across two lines, which may cause readability issues.

-        description="The list of ZenML integrations that need to be installed "
-        "for the stack to be usable.",
+        description="The list of ZenML integrations that need to be installed for the stack to be usable.",

60-73: Missing description for configuration field.

The field configuration is missing a description.

-    configuration: Optional[str] = Field(
+    configuration: Optional[str] = Field(
+        description="Optional configuration for the stack deployment that the user must manually configure into the cloud provider console.",
infra/gcp/gcp-gar-gcs-vertex-deploy.sh (1)

125-133: Improve error message clarity.

Enhance the error message to provide more context.

-    echo "ERROR: The deployment failed. Please check the logs for more information."
+    echo "ERROR: The deployment failed. Please check the GCP Deployment Manager logs for more information."
src/zenml/stack_deployments/aws_stack_deployment.py (2)

14-14: Consider removing the docstring.

The docstring is redundant as it does not add meaningful information beyond the filename and module name.


Line range hint 209-253:
Consider refactoring parameter encoding logic.

The parameter encoding logic can be refactored into a separate utility function for better readability and maintainability.

from typing import Dict
from urllib.parse import urlencode

def encode_params(params: Dict[str, str]) -> str:
    """Encode parameters as URL query parameters, percent-escaping keys and values."""
    return urlencode(params)

# Usage
query_params = encode_params(params)
src/zenml/stack_deployments/gcp_stack_deployment.py (1)

14-14: Consider removing the docstring.

The docstring is redundant as it does not add meaningful information beyond the filename and module name.

docs/book/component-guide/image-builders/gcp.md (3)

19-24: Consider rephrasing for conciseness.

The phrase "In order to" can be replaced with "To" for conciseness.

- In order to use the ZenML Google Cloud image builder you need to enable Google Cloud Build relevant APIs on the Google Cloud project.
+ To use the ZenML Google Cloud image builder you need to enable Google Cloud Build relevant APIs on the Google Cloud project.

25-25: Consider removing unnecessary word.

The word "In" is unnecessary and can be removed for better readability.

- In order to use the ZenML Google Cloud image builder you need to enable Google Cloud Build relevant APIs on the Google Cloud project.
+ To use the ZenML Google Cloud image builder you need to enable Google Cloud Build relevant APIs on the Google Cloud project.
Tools
LanguageTool

[style] ~25-~25: Consider a shorter alternative to avoid wordiness.
Context: ...er this stack component. {% endhint %} In order to use the ZenML Google Cloud image builde...

(IN_ORDER_TO_PREMIUM)


Line range hint 93-93:
Separate "otherwise" for clarity.

The word "otherwise" should be separated from the sentence for clarity.

- Trust repo` checkbox to proceed with the deployment, otherwise the Cloud Shell session will not be authenticated to access your GCP projects.
+ Trust repo` checkbox to proceed with the deployment. Otherwise, the Cloud Shell session will not be authenticated to access your GCP projects.

docs/book/how-to/stack-deployment/deploy-a-cloud-stack.md (1)

281-281: Consider using a synonym for "give".

The word "give" can be replaced with "provide" for better clarity.

- An GCP Service Account access key used to give access to ZenML to connect to the above resources through a ZenML service connector.
+ An GCP Service Account access key used to provide access to ZenML to connect to the above resources through a ZenML service connector.
Tools
LanguageTool

[style] ~281-~281: Try using a synonym here to strengthen your writing.
Context: ... GCP Service Account access key used to give access to ZenML to connect to the above...

(GIVE_PROVIDE)

src/zenml/integrations/gcp/service_connectors/gcp_service_connector.py (3)

Line range hint 93-131:
Ensure consistent base64 encoding validation.

The method validate_user_account_dict correctly handles base64 decoding but only checks if the string matches a base64 pattern. Consider using a more robust method to ensure the string is a valid base64 encoded JSON.

import base64

def is_base64(sb):
    try:
        if isinstance(sb, str):
            sb_bytes = bytes(sb, 'utf-8')
        elif isinstance(sb, bytes):
            sb_bytes = sb
        else:
            raise ValueError("Input should be a string or bytes")
        return base64.b64encode(base64.b64decode(sb_bytes)) == sb_bytes
    except Exception:
        return False

# In validate_user_account_dict method:
elif isinstance(user_account_json, str):
    # Check if the user account JSON is base64 encoded and decode it
    if is_base64(user_account_json):
        try:
            data["user_account_json"] = base64.b64decode(
                user_account_json
            ).decode("utf-8")
        except Exception as e:
            raise ValueError(
                f"Failed to decode base64 encoded user account JSON: {e}"
            )

Line range hint 186-227:
Ensure consistent base64 encoding validation.

The method validate_service_account_dict correctly handles base64 decoding but only checks if the string matches a base64 pattern. Consider using a more robust method to ensure the string is a valid base64 encoded JSON.

# In validate_service_account_dict method:
elif isinstance(service_account_json, str):
    # Check if the service account JSON is base64 encoded and decode it
    if is_base64(service_account_json):
        try:
            data["service_account_json"] = base64.b64decode(
                service_account_json
            ).decode("utf-8")
        except Exception as e:
            raise ValueError(
                f"Failed to decode base64 encoded service account JSON: {e}"
            )

Line range hint 290-331:
Ensure consistent base64 encoding validation.

The method validate_service_account_dict correctly handles base64 decoding but only checks if the string matches a base64 pattern. Consider using a more robust method to ensure the string is a valid base64 encoded JSON.

# In validate_service_account_dict method:
elif isinstance(external_account_json, str):
    # Check if the external account JSON is base64 encoded and decode it
    if is_base64(external_account_json):
        try:
            data["external_account_json"] = base64.b64decode(
                external_account_json
            ).decode("utf-8")
        except Exception as e:
            raise ValueError(
                f"Failed to decode base64 encoded external account JSON: {e}"
            )
src/zenml/zen_stores/zen_store_interface.py (1)

2272-2278: Docstring improvement: Add details about the return type.

The docstring should include details about the return type for better clarity.

-        """Return the cloud provider console URL and configuration needed to deploy the ZenML stack.
+        """Return the cloud provider console URL and configuration needed to deploy the ZenML stack.
+
+        Args:
+            provider: The stack deployment provider.
+            stack_name: The name of the stack.
+            location: The location where the stack should be deployed.
+
+        Returns:
+            StackDeploymentConfig: The cloud provider console URL and configuration needed to deploy
+            the ZenML stack to the specified cloud provider.
src/zenml/zen_stores/rest_zen_store.py (1)

Line range hint 2889-2908: Ensure proper error handling for external calls.

The get_stack_deployment_stack method makes an external call using the self.get method. Consider adding error handling to manage potential issues with the external call, such as network failures or invalid responses.

        body = self.get(
            f"{STACK_DEPLOYMENT}{STACK}",
            params=params,
        )
+       if not body:
+           raise ValueError("Failed to retrieve stack deployment stack.")
Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL

Commits

Files that changed from the base of the PR and between 9060c52 and 08b72c9.

Files selected for processing (23)
  • .github/workflows/publish_stack_templates.yml (1 hunks)
  • docs/book/component-guide/image-builders/gcp.md (1 hunks)
  • docs/book/how-to/stack-deployment/deploy-a-cloud-stack.md (5 hunks)
  • infra/README.md (1 hunks)
  • infra/aws/aws-ecr-s3-sagemaker.yaml (4 hunks)
  • infra/gcp/gcp-gar-gcs-vertex-deploy.sh (1 hunks)
  • infra/gcp/gcp-gar-gcs-vertex.jinja (4 hunks)
  • infra/gcp/gcp-gar-gcs-vertex.md (1 hunks)
  • infra/gcp/main.py (3 hunks)
  • src/zenml/cli/stack.py (4 hunks)
  • src/zenml/constants.py (2 hunks)
  • src/zenml/enums.py (1 hunks)
  • src/zenml/integrations/gcp/service_connectors/gcp_service_connector.py (7 hunks)
  • src/zenml/models/__init__.py (2 hunks)
  • src/zenml/models/v2/misc/stack_deployment.py (2 hunks)
  • src/zenml/stack_deployments/aws_stack_deployment.py (7 hunks)
  • src/zenml/stack_deployments/gcp_stack_deployment.py (1 hunks)
  • src/zenml/stack_deployments/stack_deployment.py (3 hunks)
  • src/zenml/stack_deployments/utils.py (1 hunks)
  • src/zenml/zen_server/routers/stack_deployment_endpoints.py (5 hunks)
  • src/zenml/zen_stores/rest_zen_store.py (4 hunks)
  • src/zenml/zen_stores/sql_zen_store.py (5 hunks)
  • src/zenml/zen_stores/zen_store_interface.py (2 hunks)
Files skipped from review due to trivial changes (3)
  • .github/workflows/publish_stack_templates.yml
  • src/zenml/enums.py
  • src/zenml/zen_server/routers/stack_deployment_endpoints.py
Additional context used
Path-based instructions (14)
src/zenml/stack_deployments/utils.py (1)

Pattern src/zenml/**/*.py: Review the Python code for conformity with Python best practices.

src/zenml/models/v2/misc/stack_deployment.py (1)

Pattern src/zenml/**/*.py: Review the Python code for conformity with Python best practices.

src/zenml/stack_deployments/stack_deployment.py (1)

Pattern src/zenml/**/*.py: Review the Python code for conformity with Python best practices.

src/zenml/stack_deployments/aws_stack_deployment.py (1)

Pattern src/zenml/**/*.py: Review the Python code for conformity with Python best practices.

src/zenml/stack_deployments/gcp_stack_deployment.py (1)

Pattern src/zenml/**/*.py: Review the Python code for conformity with Python best practices.

docs/book/component-guide/image-builders/gcp.md (1)

Pattern docs/**/*.md: Review the documentation for readability and clarity.

docs/book/how-to/stack-deployment/deploy-a-cloud-stack.md (1)

Pattern docs/**/*.md: Review the documentation for readability and clarity.

src/zenml/constants.py (1)

Pattern src/zenml/**/*.py: Review the Python code for conformity with Python best practices.

src/zenml/models/__init__.py (1)

Pattern src/zenml/**/*.py: Review the Python code for conformity with Python best practices.

src/zenml/cli/stack.py (1)

Pattern src/zenml/**/*.py: Review the Python code for conformity with Python best practices.

src/zenml/integrations/gcp/service_connectors/gcp_service_connector.py (1)

Pattern src/zenml/**/*.py: Review the Python code for conformity with Python best practices.

src/zenml/zen_stores/zen_store_interface.py (1)

Pattern src/zenml/**/*.py: Review the Python code for conformity with Python best practices.

src/zenml/zen_stores/rest_zen_store.py (1)

Pattern src/zenml/**/*.py: Review the Python code for conformity with Python best practices.

src/zenml/zen_stores/sql_zen_store.py (1)

Pattern src/zenml/**/*.py: Review the Python code for conformity with Python best practices.

LanguageTool
infra/README.md

[uncategorized] ~7-~7: Use a comma before “and” if it connects two independent clauses (unless they are closely connected and short).
Context: ...re in AWS. The template is parameterized and the user is prompted to provide the nec...

(COMMA_COMPOUND_SENTENCE_2)


[uncategorized] ~17-~17: Use a comma before “and” if it connects two independent clauses (unless they are closely connected and short).
Context: ...re in GCP. The template is parameterized and the user is prompted to provide the nec...

(COMMA_COMPOUND_SENTENCE_2)


[uncategorized] ~17-~17: Use a comma before “and” if it connects two independent clauses (unless they are closely connected and short).
Context: ...CP Cloud Shell session is opened instead and the user is provided with a set of conf...

(COMMA_COMPOUND_SENTENCE_2)

infra/gcp/gcp-gar-gcs-vertex.md

[style] ~23-~23: Consider a shorter alternative to avoid wordiness.
Context: ...k. Note: In order to be able to install the ZenML stack succ...

(IN_ORDER_TO_PREMIUM)


[uncategorized] ~58-~58: Possible missing comma found.
Context: ...OKEN` value is not broken into multiple lines as this will lead to errors ! ## Deplo...

(AI_HYDRA_LEO_MISSING_COMMA)


[uncategorized] ~81-~81: Use a comma before ‘and’ if it connects two independent clauses (unless they are closely connected and short).
Context: ...ically registered with your ZenML server and you may now close the Cloud Shell sessi...

(COMMA_COMPOUND_SENTENCE)

docs/book/component-guide/image-builders/gcp.md

[style] ~25-~25: Consider a shorter alternative to avoid wordiness.
Context: ...er this stack component. {% endhint %} In order to use the ZenML Google Cloud image builde...

(IN_ORDER_TO_PREMIUM)

docs/book/how-to/stack-deployment/deploy-a-cloud-stack.md

[typographical] ~93-~93: The word “otherwise” is an adverb that can’t be used like a conjunction, and therefore needs to be separated from the sentence.
Context: ...rust repo` checkbox to proceed with the deployment, otherwise the Cloud Shell session will not be aut...

(THUS_SENTENCE)


[uncategorized] ~116-~116: Possible missing comma found.
Context: ... service accounts involved in the stack deployment and then deploys the stack using a GCP ...

(AI_HYDRA_LEO_MISSING_COMMA)


[style] ~136-~136: Consider a shorter alternative to avoid wordiness.
Context: ...g) {% endtab %} {% tab title="CLI" %} In order to create a remote stack over the CLI you ...

(IN_ORDER_TO_PREMIUM)


[uncategorized] ~137-~137: Possible missing comma found.
Context: ...order to create a remote stack over the CLI you can use the following command: ``...

(AI_HYDRA_LEO_MISSING_COMMA)


[typographical] ~177-~177: The word “otherwise” is an adverb that can’t be used like a conjunction, and therefore needs to be separated from the sentence.
Context: ...rust repo` checkbox to proceed with the deployment, otherwise the Cloud Shell session will not be aut...

(THUS_SENTENCE)


[uncategorized] ~200-~200: Possible missing comma found.
Context: ... service accounts involved in the stack deployment and then deploys the stack using a GCP ...

(AI_HYDRA_LEO_MISSING_COMMA)


[style] ~281-~281: Try using a synonym here to strengthen your writing.
Context: ... GCP Service Account access key used to give access to ZenML to connect to the above...

(GIVE_PROVIDE)

Markdownlint
infra/README.md

13-13: null
Bare URL used

(MD034, no-bare-urls)

infra/gcp/gcp-gar-gcs-vertex.md

48-48: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)

docs/book/how-to/stack-deployment/deploy-a-cloud-stack.md

154-154: null
Multiple headings with the same content

(MD024, no-duplicate-heading)


167-167: null
Multiple headings with the same content

(MD024, no-duplicate-heading)


289-289: Expected: dash; Actual: asterisk
Unordered list style

(MD004, ul-style)


290-290: Expected: dash; Actual: asterisk
Unordered list style

(MD004, ul-style)


291-291: Expected: dash; Actual: asterisk
Unordered list style

(MD004, ul-style)


292-292: Expected: dash; Actual: asterisk
Unordered list style

(MD004, ul-style)


293-293: Expected: dash; Actual: asterisk
Unordered list style

(MD004, ul-style)


294-294: Expected: dash; Actual: asterisk
Unordered list style

(MD004, ul-style)


295-295: Expected: dash; Actual: asterisk
Unordered list style

(MD004, ul-style)


296-296: Expected: dash; Actual: asterisk
Unordered list style

(MD004, ul-style)


297-297: Expected: dash; Actual: asterisk
Unordered list style

(MD004, ul-style)


298-298: Expected: dash; Actual: asterisk
Unordered list style

(MD004, ul-style)


273-273: null
Multiple headings with the same content

(MD024, no-duplicate-heading)


284-284: null
Multiple headings with the same content

(MD024, no-duplicate-heading)

Additional comments not posted (33)
src/zenml/stack_deployments/utils.py (2)

22-24: Import statement looks good.

The import statement for GCPZenMLCloudStackDeployment is correctly added.


29-29: Update to STACK_DEPLOYMENT_PROVIDERS dictionary looks good.

The dictionary entry for StackDeploymentProvider.GCP is correctly added.

infra/gcp/main.py (4)

33-34: Handle empty payload correctly.

The return statement for an empty payload is correct, returning a status code of 400.


37-40: Secret values expansion logic looks good.

The logic for expanding secret values in the payload is correctly implemented.


72-78: Success return statement looks good.

The return statement for successful stack registration is correct, returning a status code of 200.


91-94: Exception handling looks good.

The exception handling and return statement for a failure due to an exception are correct, returning a status code of 500.

infra/gcp/gcp-gar-gcs-vertex-deploy.sh (1)

116-116: Ensure the script is POSIX compliant.

The script toggles error handling with set +e and set -e. Both are standard POSIX shell options rather than Bash extensions, so check whether the rest of the script relies on Bash-only features or is meant to be POSIX compliant.

Is the script intended to be Bash-specific, or should it be POSIX compliant?

infra/gcp/gcp-gar-gcs-vertex.jinja (2)

1-5: Ensure uniqueness of random_int macro.

The random_int macro generates a random integer which might not be unique. Consider using a more robust method to ensure uniqueness.

Is uniqueness of the generated integer guaranteed? If not, consider using a more robust method.
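
To put a number on the concern, the birthday bound gives a rough estimate of how likely two deployments are to draw the same value. This is only a sketch: the size of the value space is an assumption, because the range used by the random_int macro is not shown in this review.

```python
import math


def collision_probability(num_values: int, space_size: int) -> float:
    """Approximate P(at least one collision) among `num_values` uniform draws.

    Uses the birthday-bound approximation 1 - exp(-n(n-1) / (2N)).
    `space_size` (N) is an assumed parameter, not taken from the template.
    """
    if num_values < 2:
        return 0.0
    exponent = -num_values * (num_values - 1) / (2.0 * space_size)
    return 1.0 - math.exp(exponent)


# Example with an assumed space of one million possible integers.
print(f"{collision_probability(100, 10**6):.4f}")
```

A random suffix therefore lowers the chance of a name clash but does not rule it out; a guarantee would need a value derived from something already unique.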


Line range hint 143-169:
Ensure secure handling of sensitive data.

Ensure that sensitive data like ZENML_SERVER_API_TOKEN is handled securely and not logged or exposed.

Is there any logging or exposure of ZENML_SERVER_API_TOKEN in the deployment process?

src/zenml/stack_deployments/aws_stack_deployment.py (3)

112-123: Ensure integration names are accurate.

The integration names should match the official ZenML documentation and integration names.

Verification successful

Integration names are correct.

The integration names "aws" and "s3" match the official ZenML documentation and integration names.

  • aws:

    • docs/book/component-guide/step-operators/sagemaker.md: zenml integration install aws
    • docs/book/component-guide/container-registries/aws.md: zenml integration install aws
    • docs/book/how-to/popular-integrations/aws-guide.md: zenml integration install aws s3 -y
  • s3:

    • docs/book/component-guide/artifact-stores/s3.md: zenml integration install s3 -y
    • docs/book/user-guide/production-guide/remote-storage.md: zenml integration install s3 -y
Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify accuracy of integration names.

# Test: Search for integration names in documentation.
rg --type md 'integration install'

Length of output: 14769


81-92: Ensure cost estimation accuracy.

The cost estimation should be regularly updated to reflect the latest pricing from AWS.


30-30: Ensure consistency in class variables.

The deployment class variable is specific to AWS. Ensure that similar variables are consistently defined across different stack deployment classes.

Verification successful

Consistency in class variables confirmed.

The deployment class variable is consistently defined across different stack deployment classes:

  • src/zenml/stack_deployments/stack_deployment.py
  • src/zenml/stack_deployments/gcp_stack_deployment.py
  • src/zenml/stack_deployments/aws_stack_deployment.py

No issues found.

Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify consistency of class variables in stack deployment classes.

# Test: Search for `deployment` class variable across stack deployment classes.
ast-grep --lang python --pattern $'class $_ {
  $$$
  deployment: ClassVar[$_]
  $$$
}'

Length of output: 562


Script:

#!/bin/bash
# Description: Verify consistency of class variables in stack deployment classes.

# Test: Search for `deployment` class variable across stack deployment classes.
ast-grep --lang python --pattern 'class $_ { deployment: ClassVar[$_] }'

Length of output: 534


Script:

#!/bin/bash
# Description: Verify consistency of class variables in stack deployment classes.

# Test: Search for `deployment` class variable across stack deployment classes.
ast-grep --lang python --pattern 'class $_ { $_: ClassVar[$_] }'

Length of output: 518


Script:

#!/bin/bash
# Description: Verify consistency of class variables in stack deployment classes.

# Test: Search for `deployment` class variable across stack deployment classes.
ast-grep --lang python --pattern 'deployment: ClassVar[$_]'

Length of output: 362

src/zenml/stack_deployments/gcp_stack_deployment.py (2)

30-30: Ensure consistency in class variables.

The deployment class variable is specific to GCP. Ensure that similar variables are consistently defined across different stack deployment classes.


89-94: Ensure the repository trust warning is clear.

The warning about the ZenML GitHub repository being untrusted should be clear and emphasize the importance of reviewing the repository contents.

docs/book/how-to/stack-deployment/deploy-a-cloud-stack.md (3)

48-49: Clarify Azure support status.

The warning about Azure support should clearly state that it's currently in development.


93-93: Separate "otherwise" for clarity.

The word "otherwise" should be separated from the sentence for clarity.

Tools
LanguageTool

[typographical] ~93-~93: The word “otherwise” is an adverb that can’t be used like a conjunction, and therefore needs to be separated from the sentence.
Context: ...rust repo` checkbox to proceed with the deployment, otherwise the Cloud Shell session will not be aut...

(THUS_SENTENCE)


200-200: Add missing comma.

A comma is missing before "and then deploys the stack."

Tools
LanguageTool

[uncategorized] ~200-~200: Possible missing comma found.
Context: ... service accounts involved in the stack deployment and then deploys the stack using a GCP ...

(AI_HYDRA_LEO_MISSING_COMMA)

infra/aws/aws-ecr-s3-sagemaker.yaml (4)

18-20: Approved: Improved parameter description.

The updated description for ResourceName now includes an example, which enhances clarity.


63-63: Approved: Improved bucket name uniqueness.

The BucketName property now includes the AWS account ID, ensuring uniqueness across different accounts.


327-329: Approved: Added image builder configuration.

The new configuration for image_builder with flavor set to local adds support for local image building.


399-401: Approved: Added image builder configuration to outputs.

The new configuration for image_builder in the outputs section ensures it is included in the stack outputs.

src/zenml/constants.py (2)

340-340: Approved: Added configuration path constant.

The new constant CONFIG with value "/config" adds a clear configuration path reference.


490-490: Approved: Added API token expiration constant.

The new constant STACK_DEPLOYMENT_API_TOKEN_EXPIRATION with value 60 * 6 adds a clear reference for the token expiration time.

src/zenml/models/__init__.py (2)

392-392: Approved: Added import for StackDeploymentConfig.

The new import of StackDeploymentConfig from the misc.stack_deployment module is necessary for handling stack deployment configurations.


730-730: Approved: Added new entities to public API.

The addition of StackDeploymentConfig and StackDeploymentInfo to the __all__ list ensures they are part of the module's public API.

src/zenml/cli/stack.py (5)

38-38: Import statement approved.

The import of Style from rich.style is necessary for the formatting changes in the deploy command.


1691-1695: Error handling for invalid location.

The error handling correctly validates the location against the provider's valid locations.
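
The kind of check described here can be sketched as follows. The function name, its parameters, and the error wording are hypothetical and are not taken from src/zenml/cli/stack.py.

```python
from typing import Iterable


def validate_location(
    provider: str, location: str, valid_locations: Iterable[str]
) -> str:
    """Return `location` if the provider supports it, else raise ValueError.

    Hypothetical helper for illustration only.
    """
    allowed = sorted(valid_locations)
    if location not in allowed:
        raise ValueError(
            f"Invalid location '{location}' for provider '{provider}'. "
            f"Valid locations are: {', '.join(allowed)}"
        )
    return location
```

Listing the accepted values in the error message lets the user correct the flag without looking the locations up elsewhere.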


1705-1727: Fetching and displaying deployment configuration.

The changes fetch the deployment configuration and display it to the user, enhancing the deployment process.


1737-1764: Opening deployment URL and waiting for completion.

The code opens the deployment URL in the browser and waits for the deployment to complete, providing a clear user interaction flow.
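
The open-then-wait flow can be sketched as a polling loop with a deadline. Everything named here (`wait_for_deployment`, the `poll` callable, the default timeout and interval) is an assumption for illustration and does not reproduce the CLI implementation.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def wait_for_deployment(
    poll: Callable[[], Optional[T]],
    timeout: float = 600.0,
    interval: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `poll` until it returns a non-None result or `timeout` elapses.

    `poll` stands in for whatever call reports the deployed stack;
    it is a hypothetical callable, not a ZenML API.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = poll()
        if result is not None:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError("Timed out waiting for the stack deployment.")
        sleep(interval)
```

In a CLI, a call such as `webbrowser.open(url)` from the standard library would typically precede the wait, and injecting `sleep` keeps the loop easy to test without real delays.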


1793-1814: Follow-up instructions for the user.

The follow-up instructions provide clear guidance to the user on next steps after deployment, including setting the stack as active and installing required integrations.

src/zenml/zen_stores/sql_zen_store.py (3)

1556-1567: LGTM! But verify the migration logic.

The changes for handling user enriched events and version checks look good.

However, ensure that the migration logic is correctly implemented and tested.

Verification successful

Verified: The migration logic and user enriched events handling are correctly implemented.

The _send_user_enriched_events_if_necessary function is defined and properly invoked, ensuring user enriched events for legacy users are sent as intended.

  • src/zenml/zen_stores/sql_zen_store.py: The function _send_user_enriched_events_if_necessary is defined and used correctly.
Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify the migration logic and user enriched events handling.

# Test: Search for migration logic and user enriched events handling. Expect: Correct implementation.
rg --type python -A 5 $'_send_user_enriched_events_if_necessary'

Length of output: 94


Script:

#!/bin/bash
# Description: Verify the migration logic and user enriched events handling.

# Test: Search for migration logic and user enriched events handling. Expect: Correct implementation.
rg --glob '*.py' -A 5 '_send_user_enriched_events_if_necessary'

Length of output: 968


7022-7035: LGTM! But verify the stack component creation logic.

The changes for creating stack components and handling service connectors look good.

However, ensure that the stack component creation logic is correctly implemented and tested.

Verification successful

The stack component creation logic is correctly implemented.

The create_stack_component method in src/zenml/zen_stores/sql_zen_store.py properly handles the creation of stack components, including validation and service connector integration.

  • src/zenml/zen_stores/sql_zen_store.py: create_stack_component method
Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify the stack component creation logic and service connector handling.

# Test: Search for stack component creation and service connector handling. Expect: Correct implementation.
rg --type python -A 5 $'create_stack_component'

Length of output: 77


Script:

#!/bin/bash
# Description: Verify the stack component creation logic and service connector handling.

# Test: Search for stack component creation and service connector handling. Expect: Correct implementation.
rg 'def create_stack_component' -A 5

Length of output: 2040


Script:

#!/bin/bash
# Description: Extract the implementation of the `create_stack_component` method in `src/zenml/zen_stores/sql_zen_store.py`.

# Extract the method implementation for detailed review.
rg 'def create_stack_component' -A 50 src/zenml/zen_stores/sql_zen_store.py

Length of output: 1892


7449-7465: LGTM! But verify the new function implementation.

The new function to return the cloud provider console URL and configuration looks good.

However, ensure that the new function is correctly implemented and tested.

Verification successful

The implementation of the get_stack_deployment_config function is correct but currently raises a NotImplementedError.

The function is properly defined with accurate docstrings and error handling, but it does not yet provide the intended functionality.

  • src/zenml/zen_stores/sql_zen_store.py: Lines around get_stack_deployment_config function.
Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify the new function implementation for stack deployment configuration.

# Test: Search for the new function implementation. Expect: Correct implementation.
rg --type python -A 5 $'get_stack_deployment_config'

Length of output: 82


Script:

#!/bin/bash
# Description: Verify the new function implementation for stack deployment configuration.

# Test: Search for the new function implementation. Expect: Correct implementation.
rg -A 5 'def get_stack_deployment_config'

Length of output: 1909


Script:

#!/bin/bash
# Description: Extract the full implementation of the get_stack_deployment_config function.

# Extract 20 lines before and after the function definition for complete context.
rg -A 20 -B 20 'def get_stack_deployment_config' src/zenml/zen_stores/sql_zen_store.py

Length of output: 1449

src/zenml/zen_stores/rest_zen_store.py (resolved)
infra/gcp/gcp-gar-gcs-vertex-deploy.sh (resolved)
infra/gcp/main.py (resolved)
Comment on lines +233 to +241
params = dict(
    cloudshell_git_repo="https://github.com/zenml-io/zenml",
    cloudshell_workspace="infra/gcp",
    cloudshell_open_in_editor="gcp-gar-gcs-vertex.jinja,gcp-gar-gcs-vertex-deploy.sh",
    cloudshell_tutorial="gcp-gar-gcs-vertex.md",
    ephemeral="true",
    # TODO: remove this before the branch is merged
    cloudshell_git_branch="feature/prd-482-gcp-stack-deployment",
)

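Remove the TODO and the branch override before merging.

The TODO comment marks the cloudshell_git_branch parameter as temporary: it pins Cloud Shell to the feature branch and should be removed before this code is merged.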

- # TODO: remove this before the branch is merged
- cloudshell_git_branch="feature/prd-482-gcp-stack-deployment",
Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
params = dict(
    cloudshell_git_repo="https://github.com/zenml-io/zenml",
    cloudshell_workspace="infra/gcp",
    cloudshell_open_in_editor="gcp-gar-gcs-vertex.jinja,gcp-gar-gcs-vertex-deploy.sh",
    cloudshell_tutorial="gcp-gar-gcs-vertex.md",
    ephemeral="true",
-   # TODO: remove this before the branch is merged
-   cloudshell_git_branch="feature/prd-482-gcp-stack-deployment",
)
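
For context, parameters like these are typically URL-encoded into an "Open in Cloud Shell" link. The sketch below is an illustration only: the base URL and the function name `build_cloud_shell_url` are assumptions, since the code that consumes `params` is not shown in this comment.

```python
from typing import Mapping
from urllib.parse import urlencode

# Assumed base URL for the Cloud Shell editor; not shown in the reviewed snippet.
CLOUD_SHELL_BASE_URL = "https://shell.cloud.google.com/cloudshell/editor"


def build_cloud_shell_url(params: Mapping[str, str]) -> str:
    """Return a Cloud Shell URL with `params` encoded as the query string."""
    return f"{CLOUD_SHELL_BASE_URL}?{urlencode(params)}"


params = dict(
    cloudshell_git_repo="https://github.com/zenml-io/zenml",
    cloudshell_workspace="infra/gcp",
    cloudshell_tutorial="gcp-gar-gcs-vertex.md",
    ephemeral="true",
)
print(build_cloud_shell_url(params))
```

Because every key in `params` ends up in the link, leaving a development-only entry such as cloudshell_git_branch in place would change which branch users check out.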

docs/book/how-to/stack-deployment/deploy-a-cloud-stack.md (outdated, resolved)
docs/book/how-to/stack-deployment/deploy-a-cloud-stack.md (outdated, resolved)
@avishniakov
Contributor

The CI has detected some issues, such as docstring problems. Please make sure to sort them out before merging.

Co-authored-by: Andrei Vishniakov <[email protected]>
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
@stefannica stefannica merged commit f94e743 into develop Jul 11, 2024
65 of 73 checks passed
@stefannica stefannica deleted the feature/prd-482-gcp-stack-deployment branch July 11, 2024 09:18
Labels
enhancement (New feature or request), internal (To filter out internal PRs and issues), run-slow-ci
2 participants