Tips

A collection of tips for scaling jobs, generalizing jobs for flexibility, and developing portable ML training jobs. Think of this as DevOps for ML training jobs. The tips show how to run multiple tasks in parallel within your code, pass parameters to jobs from the command line and input files, package training code, build custom containers with training code, and deploy training code on Vertex AI Training to take advantage of scalable managed infrastructure at the job level.

Using This Repository

  • Each notebook that defines the parameter BUCKET = PROJECT_ID can be customized:
    • change it to BUCKET = PROJECT_ID + 'suffix' if you already have a GCS bucket with the same name as the project.
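
A minimal sketch of the customization described above. The project ID, the suffix value, and the helper `bucket_name` are hypothetical and used only for illustration; the notebooks themselves simply assign `BUCKET` directly.

```python
# Hypothetical values for illustration; replace with your own project ID.
PROJECT_ID = "my-project"
SUFFIX = "-tips"  # assumption: any suffix that gives a bucket name you want to use


def bucket_name(project_id: str, suffix: str = "") -> str:
    """Build a bucket name from the project ID and an optional suffix."""
    return project_id + suffix


# Default used by the notebooks: bucket named after the project.
BUCKET = bucket_name(PROJECT_ID)

# Customized: append a suffix to point the notebooks at a different bucket.
BUCKET = bucket_name(PROJECT_ID, SUFFIX)
print(BUCKET)
```

Keeping the suffix in one variable makes it easy to apply the same change across several notebooks.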

Notes

  • aiplatform Python Client
    • All about the Vertex AI Python Client: versions (aiplatform_v1 and aiplatform_v1beta1) and layers (aiplatform and aiplatform.gapic). Includes deeper details and examples of using each.

Python: Notebooks on Skills For ML Training Jobs and Tasks

  • Python Multiprocessing
    • tips for executing multiple tasks at the same time
  • Python Job Parameters
    • tips for passing values to programs from the command line (argparse, docopt, click) or with files (JSON, YAML, pickle)
  • Python Client for GCS
    • tips for interacting with GCS from Python and Vertex AI
  • Python Packages
    • prepare ML training code as files (modules), folders, packages, and distributions (source distribution and built distribution), and store them in custom repositories with Artifact Registry
  • Python Custom Containers
    • tips for building derivative containers with Cloud Build and Artifact Registry
  • Python Training
    • move training code out of a notebook and into Vertex AI Training Custom Jobs
    • demonstrates workflows that directly use the code formats created in Python Packages and the custom containers created in Python Custom Containers
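
To make the first two skills in the list above concrete, here is a small sketch that combines them: it reads job parameters from the command line with `argparse` and runs tasks in parallel with `multiprocessing.Pool`. The function names, the `--workers` and `--n` flags, and the squaring task are assumptions made for illustration and are not taken from the notebooks.

```python
import argparse
from multiprocessing import Pool


def square(x: int) -> int:
    """Stand-in for one unit of work (hypothetical task)."""
    return x * x


def parse_args(argv=None) -> argparse.Namespace:
    """Parse job parameters; argv=None means read from sys.argv."""
    parser = argparse.ArgumentParser(description="Run tasks in parallel.")
    parser.add_argument("--workers", type=int, default=2,
                        help="number of worker processes")
    parser.add_argument("--n", type=int, default=5,
                        help="number of tasks to run")
    return parser.parse_args(argv)


def run(argv=None) -> list:
    """Run `square` over range(n) using a pool of worker processes."""
    args = parse_args(argv)
    with Pool(processes=args.workers) as pool:
        # map preserves input order in the returned list
        return pool.map(square, range(args.n))


if __name__ == "__main__":
    # The guard keeps worker processes from re-running the job on import.
    print(run())
```

Saved as a script (for example a hypothetical `tasks.py`), this could be run as `python tasks.py --workers 4 --n 8`. Passing parameters this way keeps the same code usable both locally and as the entry point of a training job.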

BigQuery: Notebooks on BigQuery Topics

Additional Tips

Notebooks on Skills For BigQuery

  • New series will go here (see todo)

ToDo:

  • split this folder into subfolders
  • Python, BigQuery, KFP, ...
    • KFP Layers: components, tasks, artifacts, pipelines, IO
    • BQ Layers: project, dataset, table, rows, columns, cells + access, operations, ...
  • [IP] BigQuery Tips:
    • BigQuery - Python Clients
    • BigQuery - R
    • BigQuery - Data Types
    • BigQuery - Tables
    • BigQuery - UDF
    • BigQuery - Remote Functions
  • Add Git workflow tip - how to clone with PAT
  • [DEV] add KFP tip, include component authoring