VAMS v2.1 Release #201

Merged
merged 44 commits into from
Nov 15, 2024


scheurik
Collaborator

[2.1.0] (2024-11-15)

This minor version includes changes to VAMS pipelines, use-case pipeline implementations, and v2.0 bug fixes.

Recommended Upgrade Path: A/B Stack Deployment with data migration using staging bucket configuration and upgrade migration scripts for DynamoDB tables in ./infra/upgradeMigrationScripts

⚠ BREAKING CHANGES

  • Pipelines now support a new meaning for pipelineType, and the old pipelineType has been renamed to pipelineExecutionType.
  • Execution workflow input parameter names to pipelines have also changed, which can break existing workflows/pipelines.

Due to DynamoDB table structure changes, an A/B Stack deployment with the migration script is recommended if there are existing pipelines that need to be automatically brought over.
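
As a rough illustration of the rename described above, the sketch below rewrites a single pipeline record in memory. It is not the migration script shipped in ./infra/upgradeMigrationScripts. The record shape, the example value `Lambda`, the `migrate_pipeline_record` helper, and the choice of `StandardFile` as the default for migrated records are all assumptions made for illustration.

```python
from copy import deepcopy

# Assumption: migrated records default to the StandardFile pipeline type.
DEFAULT_PIPELINE_TYPE = "StandardFile"


def migrate_pipeline_record(record: dict) -> dict:
    """Return a copy of a v2.0-style pipeline record using the v2.1 field names."""
    migrated = deepcopy(record)
    # Records that already carry pipelineExecutionType are left untouched,
    # so running the function twice is harmless.
    if "pipelineExecutionType" in migrated:
        return migrated
    # The old pipelineType value described how the pipeline is executed.
    if "pipelineType" in migrated:
        migrated["pipelineExecutionType"] = migrated.pop("pipelineType")
    # pipelineType now carries the new meaning (StandardFile or PreviewFile).
    migrated["pipelineType"] = DEFAULT_PIPELINE_TYPE
    return migrated


# Hypothetical v2.0-style record.
old_record = {"pipelineId": "demo-pipeline", "pipelineType": "Lambda"}
new_record = migrate_pipeline_record(old_record)
```

Working on a copy keeps the original record available for comparison, which can be useful when checking a migration against a staging table before switching stacks.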

Features

  • Re-worked infrastructure CDK components and project directory structure to split out use-case pipelines (i.e., PotreeViewer/Visualizer Pipelines) from the rest of the lambda backend and stack infrastructures. This will allow for future upgrades that will split these components completely out into their own open-source project.
  • PotreeViewerPipeline (previously VisualizerPipeline) is now baselined to the new standard use-case pipeline pattern to support external state machine callbacks (i.e., from VAMS pipeline workflows)
    • PreviewPotreeViewerPipeline (previously VisualizerPipeline) can now be registered and called from VAMS pipeline workflows (suggested to be called from a preview type pipeline) via the 'vamsExecutePreviewPcPotreeViewerPipeline' lambda function.
  • Added a new use-case pipeline and configuration option for GenAiMetadata3dLabelingPipeline that can take in OBJ, FBX, GLB, USD, STL, PLY, DAE, and ABC files from an asset and use generative AI to analyze 2D renders of the file and determine what keywords, tags, or other metadata the file should be associated with. The pipeline can be called by registering the 'vamsExecuteGenAiMetadata3dLabelingPipeline' lambda function with VAMS pipelines / workflows.
  • Added a new use-case pipeline and configuration option for Conversion3dBasic that can convert between STL, OBJ, PLY, GLTF, GLB, 3MF, XAML, 3DXML, DAE, and XYZ file types. VAMS pipeline registration outputType will define for each pipeline registration what the output file extension type will be.
    • This pipeline for non-GovCloud deployments is enabled by default in the infrastructure configuration.
  • Web Added pipelineExecutionType to VAMS pipelines (previously pipelineType) and added a new context to pipelineType. Current pipeline types are StandardFile and PreviewFile. These are implemented to support future roadmap implementations of different pipeline types and auto-execution options on asset file uploads.
  • Web Added inputParameters to pipelines to allow the optional specification of a JSON object which can be used within a pipeline execution to set pipeline configuration options. This is set at the time of creating a VAMS pipeline.
  • Added inputMetadata to pipeline inputs which automatically pulls in asset name, description, tags, and all metadata fields of the asset to a pipeline execution. This can also be used in the future to pull through user-defined inputMetadata at the time of an execution with additional UI/UX.
  • Changed inputPath and outputPath of pipeline function execution inputs to inputS3AssetFilePath and outputS3AssetFilesPath
  • Added outputS3AssetPreviewPath, outputS3AssetMetadataPath, and inputOutputS3AssetAuxiliaryFilesPath pipeline execution parameter inputs to support different location paths for asset data outputs and writing to asset auxiliary temporary path locations
  • Added outputType for user-specified expected file extension output for pipelines based on the VAMS pipeline registration. OutputType is not enforced and is something pipelines need to work into their own business logic as appropriate.
    • All asset write-back locations are now temporary job execution specific to allow for better security, file checks, proper back-versioning into an asset, and to start abstracting pipelines from writing directly to assets. Once the UploadV2 process is completed in a future update, direct access by use-case pipelines to S3 asset buckets will be removed in favor of API uploads / presigned URLs for storage abstraction.
  • Updated processWorkflowExecutionOutput lambda function (previously uploadAllAssets) to also account for metadata data object outputs of pipelines to update against assets. Preview image output logic is stubbed out but will not be fully implemented until the new upload / storage process overhaul is completed in a future version.
  • Added credTokenTimeoutSeconds authProvider config on the infrastructure side to allow manual specification of access, ID, and pre-signed URL tokenExpiration. Extending this can fix upload timeouts for larger files or slower connections. Auth refresh token timeouts are currently fixed to 24 hours.
    • Implements a new approach for s3ScopedAccess for upload that allows tokens up to 12 hours using AssumeRoleWithWebIdentity.
  • Web Added PointCloud viewer and pipeline support for .ply file formats, moved from the 3D Mesh 3D Online Viewer
  • Web The asset file viewer now says (primary) next to the asset's main/primary associated file. The primary file is what currently gets used for pipeline ingestion when launching a workflow.
  • Changed access logs S3 bucket lifecycle policy to only remove logs after 90 days
  • Added lifecycle policies on the asset and asset auxiliary buckets to remove incomplete upload parts after 14 days
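
To make the renamed and added execution inputs above more concrete, here is a minimal sketch of how a use-case pipeline function might read them. The field names come from the notes above. The flat event layout, the `PipelineInputs` class, the `parse_pipeline_event` and `planned_output_key` helpers, and the sample values are hypothetical, and the real payload may nest or encode these fields differently. The sketch also assumes the paths are bucket-relative keys rather than full `s3://` URIs.

```python
import json
from dataclasses import dataclass, field
from pathlib import PurePosixPath


@dataclass
class PipelineInputs:
    input_file: str
    output_files_path: str
    output_type: str = ""
    input_parameters: dict = field(default_factory=dict)
    input_metadata: dict = field(default_factory=dict)


def _as_dict(value) -> dict:
    """Accept a dict or a JSON string (assumption: either form may arrive)."""
    if not value:
        return {}
    if isinstance(value, str):
        return json.loads(value)
    return dict(value)


def parse_pipeline_event(event: dict) -> PipelineInputs:
    """Read the v2.1 execution input names from a (hypothetical) flat event."""
    return PipelineInputs(
        input_file=event["inputS3AssetFilePath"],
        output_files_path=event["outputS3AssetFilesPath"],
        output_type=event.get("outputType", ""),
        input_parameters=_as_dict(event.get("inputParameters")),
        input_metadata=_as_dict(event.get("inputMetadata")),
    )


def planned_output_key(inputs: PipelineInputs) -> str:
    """Build an output key under the job-specific output path.

    outputType is not enforced by VAMS, so the pipeline applies it itself.
    If no outputType is given, the input file's extension is reused.
    """
    source = PurePosixPath(inputs.input_file)
    extension = inputs.output_type or source.suffix
    if extension and not extension.startswith("."):
        extension = "." + extension
    return str(PurePosixPath(inputs.output_files_path) / f"{source.stem}{extension}")


# Hypothetical sample event.
event = {
    "inputS3AssetFilePath": "assets/demo/my model.obj",
    "outputS3AssetFilesPath": "pipelines/job-123/output/",
    "outputType": ".glb",
    "inputParameters": '{"quality": "high"}',
}
inputs = parse_pipeline_event(event)
output_key = planned_output_key(inputs)
```

Keeping the parsing in one place means a pipeline only has to change a single function if the execution input names change again in a later release.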

Bug Fixes

  • Fixed CreateWorkflow error seen in v2.0 (Mac/Linux builds) by updating library dependencies and standardizing the Docker platform across the board to linux/amd64
  • Re-worked PreviewPotreeViewerPipeline (previously VisualizerPipeline) state machine and associated functions to properly handle errors
  • Fixed benign logger errors in OpenSearch indexing lambda function (streams)
  • Fixed existing functionality with processWorkflowExecutionOutput (previously uploadAllAssets) not working
  • Fixed pipeline execution to properly account for asset file primary key names that contain spaces. Previously, this could cause pipelines to error on execution.
  • Web The asset file viewer now appropriately shows multiple files that are uploaded to the asset
  • Web Hid the View %AssetName% Metadata button for top-level root folder on asset details page file manager that led to a blank page. The metadata for this is already on the asset details page.
  • Fixed GovCloud deployments where v2 Lambda PreTokenGen for Cognito are not supported, reverted to v1 lambdas that only support Access Tokens (instead of both ID and Access token use for VAMS authorizers)
  • Fixed GovCloud deployments erroneously including a GeoServices reference that is not supported in the GovCloud partition
  • Fixed KMS key IAM policy principals (for non-externally imported key setting) to include OpenSearch when using OpenSearch deployment configurations
  • Added logic to look at other claims data if "vams:*" claims are not in the original JWT token. This is in preparation for external IDP support and some edge case setups customers have.
  • Fixed CDK deployment bug not deploying the required VPC endpoints during particular configurations of OpenSearch Provisioned, not using all Lambdas behind VPCs, and using the option to use VPC endpoints
  • Web Fixed bug where adding asset links had swapped the child/parent asset (WebUI only bug, API direct calls were not affected)
  • Fixed CDK deployment bug of encrypting the WebAppLogsBucket when deploying with ALB and KMS encryption. The WebAppLogsBucket cannot be KMS encrypted when used for ALB logging output.
  • Fixed bug for exceeding PolicyLimitSize of STS temporary role calls in S3ScopedAccess used during asset upload from the Web UI when KMS encryption is enabled.
  • Increased CustomResource lambda timeouts for OpenSearch schema deployment that caused issues intermittently during GovCloud deployments
  • Fixed bug in constraint service API where constraints were saved properly on POST/PUT, but an error while generating the 200 response resulted in a 500 error
  • Fixed bug in OpenSearch indexing (bad logging method) during certain edge cases that prevented adding new data to the index
  • Fixed bug in CDK storageResource helper function where S3 buckets were not getting the proper resource policies applied
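
The notes do not describe how the fix for file key names with spaces was implemented. As a general illustration only, keys with spaces tend to break when they cross a URL-encoded boundary (S3 event notifications, for example, deliver object keys URL-encoded, with `+` standing for a space). A sketch of the defensive decode and encode steps might look like this; both helper names and the sample keys are hypothetical.

```python
from urllib.parse import quote, unquote_plus


def decode_event_key(encoded_key: str) -> str:
    """Decode a key that arrived URL-encoded, with '+' standing for a space."""
    return unquote_plus(encoded_key)


def encode_key_for_url(key: str) -> str:
    """Percent-encode a key for use in a URL path, keeping '/' separators."""
    return quote(key, safe="/")


# Hypothetical keys containing spaces.
decoded = decode_event_key("assets/demo/my+model+v2.obj")
encoded = encode_key_for_url("assets/demo/my model v2.obj")
```

Decoding once at the point of entry and encoding once at the point of exit avoids the double-encoding problems that arise when both steps are scattered through the code.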

Chores

  • VisualizerPipeline now re-named to PreviewPotreeViewerPipeline as the previous name was too generic and other "visualizer" or viewer pipelines may exist later
  • 'visualizerAssets' S3 bucket renamed to 'assetAuxiliary'. This bucket will now be used for all pipeline or otherwise auto-generated files (previews/thumbnails) associated with assets that should not be versioned
  • 'visualizerAssets/{proxy+}' API route and related function re-named to 'auxililaryPreviewAssets/stream/{proxy+}'. This function is used for retrieving auto-generated preview files that should be rapidly streamed such as the PreviewPotreeViewerPipeline files.
  • Renamed and moved uploadAllAssets lambda function handler. It is now processWorkflowExecutionOutput and moved to the workflows backend folder
  • Updated Workflow ListExecutions to write stopDate, startDate, and executionStatus back to DynamoDB table after an SFN fetch where the execution has stopped. This is done for performance / caching reasons.
  • Workflow executions are now limited to only 1 active running execution per workflow per asset. This helps prevent workflows from clobbering each other and prevents other errors and race conditions
  • Updated a pipeline's default taskTimeout to 24 hours and taskHeartBeat to 1 hour unless otherwise specified. Previously, it defaulted to the service default which was up to a year. This helps prevent runaway asynchronous processes that never properly return and closeout workflow executions.
  • Added some external sfn token heartbeats into the new and existing use-case pipeline implementations at the end of a container run. These heartbeat locations can still be improved, but it is expected that these pipelines take longer to run.
  • Workflow executions now pass the originating execution caller's username and request context, which can be used for lambda cross-call logic
  • Created an additional Casbin API check abstraction function which can be used to consolidate API permission check logic and simplify lambda handlers. Applied to all existing API-gateway accessible lambda handlers
  • Added CDK Stack output to display all VAMS Pipeline Lambda function names for all activated use-case pipelines that can be registered within the VAMS.
  • Added error for all use-case pipeline lambdas if executed with the wrong task_token / call-back setup (synch vs asynch) in VAMS
  • Added draft lambda functions for the expected uploadV2 feature. The draft functions are not yet integrated into CDK for deployment.
  • Added security.txt file to website for AWS security reporting information.
  • Updated documentation on security, legal, and use notices.
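
The timeout and callback behavior described in the chores above can be sketched as a helper that builds an Amazon States Language Task state. This is an assumption-laden illustration, not the VAMS implementation: the `build_pipeline_task_state` helper, the `body.$` payload key, and the decision to attach the heartbeat only to callback (task token) tasks are hypothetical. The `TimeoutSeconds` and `HeartbeatSeconds` fields, the `.waitForTaskToken` resource suffix, and the `$$.Task.Token` context path are standard Step Functions features.

```python
DEFAULT_TASK_TIMEOUT_SECONDS = 24 * 60 * 60  # 24 hours, per the new default
DEFAULT_TASK_HEARTBEAT_SECONDS = 60 * 60  # 1 hour, per the new default


def build_pipeline_task_state(function_name, wait_for_callback,
                              task_timeout=None, task_heartbeat=None):
    """Build a Task state dict for a pipeline lambda (hypothetical helper)."""
    timeout = task_timeout or DEFAULT_TASK_TIMEOUT_SECONDS
    heartbeat = task_heartbeat or DEFAULT_TASK_HEARTBEAT_SECONDS

    # Pass the whole state input through under a hypothetical "body" key.
    payload = {"body.$": "$"}
    resource = "arn:aws:states:::lambda:invoke"
    state = {"Type": "Task", "TimeoutSeconds": timeout, "End": True}

    if wait_for_callback:
        # Asynchronous pattern: the pipeline must report back with the token.
        if heartbeat >= timeout:
            raise ValueError("heartbeat must be shorter than the task timeout")
        resource += ".waitForTaskToken"
        payload["TaskToken.$"] = "$$.Task.Token"
        state["HeartbeatSeconds"] = heartbeat

    state["Resource"] = resource
    state["Parameters"] = {"FunctionName": function_name, "Payload": payload}
    return state


callback_state = build_pipeline_task_state(
    "vamsExecuteGenAiMetadata3dLabelingPipeline", wait_for_callback=True
)
sync_state = build_pipeline_task_state(
    "vamsExecuteGenAiMetadata3dLabelingPipeline", wait_for_callback=False,
    task_timeout=600,
)
```

Bounding every task with an explicit timeout is what keeps a pipeline that never calls back from holding a workflow execution open indefinitely.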

Known Outstanding Issues

  • Using s3ScopedAccess for upload (currently in use by the VAMS WebUI) can cause synchronization issues due to race conditions between uploading files and calling the asset upload APIs. Additionally, handling very large file uploads and downloads (1 TB+) can cause issues. Expect a future rewrite that uses solely pre-signed storage URLs for upload, with a 3- or 4-step guided API call process, to resolve this issue. This is similar to the ingestAsset API, which is used to test the core of this new method.

scheurik and others added 30 commits June 14, 2024 18:07
… classic auth flow. Added ability to set timeout in config. Known issue: webpage will refresh after an hour during upload, but upload will still occur if the user stays logged into the browser page
…imeout, added comments to change in auth flow
Extended auth token to a day using assume_role_with_web_identity with classic auth flow...

See merge request aws-spatial-prototyping/vams-govcloud!5
…ion, Update default timeout to 3600 seconds, Update Documentation and Diagrams, Fix the docker platform versions to amd64, Fixed/Updated default lambda artefacts zip with latest function code, Fixed pipeline upload logic for files, Lint/Prettier Fix
…gic was missing for checking/moving pipeline uploaded assets to their primary assets location
…peline, Feature to show pipeline use-case lambda function names as CDK deployment outputs for easier registration in VAMS, Added file outputType to pipeline step function input data, Update code to display all files at an asset location and show the primary asset file with a "(Primary)" tag, Fixed bug with "View Asset Metadata" button incorrectly showing on root asset folder
…rsion pipeline, rename GenAi3dMetadataExtraction to GenAi3dMetadataLabeling, update file headings for release, update documentation and diagrams to match
Collaborator

@cottingr cottingr left a comment

Approved to merge

@cottingr cottingr merged commit 15a57a4 into main Nov 15, 2024
1 check passed
2 participants