- Python 3.10
- Poetry (for managing Python dependencies in the VAMS backend)
- Docker
- Node >=18.7
- Yarn >=1.22.19
- Node Version Manager (nvm)
- AWS CLI
- AWS CDK CLI
- Programmatic access to an AWS account at the minimum access levels outlined above.
The VAMS codebase changes frequently, and we recommend that you check out the stable released version from GitHub.
You can identify stable releases by their tag. Fetch the tags with `git fetch --all --tags`, then run `git checkout tags/TAG` or `git checkout -b TAG tags/TAG`, where TAG is the desired tag. A list of tags can be found by running `git tag --list` or on the releases page.
- `cd ./web && nvm use` - make sure your Node version matches the project. Make sure the Docker daemon is running.
- `yarn install` - install the packages required by the web app.
- `npm run build` - build the web app.
- `cd ../infra && npm install` - installs the dependencies defined in `package.json`.
- If you haven't already bootstrapped your AWS account with CDK, run `cdk bootstrap aws://101010101010/us-east-1` - replace with your account and region. If you are bootstrapping a GovCloud account, run `export AWS_REGION=[gov-cloud-region]` first, as the AWS SDK needs to be informed to use GovCloud endpoints.
- Modify the `config.json` in `/infra/config` to set the VAMS deployment parameters and features you would like to deploy. The recommended minimum fields to update are `region`, `adminEmailAddress`, and `baseStackName` when using the default provided template. More information about the configuration options can be found in the Configuration Options section below.
- (Optional) Override the CDK stack name and region for deployment with environment variables: `export AWS_REGION=us-east-1 && export STACK_NAME=dev` - replace with the region you would like to deploy to and the name you want to associate with the CloudFormation stack that the CDK will deploy.
- (FIPS Use Only) If deploying with FIPS, enable the FIPS environment variable for the AWS CLI with `export AWS_USE_FIPS_ENDPOINT=true` and enable `app.useFips` in the `config.json` configuration file in `/infra/config`.
- (External VPC Import Only) If importing an external VPC with subnets in the `config.json` configuration, run `cdk deploy --all --require-approval never --context loadContextIgnoreVPCStacks=true` to import the VPC ID/subnets context and deploy all non-VPC-dependent stacks first. Failing to run this with the `loadContextIgnoreVPCStacks` context or configuration setting will cause the final deployment of all stacks to fail.
- `npm run deploy.dev` - An account is created in an AWS Cognito User Pool using the email address specified in the infrastructure config file. Expect an email from <[email protected]> with a temporary password. Ensure that Docker is running before deploying, as a container will need to be built.
- Navigate to the URL provided in `[stackName].WebAppCloudFrontDistributionDomainNameOutput` (CloudFront) or `[stackName].WebsiteEndpointURLOutput` (ALB) from the `cdk deploy` output.
- Check your email for the temporary account password to log in with the email address you provided.
You can change the region and deploy a new instance of VAMS by setting the environment variables to new values (`export AWS_REGION=us-east-1 && export STACK_NAME=dev`) and then running `npm run deploy.dev` again.
To deploy customizations or updates to VAMS, you can update the stack by running `cdk deploy --all`. A changeset is created and deployed to your stack.
Please note, depending on what changes are in flight, VAMS may not be available to users in part or in whole during the deployment. Please read the change log carefully and test changes before exposing your users to new versions.
Deployment data migration documentation and scripts between major VAMS version deployments are located in `/infra/deploymentDataMigration`.
SAML authentication enables you to provision access to your VAMS instance using your organization's federated identity provider such as Auth0, Active Directory, or Google Workspace.
In the configuration file `/infra/config/config.json`, set `AuthProvider.UseCognito.UseSaml` to `true` to enable SAML or `false` to disable it.
You need your SAML metadata URL, and then you can fill out the required information in `infra/config/saml-config.ts`.
The required information is as follows:
- `name` identifies the name of your identity provider.
- `cognitoDomainPrefix` is a DNS-compatible, globally unique string used as a subdomain of Cognito's sign-on URL.
- `metadataContent` is a URL of your SAML metadata. This can also point to a local file if `metadataType` is changed to `cognito.UserPoolIdentityProviderSamlMetadataType.FILE`.
Then you can deploy the infra stack by running `cdk deploy --all` if you have already deployed, or by using the same build and deploy steps as above.
The following stack outputs are required by your identity provider to establish trust with your instance of VAMS:
- SAML IdP Response URL
- SP urn / Audience URI / SP entity ID
- CloudFrontDistributionUrl for the list of callback URLs. Include this URL with and without a single trailing slash (e.g., https://example.com and https://example.com/)
The VAMS API requires a valid authorization token that is validated on each call against the configured authentication system (e.g., Cognito).
All API calls require that the claims below be included as part of that JWT token. This is done via the `pretokengen` lambda that is triggered on token generation in Cognito. If implementing a different (OAuth) authentication system, developers must ensure these claims are included in their JWT token.
The critical component right now is that the authenticated VAMS username be included in the `tokens` array. Roles and externalAttributes are optional right now, as they are looked up at runtime.
```
{
    "claims": {
        "tokens": [<username>],
        "roles": [<roles>],
        "externalAttributes": []
    }
}
```
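For illustration only (this is not the shipped VAMS `pretokengen` implementation, which may differ), a Cognito V1 pre token generation trigger that injects these claims could look roughly like the sketch below. Because claim override values must be strings in the V1 trigger, the arrays are JSON-encoded here, which is an assumption about how the claims are serialized:

```python
import json

def lambda_handler(event, context):
    # Cognito V1 pre token generation trigger (assumed event shape).
    username = event["userName"]

    # Claim override values must be strings, so array claims are JSON-encoded here (assumption).
    event["response"]["claimsOverrideDetails"] = {
        "claimsToAddOrOverride": {
            "tokens": json.dumps([username]),
            "roles": json.dumps([]),               # optional; looked up at runtime
            "externalAttributes": json.dumps([]),  # optional; looked up at runtime
        }
    }
    return event
```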
NOTE: GovCloud deployments (when the GovCloud configuration setting is true) only support v1 of the Cognito lambdas. This means that ONLY Access tokens produced by Cognito can be used with VAMS API calls for authentication/authorization. For non-GovCloud deployments, both ID and Access tokens can be used.
If you need to add custom settings to your local Docker builds, such as adding custom SSL CA certificates to get through HTTPS proxies, modify the following Docker build files:
- `/infra/config/docker/Dockerfile-customDependencyBuildConfig` - Docker file for all local packaging environments such as Lambda Layers and/or Custom Resources. Add extra lines to the end of the file.
- `/backendPipelines/...` - Docker files for use-case pipeline containers. Add extra lines above any package installs or downloads.
When specifying Docker image pulls, all Docker files should be defined/fixed to the `linux/amd64` platform. This alleviates issues with deploying across Windows, Mac, and Linux host OS platforms.
If you need to deploy the VAMS CDK using custom SSL certificates due to internal organization HTTPS proxy requirements, follow the instructions below.
- Download to your host machine the .pem certificate that is valid for your HTTPS proxy to a specific path.
- Set the following environment variables to the file path from step 1: `$AWS_CA_BUNDLE` and `$NODE_EXTRA_CA_CERTS`.
- Modify the Docker build files specified in the section above and add the following lines (for Python pip installs). Update `/local/OShost/path/Combined.pem` to the local host path relative to the Dockerfile location.
```
COPY /local/OShost/path/Combined.pem /var/task/Combined.crt
RUN pip config set global.cert /var/task/Combined.crt
```
- You may need to add additional environment variables to allow the certificate to be used for `apk install` or `apt-get` system actions.
The web front end runs on NodeJS React with a supporting library of the amplify-js SDK. The React web page is set up as a single-page app using React routes with a hash (#) router.
Infrastructure Note (Hash Router): The hash router was chosen in order to support both CloudFront and Application Load Balancer (ALB) deployment options. As of today, ALBs do not support URL re-writing (without an EC2 reverse proxy), something needed to support normal (non-hash) web routing in React. This route was chosen to ensure that the static web page serving is an AWS serverless process, at the expense of SEO degradation, something generally not critical in internal enterprise deployments.
(Important!) Development Note (Hash Router): When using `<Link>`, ensure that the route paths have a `#` in front of them, as `Link` uses the Cloudscape library, which doesn't tie into the React router. When using `<Navigate>`, which is part of the React Router library and thus goes through the route manager, exclude the `#` from the beginning of the route path. Not following this will cause links to either return additional appended hash routes in the path or not use hashes at all.
When loading the page, the front end receives a configuration from the AWS backend that includes the Amplify storage bucket, API Gateway/CloudFront endpoints, authentication endpoints, and enabled features. Some of these are retrieved on load pre-authentication while others are received post-authentication. The enabled features value is a comma-delimited list of infrastructure features that were enabled/disabled on CDK deployment through the `config.json` file; it toggles which front-end features are shown.
To process an asset through VAMS using an external system, or when a job can take longer than the Lambda timeout of 15 minutes, it is recommended that you use the Wait for a Callback with the Task Token feature so that the Pipeline Lambda can initiate your job and then exit instead of waiting for the work to complete before it also finishes. This reduces your Lambda costs and helps you avoid jobs that fail simply because they take longer than the timeout to complete.
To use Wait for a Callback with the Task Token, enable the option to use Task Tokens on the create pipeline screen. When using this option, you must explicitly make a callback to the Step Functions API with the Task Token in the event passed to your Lambda function. The Task Token is provided in the event with the key `TaskToken`. You can see this using the Step Functions execution viewer under the Input tab for an execution with the callback enabled. Pass the `TaskToken` to the system that can notify the Step Functions API that the work is complete with the `SendTaskSuccess` message.
`SendTaskSuccess` is sent with the AWS CLI like this:

```bash
aws stepfunctions send-task-success --task-token 'YOUR_TASK_TOKEN' --task-output '{"status": "success"}'
```
Or, in Python using boto3, like this:

```python
import boto3

client = boto3.client("stepfunctions")
response = client.send_task_success(
    taskToken="YOUR_TASK_TOKEN",      # the TaskToken passed in the pipeline event
    output='{"status": "success"}'    # JSON output returned to the state machine
)
```
For other platforms, see the SDK documentation.
For task failures, see the adjacent API call `SendTaskFailure`.
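For example, a failure could be reported from Python with `send_task_failure`; the error name and cause below are illustrative placeholders:

```python
import boto3

client = boto3.client("stepfunctions")
client.send_task_failure(
    taskToken="YOUR_TASK_TOKEN",
    error="PipelineError",                  # illustrative error name
    cause="Conversion job failed upstream"  # illustrative human-readable cause
)
```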
Two additional settings enable your job to end with a timeout error by defining a task timeout. This can reduce your time to detect a problem with your task. By default, the timeout is over a year when working with task tokens. To set a timeout, specify a Task Timeout on the create pipeline screen.
If you would like your job to check in to show that it is still running, and to fail the step if it does not check in within some amount of time less than the task timeout, also define the Task Heartbeat Timeout on the create pipeline screen. If more time than the specified seconds elapses between heartbeats from the task, the state fails with a States.Timeout error name.
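The heartbeat itself is sent with the `SendTaskHeartbeat` API using the same Task Token; a minimal boto3 sketch (the token value is a placeholder):

```python
import boto3

client = boto3.client("stepfunctions")
# Call this at an interval shorter than the configured heartbeat timeout.
client.send_task_heartbeat(taskToken="YOUR_TASK_TOKEN")
```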
- Run `cdk destroy` from the infra folder.
- Some resources may not be deleted by CDK (e.g., S3 buckets and DynamoDB tables) and you will have to delete them via the AWS CLI or the AWS console.
Note: After running `cdk destroy`, there might still be some resources running in AWS that will have to be cleaned up manually, as CDK does not delete some resources.
The CDK deployment deploys the VAMS stack into your account. The components that are created by this app are:
- Web app hosted on a CloudFront distribution
- API Gateway to route front-end calls to API handlers
- Lambda handlers, created per API path
- DynamoDB tables to store Workflows, Assets, and Pipelines
- S3 buckets for assets, CDK deployments, and log storage
- Cognito User Pool for authentication
- OpenSearch collection for searching the assets using metadata
Please see the Swagger Spec for details.
Table | Partition Key | Sort Key | Attributes |
---|---|---|---|
AppFeatureEnabledStorageTable | featureName | n/a | |
AssetStorageTable | databaseId | assetId | assetLocation, assetName, assetType, currentVersion, description, generated_artifacts, isDistributable, previewLocation, versions |
JobStorageTable | jobId | databaseId | |
PipelineStorageTable | databaseId | pipelineId | assetType, dateCreated, description, enabled, outputType, pipelineType, pipelineExecutionType |
DatabaseStorageTable | databaseId | n/a | assetCount, dateCreated, description |
WorkflowStorageTable | databaseId | workflowId | dateCreated, description, specifiedPipelines, workflow_arn |
WorkflowExecutionStorageTable | pk | sk | asset_id, database_id, execution_arn, execution_id, workflow_arn, workflow_id, assets |
MetadataStorageTable | databaseId | assetId | Varies with user provided attributes |
Field | Data Type | Description |
---|---|---|
assetLocation | Map | S3 Bucket and Key for this asset |
assetName | String | The user provided asset name |
assetType | String | The file extension of the asset |
currentVersion | Map | The current version of the S3 object |
description | String | The user provided description |
generated_artifacts | Map | S3 bucket and key references to artifacts generated automatically through pipelines when an asset is uploaded. |
isDistributable | Boolean | Whether the asset is distributable |
Field | Data Type | Description |
---|---|---|
assetType | String | File extension of the asset |
dateCreated | String | Creation date of this record |
description | String | User provided description |
enabled | Boolean | Whether this pipeline is enabled |
outputType | String | File extension of the output asset |
pipelineType | String | Defines the pipeline type — StandardFile/PreviewFile |
pipelineExecutionType | String | Defines the pipeline execution type — Lambda |
inputParameters | String | Defines the optional JSON parameters that get sent to the pipeline at execution |
Field | Data Type | Description |
---|---|---|
assetCount | String | Number of assets in this database |
dateCreated | String | Creation date of this record |
description | String | User provided description |
Field | Data Type | Description |
---|---|---|
dateCreated | String | Creation date of this record |
description | String | User provided description |
specifiedPipelines | Map, List, Map, String | List of pipelines given by their name, outputType, pipelineType, pipelineExecutionType |
workflow_arn | String | The ARN identifying the step function state machine |
Field | Data Type | Description |
---|---|---|
asset_id | String | Asset identifier for this workflow execution |
database_id | String | Database to which the asset belongs |
execution_arn | String | The state machine execution arn |
execution_id | String | Execution identifier |
workflow_arn | String | State machine ARN |
workflow_id | String | Workflow identifier |
stopDate | String | Stop Datetime of the execution (if blank, still running) |
startDate | String | Start Datetime of the execution (if blank, still running) |
executionStatus | String | Execution final status (if blank, still running) |
assets | List, Map | List of Maps of asset objects (see AssetStorageTable for attribute definitions) |
Field | Data Type | Description |
---|---|---|
asset_id | String | Asset identifier for this workflow execution |
database_id | String | Database to which the asset belongs |
Attributes are driven by user input. There are no predetermined fields aside from the partition and sort key. From release 1.4 onwards, when you add metadata on a file/folder, the S3 key prefix of the file/folder is used as the asset key in the metadata table.
The dependencies for the backend lambda functions are handled using Poetry. If you change the lambda functions, make sure to do a `cdk deploy` to reflect the change.
The core VAMS lambda handlers are categorized based on the project domain. E.g., you will find all asset-related functions in the `/backend/backend/assets` folder.
The pipeline containers and lambda handlers are categorized based on pipeline use-case implementations. You will find all pipeline components related to these in the `/backend/backendPipelines` folder.
When you create pipelines in VAMS, you currently have one execution option and two pipeline type options:
- Create a lambda execution type pipeline. Pick either the standard or preview pipeline type (currently there are no differences as of v2.1; preview will have future implementation functionality).
AWS Lambda is the compute platform for VAMS Lambda Pipelines.
When you create a VAMS Lambda pipeline, you can either allow VAMS to create a new AWS Lambda function or provide the name of an existing AWS Lambda function to be used as a pipeline.
When you create a VAMS Lambda pipeline and don't provide the name of an existing AWS Lambda function, VAMS will create an AWS Lambda function in the AWS account where VAMS is deployed. This lambda function will have the same name as the pipelineId you provided while creating the pipeline, with `vams-` appended. This lambda function contains example pipeline code, which can be modified with your own pipeline business logic.
Sometimes you may want to write your pipelines separately from the VAMS stack. Some reasons for this are:
- Separating pipeline code from VAMS deployment code
- Different personas with no access to VAMS are working on pipeline code
- Pipelines are managed in a separate CDK/CloudFormation stack altogether
If you want to use an existing AWS Lambda function as a pipeline in VAMS you can provide the function name of your AWS Lambda function in the create pipeline UI. See the section below for the event payload passed by VAMS workflows when your pipelines are executed.
By default, the VAMS workflow functionality has access to any lambda function within the deployed AWS account with the word `vams` in its name. If your existing function does not have this, you will have to grant manual invoke permissions to the workflow Step Functions role (one possible approach is sketched below).
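For example, one way to grant that permission is to attach an inline policy to the workflow's Step Functions execution role with boto3; the role name and function ARN below are placeholders you would replace with your deployed values:

```python
import json
import boto3

iam = boto3.client("iam")
iam.put_role_policy(
    RoleName="my-vams-workflow-stepfunctions-role",  # placeholder: deployed workflow role name
    PolicyName="AllowInvokeMyPipelineFunction",
    PolicyDocument=json.dumps({
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Action": "lambda:InvokeFunction",
            # placeholder ARN of your existing pipeline function
            "Resource": "arn:aws:lambda:us-east-1:111122223333:function:my-existing-pipeline",
        }],
    }),
)
```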
When a VAMS workflow invokes a VAMS Lambda pipeline, it invokes the corresponding AWS Lambda function with an event payload like below:
"body": {
"inputS3AssetFilePath": "<S3 URI of the primary asset file to be used as input>",
"outputS3AssetFilesPath": "<Predetermined output path for asset files generated by pipeline's execution */**>",
"outputS3AssetPreviewPath": "<Predetermined output path for asset preview files generated by pipeline's execution **>",
"outputS3AssetMetadataPath": "<Predetermined output path for asset metadata generated by pipeline's execution */**/***>",
"inputOutputS3AssetAuxiliaryFilesPath": "<Predetermined path for asset auxiliary files generated by pipeline's execution ****>",
"inputParameters": "<Optional input JSON parameters specified at the time of pipeline creation. Generally these map to allowed pipeline configuration parameters of the call-to pipeline>",
"inputMetadata": "<Input metadata JSON constructed from the VAMS asset the pipeline was executed from to provide pipelines additional context. See below for the JSON schema>",
"executingUsername": <The username of the user who executed the pipeline for use with permissioning>,
"executingRequestContext": <The originating lambda request context from executing the pipeline for use with permissioning>
}
- \* The path specified is a unique location based on the execution job run. No two executions will have the same output path for these paths.
- \*\* If no files are located at the output location at the end of an execution, the respective asset data will not be modified.
- \*\*\* Metadata files are in the form of JSON objects. All key/value fields in the top-level JSON object will be added to the asset's metadata. Fields that already exist will be overwritten (see the sketch after this list).
- \*\*\*\* The asset auxiliary location is used for pipeline or asset non-versioned files or pipeline temporary files. The file path provided is not unique to the job execution and is a global path based on the asset selected and pipeline name.
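As a hedged illustration of the metadata output mechanism described above, the sketch below writes a JSON keyword object to the provided `outputS3AssetMetadataPath`; it assumes that path is a plain `s3://bucket/key` URI and that boto3 credentials are available in the pipeline environment:

```python
import json
import boto3
from urllib.parse import urlparse

def write_asset_metadata(output_s3_asset_metadata_path: str, metadata: dict) -> None:
    """Upload a JSON object whose top-level keys become (or overwrite) asset metadata fields."""
    parsed = urlparse(output_s3_asset_metadata_path)  # assumed form: s3://bucket/key
    boto3.client("s3").put_object(
        Bucket=parsed.netloc,
        Key=parsed.path.lstrip("/"),
        Body=json.dumps(metadata).encode("utf-8"),
    )

# Hypothetical usage from within a pipeline handler:
# write_asset_metadata(data["outputS3AssetMetadataPath"], {"material": "steel", "source": "pipeline"})
```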
Below is the input metadata schema JSON object constructed and passed into each pipeline execution:
"VAMS": {
"assetData": {
"assetName":"<Name of the asset the workflow is executed from>",
"description": "<Description of the asset the workflow is executed from>",
"tags": ["<Array of tags of the asset the workflow is executed from>"]
},
"assetMetadata": {
<... Dynamically constructed fields of all the metadata fields (key/value) from the VAMS asset and metadata of the primary VAMS asset file ...>
}
}
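As an illustrative sketch (not part of the shipped example code), a pipeline could pull the asset context out of this payload; it is assumed here that `inputMetadata` arrives as a JSON string, with a fallback if it is already parsed:

```python
import json

def get_asset_context(data: dict) -> dict:
    """Extract asset name, description, and tags from the inputMetadata payload."""
    raw = data.get("inputMetadata") or "{}"
    input_metadata = json.loads(raw) if isinstance(raw, str) else raw
    asset_data = input_metadata.get("VAMS", {}).get("assetData", {})
    return {
        "assetName": asset_data.get("assetName"),
        "description": asset_data.get("description"),
        "tags": asset_data.get("tags", []),
    }
```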
A simple lambda handler is provided below for reference. You may choose to substitute your own function in place of the `write_input_output` function in the code below.
```python
import json

def lambda_handler(event, context):
    """
    Example of a NoOp pipeline
    Uploads the input file to the output location
    """
    print(event)

    # The event body may arrive as a JSON string or as an already-parsed dict
    if isinstance(event['body'], str):
        data = json.loads(event['body'])
    else:
        data = event['body']

    # Copy the primary input asset file to the pipeline's output files path
    write_input_output(data['inputS3AssetFilePath'], data['outputS3AssetFilesPath'])

    return {
        'statusCode': 200,
        'body': 'Success'
    }
```
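The `write_input_output` helper is not shown above; conceptually it just copies the input file to the output prefix. A minimal sketch, assuming both paths are `s3://` URIs (the real example code shipped with VAMS may differ):

```python
import boto3
from urllib.parse import urlparse

s3 = boto3.client("s3")

def write_input_output(input_s3_uri, output_s3_prefix):
    """NoOp pipeline helper: copy the input asset file to the output files path."""
    src = urlparse(input_s3_uri)       # s3://bucket/key/of/input/file
    dst = urlparse(output_s3_prefix)   # s3://bucket/output/prefix/
    file_name = src.path.rsplit("/", 1)[-1]
    s3.copy_object(
        CopySource={"Bucket": src.netloc, "Key": src.path.lstrip("/")},
        Bucket=dst.netloc,
        Key=dst.path.strip("/") + "/" + file_name,
    )
```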
This section describes use-case specific pipelines that can be activated in the infrastructure deployment configuration file `/infra/config/config.json`. These pipelines can be set up through VAMS pipelines and workflows, and/or some may be called directly through other triggering mechanisms. See the Configuration Guide for the use-case pipeline configuration options.
Pipeline architectures can be either synchronous or asynchronous. If asynchronous, the option "Wait for Callback with the Task Token" must be used when adding the pipeline to VAMS.
See the NOTICE file for specific third-party license information regarding each of these pipelines.
The 3dBasic Converter Pipeline is used to convert between various 3D mesh file types.
If you wish to trigger this pipeline additionally/manually through a VAMS pipeline, you can set up a new VAMS pipeline using the table below. You will need to look up the lambda function name in the AWS console based on the base deployment name listed.
The pipeline uses the third-party open-source Trimesh library to conduct the conversion (an illustrative conversion sketch follows the table below).
The pipeline uses the selected file type on the asset as the input type and the registered pipeline `outputType` as the final conversion type. For now, a separate pipeline registration is required for each from-to file type conversion that an organization would like to support.
NOTE: This pipeline must be registered in VAMS WITHOUT the option "Wait for Callback with the Task Token".
Input File Types Supported | Base Lambda Function Name |
---|---|
STL, OBJ, PLY, GLTF, GLB, 3MF, XAML, 3DXML, DAE, XYZ (3D Meshes) | vamsExecute3dBasicConversion |
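Conceptually, the conversion performed by this pipeline boils down to loading the mesh with Trimesh and exporting it in the registered `outputType`. The sketch below is illustrative only, not the deployed pipeline code, and the file names are placeholders:

```python
import trimesh

def convert_mesh(input_path: str, output_path: str) -> None:
    # Trimesh infers formats from the file extensions (e.g. .stl -> .glb).
    mesh = trimesh.load(input_path)
    mesh.export(output_path)

# Hypothetical usage:
# convert_mesh("part.stl", "part.glb")
```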
The PotreeViewer Point Cloud Visualizer Pipeline is used to generate preview files for certain types of point cloud asset files. Currently, preview pipelines like these are primarily implemented outside of VAMS pipelines/workflows but also have the ability to be called through traditional pipelines. Until preview-type pipelines are fully integrated as part of the regular VAMS pipeline design and execution, this pipeline is triggered primarily through an S3 Event Notification on uploading new asset files to VAMS.
If you wish to trigger this pipeline additionally/manually through a VAMS pipeline, you can set up a new VAMS pipeline using the table below. You will need to look up the lambda function name in the AWS console based on the base deployment name listed.
The PotreeViewer pipeline outputs its files (Potree OCTREE formatted files) to the assetAuxiliary bucket. These are then retrieved by the PotreeViewer implementation on the VAMS UX side through the auxiliary assets stream API endpoint. The pipeline uses the third-party open-source PDAL and PotreeConverter libraries to convert and generate the final output.
There are no defined input parameter configurations for this pipeline. This pipeline ignores inputMetadata, as it is not needed for its operation.
NOTE: If this pipeline is registered separately in VAMS Pipelines, it must be registered in VAMS with the option "Wait for Callback with the Task Token".
Input File Types Supported | Base Lambda Function Name - VAMS trigger | Base Lambda Function Name - SNS trigger |
---|---|---|
LAS, LAZ, E57, PLY (Point Clouds) | vamsExecutePreviewPcPotreeViewerPipeline | snsExecutePrviewPcPotreeViewerPipeline |
Notice: This use-case pipeline uses an open-source library that is GPL licensed. Please refer to the NOTICE file and review with your organization's legal team before enabling use.
The GenerativeAI 3D Metadata Labeling Pipeline is used to generate 2D renders and metadata JSON labeling information for 3D mesh asset files. This is useful to auto-label asset data as it is ingested.
If you wish to trigger this pipeline additionally/manually through a VAMS pipeline, you can set up a new VAMS pipeline using the table below. You will need to look up the lambda function name in the AWS console based on the base deployment name listed.
The pipeline uses the third-party open-source Blender library to generate 2D renders of the 3D object. These are then ingested into both Amazon Rekognition and Amazon Bedrock to generate and summarize labels on the 2D images. The output is a JSON metadata keywords file in the asset S3 bucket. Input types are currently restricted to the model formats that Blender can accept.
This pipeline can use the inputMetadata passed into the pipeline as additional seed data for more accurately labeling the 2D image objects. This can be toggled on or off with an inputParameter.
The following inputParameters are supported:
```
{
    "includeAllAssetFileHierarchyFiles": "True", #Default: False. Pull in all the asset files from the folder that includes the original input asset and lower (i.e. can pull in other relative texture files or shaders for rendering)
    "seedMetadataGenerationWithInputMetadata": "True" #Default: False. Seed the metadata label generation with the input metadata (e.g. VAMS asset data, asset metadata) to refine results and reduce outlier keywords
}
```
NOTE: Pipeline must be registered in VAMS with the option of "Wait for Callback with the Task Token"
Input File Types Supported | Base Lambda Function Name |
---|---|
OBJ, GLB/GLTF, FBX, ABC, DAE, PLY, STL, USD (3D Meshes) | vamsExecuteGenAiMetadata3dLabelingPipeline |
Please see the corresponding Postman collection provided.
Once the solution is deployed, you will have to put in the below details as global variables in the Postman collection.
Within the `web` folder, you can run `npm run start` to start a local frontend application.