fix: add multi-user-project byod scenario and updated ReadMe for potential errors + workarounds #4790

Open: wants to merge 3 commits into base: default
37 changes: 35 additions & 2 deletions ml_ops/sm-datazone_import/README.md
Original file line number Diff line number Diff line change
Expand Up @@ -13,6 +13,7 @@ aws configure add-model --service-model file://resources/datazone-linkedtypes-20
2. Create a federation role

This role will be used by DataZone to launch the SageMaker Domain. See [BringYourOwnDomainResources.yml](./resources/BringYourOwnDomainResources.yml) for an example.
If you are using a single SageMaker domain across multiple projects, you will need to create a separate SageMaker Execution Role and User Profile for each user in each project and a separate Federation Role per project.

### Prerequisites

Expand All @@ -31,11 +32,43 @@ Run the script and follow the instructions.
```bash
python import-sagemaker-domain.py \
--region REGION \
--federation-role ARN_OF_FEDERATION_ROLE \
--account-id ACCOUNTID
```

### Additional Configuration

- SageMaker execution roles need DataZone API permissions in order for the Assets UI to function. See [DataZoneUserPolicy.json](./resources/DataZoneUserPolicy.json) for an example.
- Ensure the DataZone Domain trusts SageMaker. In the AWS DataZone console, navigate to Domain details and select "Trusted services".

### Potential errors and workarounds

**Cannot view ML assets in SageMaker Studio, missing "Assets" tab**

Make sure that the execution role attached to the SageMaker user in the attached domain allows the `sagemaker:ListTags` action. A simple workaround is to attach the `AmazonSageMakerCanvasFullAccess` policy, which contains this permission. Without it, you will not be able to view the Assets tab in the Studio UI. If you inspect the network requests made by the Studio UI, you will see the following error:
```
User: arn:aws:sts::789706018617:assumed-role/AmazonSageMaker-ExecutionRole-20241127T120959/SageMaker is not authorized to
perform: sagemaker:ListTags on resource: arn:aws:sagemaker:us-east-1:789706018617:domain/d-qy9jzu4s7q0y because no
identity-based policy allows the sagemaker:ListTags action
```
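
As a minimal sketch of the permission involved, the snippet below builds an identity-based policy document that allows only `sagemaker:ListTags` on a single domain. The function name `build_list_tags_policy` and the placeholder ARN are assumptions made for illustration and are not part of this repository; attaching the resulting policy to the execution role is left out.

```python
import json


def build_list_tags_policy(domain_arn: str) -> dict:
    """Return an IAM policy document allowing sagemaker:ListTags on one domain."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": ["sagemaker:ListTags"],
                "Resource": [domain_arn],
            }
        ],
    }


if __name__ == "__main__":
    # Placeholder ARN (assumption): substitute your region, account ID, and domain ID.
    example_arn = "arn:aws:sagemaker:REGION:ACCOUNTID:domain/DOMAIN_ID"
    print(json.dumps(build_list_tags_policy(example_arn), indent=2))
```

A narrowly scoped policy like this may be preferable to a broad managed policy when only the Assets tab is missing.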

**Able to view assets in sidebar, but page is not loading**

If you are able to view the assets but get a `There was a problem when loading subscriptions` error on the page where your ML assets should be, ensure that the SageMaker execution role tied to this SageMaker user has DataZone API permissions. You can attach the provided `./resources/DataZoneUserPolicy.json`, or a more limited version of what is included in `AmazonDataZoneFullUserAccess`.
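
One way to sanity-check a policy document before attaching it is sketched below. The helpers `allowed_actions` and `allows_action` are hypothetical names, and the check is deliberately simplified: it looks only at `Allow` statements and ignores `Deny`, `NotAction`, resources, and conditions. The action passed in the usage example is only an illustration; the actions the Assets UI actually needs are the ones listed in the provided policy file.

```python
from fnmatch import fnmatchcase


def allowed_actions(policy: dict) -> list:
    """Collect action patterns from every Allow statement in a policy document."""
    statements = policy.get("Statement", [])
    if isinstance(statements, dict):  # a single statement may appear without a list
        statements = [statements]
    actions = []
    for statement in statements:
        if statement.get("Effect") != "Allow":
            continue
        action = statement.get("Action", [])
        # "Action" may be a single string or a list of strings.
        actions.extend([action] if isinstance(action, str) else action)
    return actions


def allows_action(policy: dict, action: str) -> bool:
    """True if any Allow pattern matches the action, compared case-insensitively."""
    return any(
        fnmatchcase(action.lower(), pattern.lower())
        for pattern in allowed_actions(policy)
    )
```

For example, after loading the policy file with `json.load`, `allows_action(policy, "datazone:ListSubscriptions")` reports whether that one action is covered by an `Allow` statement.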

**DataZone portal is not showing a generated action-link for user**

If you are creating Project B with a subset of users B under environment B, make sure that you supply a separate federation role when `_associate_fed_role` is called. See `./resources` for sample permissions and trust policies for the federation role, and be sure to fill in your SageMaker Domain Id. A separate role is required: if the role is already associated with another project, the association fails with the following error, and the subsequent call to `create_environment_action` fails as well.

```
An error occurred (ValidationException) when calling the AssociateEnvironmentRole operation: Role Arn
arn:aws:iam::789706018617:role/svia-test-byod-fed-role already being used in a different project
```

A successful association prints the following:

```
Federation role to federate into sagemaker studio from datazone portal: arn:aws:iam::789706018617:role/svia-test-byod-fed-role
Associating Environment Role using Federation Role [arn:aws:iam::789706018617:role/svia-test-byod-fed-role] ...
Associating Environment Role using Federation Role [arn:aws:iam::789706018617:role/svia-test-byod-fed-role] COMPLETE
```
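
The one-federation-role-per-project requirement described above can be checked before running the script. The sketch below assumes you keep your own mapping of project names to federation role ARNs; the function name `find_shared_federation_roles` and the sample ARNs are hypothetical, and nothing here calls AWS.

```python
from collections import defaultdict


def find_shared_federation_roles(project_roles: dict) -> dict:
    """Map each federation role ARN used by more than one project to those projects."""
    projects_by_role = defaultdict(list)
    for project, role_arn in project_roles.items():
        projects_by_role[role_arn].append(project)
    return {
        role_arn: sorted(projects)
        for role_arn, projects in projects_by_role.items()
        if len(projects) > 1
    }


if __name__ == "__main__":
    # Sample plan (assumption): ProjectA and ProjectB wrongly share one role.
    plan = {
        "ProjectA": "arn:aws:iam::111122223333:role/fed-role-a",
        "ProjectB": "arn:aws:iam::111122223333:role/fed-role-a",
    }
    for role_arn, projects in find_shared_federation_roles(plan).items():
        print(f"{role_arn} is shared by: {', '.join(projects)}")
```

An empty result means every project in the plan has its own federation role.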
61 changes: 43 additions & 18 deletions ml_ops/sm-datazone_import/import-sagemaker-domain.py
Original file line number Diff line number Diff line change
@@ -1,6 +1,7 @@
import argparse
import boto3
import botocore
import time
from botocore.exceptions import ClientError

"""
Expand All @@ -13,10 +14,9 @@


class SageMakerDomainImporter:
def __init__(self, region, stage, federation_role, account_id) -> None:
def __init__(self, region, stage, account_id) -> None:
self.region = region
self.stage = stage
self.federation_role = federation_role
self.account_id = account_id
# Setup client.
sm_endpoint_url = "https://api.sagemaker." + region + ".amazonaws.com" # prod
Expand Down Expand Up @@ -169,7 +169,9 @@ def _map_users(self):
exec_role = user_settings["ExecutionRole"]

if exec_role is None:
print(f'User {sm_user_name} has no execution role set, using default from domain.')
print(
f"User {sm_user_name} has no execution role set, using default from domain."
)
exec_role = self.default_execution_role

self.sm_user_info["exec_role_arn"] = exec_role
Expand Down Expand Up @@ -233,6 +235,30 @@ def _map_users(self):
break
self.dz_users_id_list.append(dz_uzer)

def _link_multiple_users_and_projects(self):
"""
Add the option for the user to attach Users in subset B to Project B.
"""
print("--------------------------------------------------------------------")
decision = input(
"Would you like to onboard an additional subset of user profiles to another project? "
"(This would require you to have another project created. In this new project, you will create "
"a new environment if not already created, as well) [y/n]: "
)
if decision == "y":
self._choose_dz_project()
self._configure_blueprint()
self._configure_environment()
self._tag_sm_domain()
self._map_users()
self._associate_fed_role()
self._add_environment_action()
self._link_domain()
self._link_users()
self._debug_print_results()
self._get_env_link()
self._link_multiple_users_and_projects()

def _configure_blueprint(self):
# [4] Create environment profile + environment and use new API BatchPutLinkedTypes to connect DataZone and SageMaker entities.

Expand Down Expand Up @@ -266,7 +292,6 @@ def _configure_blueprint(self):
return self.managed_blueprint_id

def _configure_environment(self):
print("--------------------------------------------------------------------")
decision_env = input(
"Do you need to create a new DataZone environment? [y/n]: "
)
Expand Down Expand Up @@ -346,6 +371,9 @@ def _add_environment_action(self):
def _associate_fed_role(self):
# Associate fed role
print("--------------------------------------------------------------------")
self.federation_role = input(
"Federation Role Arn to federate into sagemaker studio from datazone portal: "
)
print(
"Associating Environment Role using Federation Role [{}] ...".format(
self.federation_role
Expand All @@ -367,6 +395,8 @@ def _associate_fed_role(self):
print(
"Environment has a role configured already. Skipping role association ..."
)
else:
print(f"Caught error: {repr(e)}")

def _link_domain(self):
# attach SAGEMAKER_DOMAIN
Expand Down Expand Up @@ -394,15 +424,15 @@ def _link_domain(self):
)
print("--------------------------------------------------------------------")

print("Linking SageMaker Domain")
print(f"Linking SageMaker Domain using project id [{self.dz_project_id}]")
link_domain_response = self.byod_client.batch_put_linked_types(
domainIdentifier=self.dz_domain_id,
projectIdentifier=self.dz_project_id,
environmentIdentifier=self.env_id,
items=linkedDomainItems,
)
print(link_domain_response)
print("Linked SageMaker Domain")
print("Linked SageMaker Domain.")

def _link_users(self):
# attach SAGEMAKER_USER_PROFILE
Expand All @@ -429,15 +459,17 @@ def _link_users(self):
linkedUserItems.append(linkedUserItem)

print("--------------------------------------------------------------------")
print("Linking SageMaker User Profiles")
print(
f"Linking SageMaker User Profiles using project id [{self.dz_project_id}]"
)
link_users_response = self.byod_client.batch_put_linked_types(
domainIdentifier=self.dz_domain_id,
projectIdentifier=self.dz_project_id,
environmentIdentifier=self.env_id,
items=linkedUserItems,
)
print(link_users_response)
print("Linked SageMaker User Profiles")
print("Linked SageMaker User Profiles.")
print("--------------------------------------------------------------------")

def _debug_print_results(self):
Expand Down Expand Up @@ -482,12 +514,13 @@ def import_interactive(self):
self._configure_environment()
self._tag_sm_domain()
self._map_users()
self._add_environment_action()
self._associate_fed_role()
self._add_environment_action()
self._link_domain()
self._link_users()
self._debug_print_results()
self._get_env_link()
self._link_multiple_users_and_projects()


if __name__ == "__main__":
Expand All @@ -506,13 +539,6 @@ def import_interactive(self):
default="prod",
help="Stage to test e2e BYOD. This impacts the endpoint targeted.",
)
parser.add_argument(
"--federation-role",
type=str,
required=True,
default="test",
help="Role used to federate access into environment.",
)
parser.add_argument(
"--account-id",
type=str,
Expand All @@ -524,10 +550,9 @@ def import_interactive(self):

region = args.region
stage = args.stage
federation_role = args.federation_role
account_id = args.account_id

print("--------------------------------------------------------------------")
importer = SageMakerDomainImporter(region, stage, federation_role, account_id)
importer = SageMakerDomainImporter(region, stage, account_id)
importer.import_interactive()
print("--------------------------------------------------------------------")
Original file line number Diff line number Diff line change
@@ -0,0 +1,26 @@
{
"Version": "2012-10-17",
"Statement": [
{
"Action": [
"sagemaker:CreateUserProfile",
"sagemaker:DescribeUserProfile",
"sagemaker:CreatePresignedDomainUrl"
],
"Resource": [
"arn:aws:sagemaker:*:789706018617:*/<YOUR-SM-DOMAIN-ID-HERE>/*"
],
"Effect": "Allow"
},
{
"Sid": "Statement1",
"Effect": "Allow",
"Action": [
"iam:ListRoleTags"
],
"Resource": [
"*"
]
}
]
}
Original file line number Diff line number Diff line change
@@ -0,0 +1,22 @@
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Principal": {
"Service": [
"datazone.amazonaws.com",
"lakeformation.amazonaws.com",
"glue.amazonaws.com",
"auth.datazone.amazonaws.com"
]
},
"Action": [
"sts:AssumeRole",
"sts:TagSession",
"sts:SetContext",
"sts:SetSourceIdentity"
]
}
]
}