feat: update lambda state machine to accommodate tenantId #367
filtered_tenant_id_frame = Filter.apply(frame = original_data_source_dyn_frame,
                                        f = lambda x:
                                        x['_tenantId'] == tenantId)
Instead of filtering on the Glue side, would it be better to have a secondary index on the tenantId? This will become an expensive operation if we have to scan across all tenants.
Hmm, that's a great question. The design doc specifies that the filtering is done as part of the Glue job, and no secondary index was introduced for any table. @carvantes Any thoughts?
The Glue job always scans the entire DDB table no matter what; there's no way to use a query. This is a limitation of the current AWS Glue + DDB integration.
There are existing scenarios where this is far from ideal: for example, exporting a single FHIR resource type or exporting the resources modified in the last hour will both scan the entire table.
There is room for improvement in the bulk export solution, but we are not changing the fundamentals here.
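To make the cost argument above concrete, here is a minimal pure-Python sketch of scan-then-filter. It is not the Glue API: `scan_and_filter` and the sample records are hypothetical stand-ins for the DynamicFrame and `Filter.apply`, used only to show that every record is read regardless of which tenant is requested.

```python
from typing import Any, Callable, Dict, Iterable, List, Tuple

Record = Dict[str, Any]


def scan_and_filter(
    table: Iterable[Record], predicate: Callable[[Record], bool]
) -> Tuple[List[Record], int]:
    """Read every record, keep the ones matching `predicate`.

    Hypothetical stand-in for a full-table scan followed by a filter.
    Returns the kept records and the number of records that were read.
    """
    kept: List[Record] = []
    records_read = 0
    for record in table:
        records_read += 1  # every record is read, matching or not
        if predicate(record):
            kept.append(record)
    return kept, records_read


# Hypothetical sample data: resources from three tenants in one table.
table = [
    {"id": "p1", "resourceType": "Patient", "_tenantId": "tenantA"},
    {"id": "p2", "resourceType": "Patient", "_tenantId": "tenantB"},
    {"id": "o1", "resourceType": "Observation", "_tenantId": "tenantA"},
    {"id": "o2", "resourceType": "Observation", "_tenantId": "tenantC"},
]

tenantId = "tenantA"
filtered, records_read = scan_and_filter(
    table, lambda x: x["_tenantId"] == tenantId
)

# Only tenantA's records are kept, but all four records were read.
print([r["id"] for r in filtered], records_read)
```

The number of records read depends on the table size, not on the tenant, which is the trade-off accepted in this PR.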
* feat: add tenantId attribute to Cognito user pool (#348)
* feat: remove unneeded scope checks in authorizer (#347)
* feat: update lambda state machine to accommodate tenantId (#367)
* feat: add "enableMultiTenancy" CFN parameter (#381)
* test: add multi-tenancy integ tests (#387)
* fix: remove _id, _tenantId from bulk export results (#384)
* feat: Group export scripts (#389)
* fix: add multi-tenant metadata route (#392)
* fix: allow more concurrent export jobs for multi-tenant deployments (#397)
* test: integ tests for Group export (#393)
* feat: add ES hard delete config value (#398)
* docs: update postman collection and docs to use Id token (#399)
* docs: add multi-tenancy docs (#400)

Co-authored-by: Yanyu Zheng <[email protected]>

BREAKING CHANGE: The Cognito IdToken is now used instead of the accessToken to authorize requests.
* feat: update lambda state machine to accommodate tenantId (#367)
* feat: add "enableMultiTenancy" CFN parameter (#382)
* fix: pass enableMultiTenancy to ES
* fix: remove _id, _tenantId from bulk export results
* feat: Group export scripts (#389)
* chore: script generating patient compartment search params
* feat: update Glue script for group export
* Upload patient compartment jsons to S3
* fix: allow more concurrent export jobs for multi-tenant deployments (#397)
* feat: add ES hard delete config value (#398)
* docs: add multi-tenancy docs (#400)
* fix: pass enableMultiTenancy flag to s3DataService
* test: add multi-tenancy integ tests (#387)
* test: integ tests for Group export (#393)
* chore: upgrade dependencies
* add public multi-tenant routes
* add system/read and user/delete permissions to defaults
* test: fix tests for smart multi-tenancy
* test: update gh actions to also test multi-tenant environment
* docs: update bulk export docs to mention group export

Co-authored-by: Yanyu Zheng <[email protected]>
Issue #, if available:
Description of changes:
Checklist:
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.