perf(graphql-model-transformer): minimal provider framework and inline policies #2490

Merged
atierian merged 17 commits into main from amplify-table-go-brr-no-breaky
Apr 27, 2024

Conversation

@atierian (Member) commented Apr 23, 2024

Description of changes

Deploy Time Improvements

Deployment time depends on many factors outside our direct control. That said, multiple runs have produced fairly consistent results.

Simple Todo (npm create amplify@latest)

Current: ✨ Total time: 254.85s
With Changes: ✨ Total time: 178.64s

Schema
import { type ClientSchema, a, defineData } from '@aws-amplify/backend';

const schema = a.schema({
  Todo: a
    .model({
      content: a.string(),
    })
    .authorization((allow) => [allow.guest()]),
});

export type Schema = ClientSchema<typeof schema>;

export const data = defineData({
  schema,
  authorizationModes: {
    defaultAuthorizationMode: 'iam',
  },
});

Lots of GSIs w/ Composite Keys

Current: ✨ Total time: 514.53s
With Changes: ✨ Total time: 249.24s

Schema
type Primary @model {
  id: ID! @primaryKey
  relatedMany: [RelatedMany] @hasMany(references: "primaryId")
  relatedOne: RelatedOne @hasOne(references: "primaryId")
}

type RelatedMany @model {
  id: ID! @primaryKey
  primaryId: String
  primary: Primary @belongsTo(references: ["primaryId"])
}

type RelatedOne @model {
  id: ID! @primaryKey
  primaryId: String
  primary: Primary @belongsTo(references: ["primaryId"])
}

type PrimaryCPKSKOne @model {
  id: ID! @primaryKey(sortKeyFields: ["skOne"])
  skOne: ID!
  relatedMany: [RelatedManyCPKSKOne] @hasMany(references: ["primaryId", "primarySkOne"])
  relatedOne: RelatedOneCPKSKOne @hasOne(references: ["primaryId", "primarySkOne"])
}

type RelatedManyCPKSKOne @model {
  id: String! @primaryKey
  primaryId: ID
  primarySkOne: ID
  primary: PrimaryCPKSKOne @belongsTo(references: ["primaryId", "primarySkOne"])
}

type RelatedOneCPKSKOne @model {
  id: String! @primaryKey
  primaryId: ID
  primarySkOne: ID
  primary: PrimaryCPKSKOne @belongsTo(references: ["primaryId", "primarySkOne"])
}

type PrimaryCPKSKTwo @model {
  id: ID! @primaryKey(sortKeyFields: ["skOne", "skTwo"])
  skOne: ID!
  skTwo: ID!
  relatedMany: [RelatedManyCPKSKTwo] @hasMany(references: ["primaryId", "primarySkOne", "primarySkTwo"])
  relatedOne: RelatedOneCPKSKTwo @hasOne(references: ["primaryId", "primarySkOne", "primarySkTwo"])
}

type RelatedManyCPKSKTwo @model {
  id: String! @primaryKey
  primaryId: ID
  primarySkOne: ID
  primarySkTwo: ID
  primary: PrimaryCPKSKTwo @belongsTo(references: ["primaryId", "primarySkOne", "primarySkTwo"])
}

type RelatedOneCPKSKTwo @model {
  id: String! @primaryKey
  primaryId: ID
  primarySkOne: ID
  primarySkTwo: ID
  primary: PrimaryCPKSKTwo @belongsTo(references: ["primaryId", "primarySkOne", "primarySkTwo"])
}

type PrimaryCPKSKThree @model {
  id: ID! @primaryKey(sortKeyFields: ["skOne", "skTwo", "skThree"])
  skOne: ID!
  skTwo: ID!
  skThree: ID!
  relatedMany: [RelatedManyCPKSKThree] @hasMany(references: ["primaryId", "primarySkOne", "primarySkTwo", "primarySkThree"])
  relatedOne: RelatedOneCPKSKThree @hasOne(references: ["primaryId", "primarySkOne", "primarySkTwo", "primarySkThree"])
}

type RelatedManyCPKSKThree @model {
  id: String! @primaryKey
  primaryId: ID
  primarySkOne: ID
  primarySkTwo: ID
  primarySkThree: ID
  primary: PrimaryCPKSKThree @belongsTo(references: ["primaryId", "primarySkOne", "primarySkTwo", "primarySkThree"])
}

type RelatedOneCPKSKThree @model {
  id: String! @primaryKey
  primaryId: ID
  primarySkOne: ID
  primarySkTwo: ID
  primarySkThree: ID
  primary: PrimaryCPKSKThree @belongsTo(references: ["primaryId", "primarySkOne", "primarySkTwo", "primarySkThree"])
}

type PrimaryCPKSKFour @model {
  id: ID! @primaryKey(sortKeyFields: ["skOne", "skTwo", "skThree", "skFour"])
  skOne: ID!
  skTwo: ID!
  skThree: ID!
  skFour: ID!
  relatedMany: [RelatedManyCPKSKFour] @hasMany(references: ["primaryId", "primarySkOne", "primarySkTwo", "primarySkThree", "primarySkFour"])
  relatedOne: RelatedOneCPKSKFour @hasOne(references: ["primaryId", "primarySkOne", "primarySkTwo", "primarySkThree", "primarySkFour"])
}

type RelatedManyCPKSKFour @model {
  id: String! @primaryKey
  primaryId: ID
  primarySkOne: ID
  primarySkTwo: ID
  primarySkThree: ID
  primarySkFour: ID
  primary: PrimaryCPKSKFour @belongsTo(references: ["primaryId", "primarySkOne", "primarySkTwo", "primarySkThree", "primarySkFour"])
}

type RelatedOneCPKSKFour @model {
  id: String! @primaryKey
  primaryId: ID
  primarySkOne: ID
  primarySkTwo: ID
  primarySkThree: ID
  primarySkFour: ID
  primary: PrimaryCPKSKFour @belongsTo(references: ["primaryId", "primarySkOne", "primarySkTwo", "primarySkThree", "primarySkFour"])
}

Types, Models, and Relationships

Current: ✨ Total time: 282.97s
With Changes: ✨ Total time: 200.23s

Schema
const schema = a.schema({
  // Required array scalar types combinations
  RequiredArrayType: a.model({
    identifier: a.id().array().required(),
    requiredIdentifier: a.id().required().array().required(),
    string: a.string().array().required(),
    requiredString: a.string().required().array().required(),
    int: a.integer().array().required(),
    requiredInt: a.integer().required().array().required(),
    float: a.float().array().required(),
    requiredFloat: a.float().required().array().required(),
    boolean: a.boolean().array().required(),
    requiredBoolean: a.boolean().required().array().required(),
    datetime: a.datetime().array().required(),
    requiredDatetime: a.datetime().required().array().required(),
    time: a.time().array().required(),
    requiredTime: a.time().required().array().required(),
    date: a.date().array().required(),
    requiredDate: a.date().required().array().required(),
    email: a.email().array().required(),
    requiredEmail: a.email().required().array().required(),
    url: a.url().array().required(),
    requiredUrl: a.url().required().array().required(),
    ip: a.ipAddress().array().required(),
    requiredIp: a.ipAddress().required().array().required(),
    phone: a.phone().array().required(),
    requiredPhone: a.phone().required().array().required(),
    timestamp: a.timestamp().array().required(),
    requiredTimestamp: a.timestamp().required().array().required(),
  }),

  // Array scalar types combinations
  ArrayType: a.model({
    identifier: a.id().array(),
    requiredIdentifier: a.id().required().array(),
    string: a.string().array(),
    requiredString: a.string().required().array(),
    int: a.integer().array(),
    requiredInt: a.integer().required().array(),
    float: a.float().array(),
    requiredFloat: a.float().required().array(),
    boolean: a.boolean().array(),
    requiredBoolean: a.boolean().required().array(),
    datetime: a.datetime().array(),
    requiredDatetime: a.datetime().required().array(),
    time: a.time().array(),
    requiredTime: a.time().required().array(),
    date: a.date().array(),
    requiredDate: a.date().required().array(),
    email: a.email().array(),
    requiredEmail: a.email().required().array(),
    url: a.url().array(),
    requiredUrl: a.url().required().array(),
    ip: a.ipAddress().array(),
    requiredIp: a.ipAddress().required().array(),
    phone: a.phone().array(),
    requiredPhone: a.phone().required().array(),
    timestamp: a.timestamp().array(),
    requiredTimestamp: a.timestamp().required().array(),
  }),

  // Scalar type combinations
  DataType: a.model({
    identifier: a.id(),
    requiredIdentifier: a.id().required(),
    string: a.string(),
    requiredString: a.string().required(),
    int: a.integer(),
    requiredInt: a.integer().required(),
    float: a.float(),
    requiredFloat: a.float().required(),
    boolean: a.boolean(),
    requiredBoolean: a.boolean().required(),
    datetime: a.datetime(),
    requiredDatetime: a.datetime().required(),
    time: a.time(),
    requiredTime: a.time().required(),
    date: a.date(),
    requiredDate: a.date().required(),
    email: a.email(),
    requiredEmail: a.email().required(),
    url: a.url(),
    requiredUrl: a.url().required(),
    ip: a.ipAddress(),
    requiredIp: a.ipAddress().required(),
    phone: a.phone(),
    requiredPhone: a.phone().required(),
    timestamp: a.timestamp(),
    requiredTimestamp: a.timestamp().required(),
  }),


  // Identifier customizations
  ExplicitPk: a.model({
    id: a.id().required()
  }),
  ExplicitPkSk: a.model({
    id: a.id().required(),
    sk: a.string().required()
  }).identifier(['id', 'sk']),
  ExplicitPkSkSk: a.model({
    id: a.id().required(),
    sk: a.string().required(),
    sk2: a.string().required(),
  }).identifier(['id', 'sk', 'sk2']),
  CustomPk: a.model({
    customId: a.id().required()
  }).identifier(['customId']),
  CustomPkSk: a.model({
    customId: a.id().required(),
    sk: a.string().required()
  }).identifier(['customId', 'sk']),
  CustomPkSkSk: a.model({
    customId: a.id().required(),
    sk: a.string().required(),
    sk2: a.string().required(),
  }).identifier(['customId', 'sk', 'sk2']),
  CustomStringPk: a.model({
    customId: a.string().required()
  }).identifier(['customId']),
  CustomStringPkSk: a.model({
    customId: a.string().required(),
    sk: a.string().required()
  }).identifier(['customId', 'sk']),
  CustomStringPkSkSk: a.model({
    customId: a.string().required(),
    sk: a.string().required(),
    sk2: a.string().required(),
  }).identifier(['customId', 'sk', 'sk2']),

  // Relationships - Basic has-one
  Country: a.model({
    name: a.string().required(),
    capital: a.hasOne('Capital', 'countryId'),
  }),
  Capital: a.model({
    name: a.string().required(),
    country: a.belongsTo('Country', 'countryId'),
    countryId: a.id().required(),
  }),

  // has-one with custom PK

  Customer: a.model({
    customerId: a.id().required(),
    name: a.string(),
    activeCart: a.hasOne('Cart', 'customerId'),
  }).identifier(['customerId']),

  Cart: a.model({
    cartId: a.id().required(),
    customerId: a.id().required(),
    customer: a.belongsTo('Customer', 'customerId'),
  }).identifier(['cartId']),

  // has-one with custom PK and SK

  Human: a.model({
    name: a.string().required(),
    dateOfBirth: a.date().required(),
    passport: a.hasOne('Passport', ['humanName', 'humanDateOfBirth']),
  }).identifier(['name', 'dateOfBirth']),

  Passport: a.model({
    country: a.string().required(),
    uniqueSequence: a.string().required(),
    expiryDate: a.date().required(),
    humanName: a.string(),
    humanDateOfBirth: a.date(),
    human: a.belongsTo('Human', ['humanName', 'humanDateOfBirth']),
  }).identifier(['country', 'uniqueSequence', 'expiryDate']),

  // Relationships - Basic has-many
  Company: a.model({
    name: a.string().required(),
    employees: a.hasMany('Employee', 'companyId'),
  }),
  Employee: a.model({
    name: a.string().required(),
    dateOfBirth: a.datetime().required(),
    company: a.belongsTo('Company', 'companyId'),
    companyId: a.id().required(),
  }),

  // has-many with custom PK
  Organization: a.model({
    orgId: a.id().required(),
    name: a.string().required(),
    employees: a.hasMany('Worker', 'companyId'),
  }).identifier(['orgId']),
  Worker: a.model({
    workerId: a.id().required(),
    name: a.string().required(),
    dateOfBirth: a.datetime().required(),
    company: a.belongsTo('Organization', 'companyId'),
    companyId: a.id().required(),
  }).identifier(['workerId']),

  // has-many with custom PK and SK
  Department: a.model({
    departmentId: a.id().required(),
    name: a.string().required(),
    workers: a.hasMany('DepartmentWorker', ['departmentId', 'departmentName']),
  }).identifier(['departmentId', 'name']),

  DepartmentWorker: a.model({
    workerId: a.id().required(),
    workerName: a.string().required(),
    departmentId: a.id(),
    departmentName: a.string(),
    department: a.belongsTo('Department', ['departmentId', 'departmentName'])
  }).identifier(['workerId', 'workerName']),

  // Relationships - Basic many-to-many
  Pizza: a.model({
    name: a.string().required(),
    price: a.float().required(),
    toppings: a.hasMany('PizzaTopping', 'pizzaId'),
  }),
  Topping: a.model({
    name: a.string().required(),
    pizzas: a.hasMany('PizzaTopping', 'toppingId'),
  }),
  PizzaTopping: a.model({
    // two rules: 1/ a connection field (e.g. pizza) must be optional, no exceptions
    //            2/ all reference fields (e.g. pizzaId, locationHash) must all have the same optionality
    pizza: a.belongsTo('Pizza', 'pizzaId'),
    pizzaId: a.id().required(),
    topping: a.belongsTo('Topping', 'toppingId'),
    toppingId: a.id().required(),
  }),

  // Custom type & enums
  Event: a.model({
    title: a.string().required(),
    location: a.customType({
      lat: a.float().required(),
      long: a.float().required(),
    }),
    type: a.enum(['party', 'birthday', 'wedding']),
    description: a.ref('LocationDescription').required(),
    size: a.ref('EventSize').required(),
  }),

  LocationDescription: a.customType({
    name: a.string().required(),
    description: a.string().required(),
  }),

  EventSize: a.enum(['small', 'medium', 'large']),
  lowercaseModel: a.model({
    content: a.string(),
  }),
  'sortaWeirdCase': a.model({
    test: a.string(),
  }),

}).authorization(allow => allow.guest());
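
The three benchmark pairs reported above can be summarized as absolute and relative savings. The snippet below is only a small arithmetic sketch; the `Benchmark` type and `savings` helper are illustrative names and not part of this PR, and the numbers are copied from the timings listed above.

```typescript
// Deploy-time figures copied from the three benchmarks above (seconds).
interface Benchmark {
  name: string;
  currentSeconds: number;
  withChangesSeconds: number;
}

const benchmarks: Benchmark[] = [
  { name: 'Simple Todo', currentSeconds: 254.85, withChangesSeconds: 178.64 },
  { name: 'Lots of GSIs w/ Composite Keys', currentSeconds: 514.53, withChangesSeconds: 249.24 },
  { name: 'Types, Models, and Relationships', currentSeconds: 282.97, withChangesSeconds: 200.23 },
];

// Absolute (seconds) and relative (percent) savings for one benchmark.
function savings(b: Benchmark): { seconds: number; percent: number } {
  const seconds = b.currentSeconds - b.withChangesSeconds;
  return { seconds, percent: (seconds / b.currentSeconds) * 100 };
}

for (const b of benchmarks) {
  const s = savings(b);
  console.log(`${b.name}: ${s.seconds.toFixed(2)}s saved (${s.percent.toFixed(1)}%)`);
}
```

The savings work out to roughly 30%, 52%, and 29% respectively, so the schema with the most models and relationships appears to benefit the most.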

Changes

1. Replace usage of AWS CDK's custom resource provider framework by Custom::AmplifyDynamoDBTable with a minimal provider within amplify-graphql-model-transformer

Extension / replacement of (see for context):

Current Architecture

Currently, our usage of the CDK's CustomResources.Provider construct creates more resources than necessary, which slows down initial deployments. For the Custom::AmplifyDynamoDBTable custom resource, we deploy the following resources, which take approximately 2 minutes on the initial deploy.

  • 7 AWS::IAM::Role
  • 6 AWS::IAM::Policy
  • 6 AWS::Lambda::Function
  • 1 AWS::StepFunctions::StateMachine

[Diagram: custom_resource_provider_current]

New Architecture

With this change, we're reducing that custom resource supporting infrastructure to the minimum needed. This reduces the initial deploy time of the custom resource provider resources by approximately 60 seconds:

  • 3 AWS::IAM::Role
  • 3 AWS::IAM::Policy
  • 2 AWS::Lambda::Function
  • 1 AWS::StepFunctions::StateMachine

[Diagram: custom_resource_provider_new]
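
As a quick tally of the two resource lists above, the following sketch totals each architecture and computes the per-type reduction. The `ResourceCounts` type and the `total` and `removed` helpers are hypothetical names used only for this illustration; the counts are copied from the lists.

```typescript
// Resource counts copied from the "Current Architecture" and "New Architecture" lists.
type ResourceCounts = Record<string, number>;

const currentArchitecture: ResourceCounts = {
  'AWS::IAM::Role': 7,
  'AWS::IAM::Policy': 6,
  'AWS::Lambda::Function': 6,
  'AWS::StepFunctions::StateMachine': 1,
};

const newArchitecture: ResourceCounts = {
  'AWS::IAM::Role': 3,
  'AWS::IAM::Policy': 3,
  'AWS::Lambda::Function': 2,
  'AWS::StepFunctions::StateMachine': 1,
};

// Total number of supporting resources in one architecture.
function total(counts: ResourceCounts): number {
  return Object.keys(counts).reduce((sum, type) => sum + counts[type], 0);
}

// Number of resources of each type that the new architecture no longer deploys.
function removed(before: ResourceCounts, after: ResourceCounts): ResourceCounts {
  const diff: ResourceCounts = {};
  for (const type of Object.keys(before)) {
    diff[type] = before[type] - (after[type] ?? 0);
  }
  return diff;
}

console.log(`current: ${total(currentArchitecture)}, new: ${total(newArchitecture)}`);
console.log(removed(currentArchitecture, newArchitecture));
```

By this count the supporting infrastructure drops from 20 resources to 9, with most of the reduction coming from IAM roles and Lambda functions.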

2. Prevent default policy generation for IAM roles generated per model for DynamoDB table access by AppSync

The IAM role created for AppSync to access DynamoDB tables currently has two policies attached to it.

  1. The inline policy DynamoDBAccess we define when creating the role.
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Action": [
                "dynamodb:BatchGetItem",
                "dynamodb:BatchWriteItem",
                "dynamodb:PutItem",
                "dynamodb:DeleteItem",
                "dynamodb:GetItem",
                "dynamodb:Scan",
                "dynamodb:Query",
                "dynamodb:UpdateItem"
            ],
            "Resource": [
                "arn:aws:dynamodb:${AWS::Region}:${AWS::AccountId}:table/${tablename}",
                "arn:aws:dynamodb:${AWS::Region}:${AWS::AccountId}:table/${tablename}/*"
            ],
            "Effect": "Allow"
        }
    ]
}
  2. The default policy generated by the CDK
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Action": [
                "dynamodb:BatchGetItem",
                "dynamodb:GetRecords",
                "dynamodb:GetShardIterator",
                "dynamodb:Query",
                "dynamodb:GetItem",
                "dynamodb:Scan",
                "dynamodb:ConditionCheckItem",
                "dynamodb:BatchWriteItem",
                "dynamodb:PutItem",
                "dynamodb:UpdateItem",
                "dynamodb:DeleteItem",
                "dynamodb:DescribeTable"
            ],
            "Resource": [
                "arn:aws:dynamodb:${AWS::Region}:${AWS::AccountId}:table/${tablename}",
            ],
            "Effect": "Allow"
        }
    ]
}

The second policy is unnecessary and is generated as a separate resource with its own Logical ID in the CFN stack. This adds roughly 15-20 seconds of deployment time per model. Because resources are generated in parallel, removing it doesn't save 15 * models.count seconds, but it does save approximately 25 seconds for the initial deployment of a typical schema with 15-20 models.

This change consolidates those two policies into a single inline policy, and prevents the CDK from generating a default policy via the withoutPolicyUpdates() method.
See Opting out of automatic permissions management for more information.

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Action": [
                "dynamodb:BatchGetItem",
                "dynamodb:GetRecords",
                "dynamodb:GetShardIterator",
                "dynamodb:Query",
                "dynamodb:GetItem",
                "dynamodb:Scan",
                "dynamodb:ConditionCheckItem",
                "dynamodb:BatchWriteItem",
                "dynamodb:PutItem",
                "dynamodb:UpdateItem",
                "dynamodb:DeleteItem",
                "dynamodb:DescribeTable"
            ],
            "Resource": [
                "arn:aws:dynamodb:${AWS::Region}:${AWS::AccountId}:table/${tablename}",
                "arn:aws:dynamodb:${AWS::Region}:${AWS::AccountId}:table/${tablename}/*"
            ],
            "Effect": "Allow"
        }
    ]
}
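
To illustrate why the consolidated statement above covers both of the original policies, here is a minimal sketch that takes the union of their actions and resources. It is illustrative only: the `PolicyStatement` type and the `unique` and `consolidate` helpers are hypothetical and do not appear in this PR, which builds the policy through CDK constructs rather than a helper like this.

```typescript
// Shape of a single IAM policy statement, reduced to the fields used above.
interface PolicyStatement {
  Effect: 'Allow';
  Action: string[];
  Resource: string[];
}

const tableArn = 'arn:aws:dynamodb:${AWS::Region}:${AWS::AccountId}:table/${tablename}';

// Policy 1: the inline DynamoDBAccess policy defined when creating the role.
const inlineDynamoDBAccess: PolicyStatement = {
  Effect: 'Allow',
  Action: [
    'dynamodb:BatchGetItem', 'dynamodb:BatchWriteItem', 'dynamodb:PutItem', 'dynamodb:DeleteItem',
    'dynamodb:GetItem', 'dynamodb:Scan', 'dynamodb:Query', 'dynamodb:UpdateItem',
  ],
  Resource: [tableArn, `${tableArn}/*`],
};

// Policy 2: the default policy generated by the CDK.
const cdkDefaultPolicy: PolicyStatement = {
  Effect: 'Allow',
  Action: [
    'dynamodb:BatchGetItem', 'dynamodb:GetRecords', 'dynamodb:GetShardIterator', 'dynamodb:Query',
    'dynamodb:GetItem', 'dynamodb:Scan', 'dynamodb:ConditionCheckItem', 'dynamodb:BatchWriteItem',
    'dynamodb:PutItem', 'dynamodb:UpdateItem', 'dynamodb:DeleteItem', 'dynamodb:DescribeTable',
  ],
  Resource: [tableArn],
};

// Removes duplicates while keeping first-seen order.
function unique(values: string[]): string[] {
  return Array.from(new Set(values));
}

// Union of actions (CDK order first) and resources (inline order first).
function consolidate(inline: PolicyStatement, generated: PolicyStatement): PolicyStatement {
  return {
    Effect: 'Allow',
    Action: unique([...generated.Action, ...inline.Action]),
    Resource: unique([...inline.Resource, ...generated.Resource]),
  };
}

const consolidated = consolidate(inlineDynamoDBAccess, cdkDefaultPolicy);
console.log(JSON.stringify(consolidated, null, 4));
```

Every action in the inline policy already appears in the CDK-generated one, so the union has the same 12 actions as the CDK default, while the resource list keeps the `/*` ARN that only the inline policy carried.
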
CDK / CloudFormation Parameters Changed

Issue #, if available

N/A

Description of how you validated changes

Checklist

  • PR description included
  • yarn test passes
  • Tests are changed or added
  • Relevant documentation is changed or added (and PR referenced)
  • New AWS SDK calls or CloudFormation actions have been added to relevant test and service IAM policies
  • Any CDK or CloudFormation parameter changes are called out explicitly

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

@atierian atierian force-pushed the amplify-table-go-brr-no-breaky branch from c5c58bf to b309ca2 Compare April 24, 2024 18:46
Comment on lines +1 to +2
// The contents of this file were taken from the AWS CDK provider framework.
// https://github.com/aws/aws-cdk/blob/c52ff08/packages/aws-cdk-lib/custom-resources/lib/provider-framework/runtime/outbound.ts
Member Author

Modifications:

  • Removed defaultInvokeFunction / invokeFunction. These are used by the CDK's provider framework to invoke the Lambda functions we defined. The new design doesn't involve Lambda functions invoking other Lambda functions.

Comment on lines +1 to +2
// The contents of this file were adapted from the AWS CDK provider framework.
// https://github.com/aws/aws-cdk/blob/c52ff08cfd1515d35feb93bcba34a3231a94985c/packages/aws-cdk-lib/custom-resources/lib/provider-framework/waiter-state-machine.ts
Member Author

Modifications:

-        'framework-onTimeout-task': {
-          End: true,
-          Type: 'Task',
-          Resource: props.timeoutHandler.functionArn,
-        },

We're relying on CloudFormation's default (and unchangeable) custom resource timeout of 1 hour, so we no longer need a timeout function.
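
For readers unfamiliar with the waiter state machine, the following is a rough sketch of what a definition with a single polling task and no timeout task could look like. It is an assumption-laden illustration, not the code from this PR: `WaiterOptions`, `buildWaiterDefinition`, and the state name `framework-isComplete-task` (chosen to mirror the naming of the removed `framework-onTimeout-task`) are all hypothetical, and dropping the `Catch` clause along with the timeout task is likewise assumed.

```typescript
// Hypothetical inputs for this sketch; none of these names come from the PR.
interface WaiterOptions {
  isCompleteHandlerArn: string; // ARN of the Lambda that reports whether the resource is ready
  intervalSeconds: number;      // delay before the first retry
  maxAttempts: number;          // retry budget for the polling task
  backoffRate: number;          // multiplier applied to the delay on each retry
}

// Builds an Amazon States Language definition with a single polling task.
function buildWaiterDefinition(options: WaiterOptions): string {
  const definition = {
    StartAt: 'framework-isComplete-task',
    States: {
      'framework-isComplete-task': {
        Type: 'Task',
        Resource: options.isCompleteHandlerArn,
        End: true,
        // Assumption: the handler fails while the resource is still in progress,
        // so each retry acts as one poll.
        Retry: [
          {
            ErrorEquals: ['States.ALL'],
            IntervalSeconds: options.intervalSeconds,
            MaxAttempts: options.maxAttempts,
            BackoffRate: options.backoffRate,
          },
        ],
        // Assumption: no Catch clause routes to a timeout task here. If retries
        // are exhausted, the execution simply fails, and CloudFormation's own
        // custom resource timeout acts as the backstop.
      },
    },
  };
  return JSON.stringify(definition);
}

// Example values are placeholders, not the values used by the transformer.
const example = buildWaiterDefinition({
  isCompleteHandlerArn: 'arn:aws:lambda:us-east-1:123456789012:function:is-complete',
  intervalSeconds: 5,
  maxAttempts: 10,
  backoffRate: 1,
});
console.log(example);
```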

Member

Would it simplify our future maintenance (by making it easier to compare changes from the CDK) to leave this unchanged, and simply add a noop timeout handler and default timeout value high enough to never be invoked?

Member Author

Agreed that it would likely help with maintainability, but that comes at the cost of the "fake" timeout value and timeout task state (framework-onTimeout-task) being potentially misinterpreted by future readers as "real."

I don't have a strong opinion either way. If you think it's worthwhile to add back, happy to do it.

AaronZyLee previously approved these changes Apr 26, 2024
@atierian atierian changed the base branch from feature/gen2-release to main April 26, 2024 17:20
@atierian atierian dismissed AaronZyLee’s stale review April 26, 2024 17:20

The base branch was changed.

AaronZyLee previously approved these changes Apr 26, 2024
phani-srikar previously approved these changes Apr 26, 2024
@atierian atierian dismissed stale reviews from phani-srikar and AaronZyLee via 2c4d2a3 April 26, 2024 19:53
@atierian atierian merged commit a86c816 into main Apr 27, 2024
6 checks passed
@atierian atierian deleted the amplify-table-go-brr-no-breaky branch April 27, 2024 00:40