Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

validateVariantOptionIds is memory inefficient and can easily cause the server to crash #328

Closed
Tyratox opened this issue May 3, 2020 · 1 comment
Assignees
Labels
type: bug 🐛 Something isn't working

Comments

@Tyratox
Copy link
Contributor

Tyratox commented May 3, 2020

Describe the bug
The function

private async validateVariantOptionIds(ctx: RequestContext, input: CreateProductVariantInput) {
const product = await getEntityOrThrow(this.connection, Product, input.productId, ctx.channelId, {
relations: ['optionGroups', 'optionGroups.options', 'variants', 'variants.options'],
});

is memory inefficient.
The function
export async function getEntityOrThrow<T extends VendureEntity>(
connection: Connection,
entityType: Type<T>,
id: ID,
findOptionsOrChannelId?: FindOneOptions<T> | ID,
maybeFindOptions?: FindOneOptions<T>,
): Promise<T> {
let entity: T | undefined;
if (isId(findOptionsOrChannelId)) {
entity = await findOneInChannel(connection, entityType, id, findOptionsOrChannelId, maybeFindOptions);
} else {
entity = await connection.getRepository(entityType).findOne(id, findOptionsOrChannelId);
}
if (!entity || (entity.hasOwnProperty('deletedAt') && (entity as T & SoftDeletable).deletedAt !== null)) {
throw new EntityNotFoundError(entityType.name as any, id);
}
return entity;
}

is called which itself calls findOneInChannel on line 37

export function findOneInChannel<T extends ChannelAware | VendureEntity>(
connection: Connection,
entity: Type<T>,
id: ID,
channelId: ID,
findOptions?: FindManyOptions<T>,
) {
const qb = connection.getRepository(entity).createQueryBuilder('product');
FindOptionsUtils.applyFindManyOptionsOrConditionsToQueryBuilder(qb, findOptions);
// tslint:disable-next-line:no-non-null-assertion
FindOptionsUtils.joinEagerRelations(qb, qb.alias, qb.expressionMap.mainAlias!.metadata);
return qb
.leftJoin('product.channels', 'channel')
.andWhere('product.id = :id', { id })
.andWhere('channel.id = :channelId', { channelId })
.getOne();
}

All functions do work in theory but the generated sql query looks like the following:

SELECT `product`.`createdAt` AS `product_createdAt`, `product`.`updatedAt` AS `product_updatedAt`, `product`.`deletedAt` AS `product_deletedAt`, `product`.`enabled` AS `product_enabled`, `product`.`id` AS `product_id`, `product`.`featuredAssetId` AS `product_featuredAssetId`, `product`.`customFieldsProductrecommendationsenabled` AS `product_customFieldsProductrecommendationsenabled`, `product`.`customFieldsGroupkey` AS `product_customFieldsGroupkey`, `product__optionGroups`.`createdAt` AS `product__optionGroups_createdAt`, `product__optionGroups`.`updatedAt` AS `product__optionGroups_updatedAt`, `product__optionGroups`.`code` AS `product__optionGroups_code`, `product__optionGroups`.`id` AS `product__optionGroups_id`, `product__optionGroups`.`productId` AS `product__optionGroups_productId`, `product__optionGroups_translations`.`createdAt` AS `product__optionGroups_translations_createdAt`, `product__optionGroups_translations`.`updatedAt` AS `product__optionGroups_translations_updatedAt`, `product__optionGroups_translations`.`languageCode` AS `product__optionGroups_translations_languageCode`, `product__optionGroups_translations`.`name` AS `product__optionGroups_translations_name`, `product__optionGroups_translations`.`id` AS `product__optionGroups_translations_id`, `product__optionGroups_translations`.`baseId` AS `product__optionGroups_translations_baseId`, `product__optionGroups__options`.`createdAt` AS `product__optionGroups__options_createdAt`, `product__optionGroups__options`.`updatedAt` AS `product__optionGroups__options_updatedAt`, `product__optionGroups__options`.`code` AS `product__optionGroups__options_code`, `product__optionGroups__options`.`id` AS `product__optionGroups__options_id`, `product__optionGroups__options`.`groupId` AS `product__optionGroups__options_groupId`, `product__optionGroups__options_translations`.`createdAt` AS `product__optionGroups__options_translations_createdAt`, `product__optionGroups__options_translations`.`updatedAt` AS `product__optionGroups__options_translations_updatedAt`, `product__optionGroups__options_translations`.`languageCode` AS `product__optionGroups__options_translations_languageCode`, `product__optionGroups__options_translations`.`name` AS `product__optionGroups__options_translations_name`, `product__optionGroups__options_translations`.`id` AS `product__optionGroups__options_translations_id`, `product__optionGroups__options_translations`.`baseId` AS `product__optionGroups__options_translations_baseId`, `product__variants`.`createdAt` AS `product__variants_createdAt`, `product__variants`.`updatedAt` AS `product__variants_updatedAt`, `product__variants`.`deletedAt` AS `product__variants_deletedAt`, `product__variants`.`enabled` AS `product__variants_enabled`, `product__variants`.`sku` AS `product__variants_sku`, `product__variants`.`lastPriceValue` AS `product__variants_lastPriceValue`, `product__variants`.`productId` AS `product__variants_productId`, `product__variants`.`stockOnHand` AS `product__variants_stockOnHand`, `product__variants`.`trackInventory` AS `product__variants_trackInventory`, `product__variants`.`id` AS `product__variants_id`, `product__variants`.`featuredAssetId` AS `product__variants_featuredAssetId`, `product__variants`.`taxCategoryId` AS `product__variants_taxCategoryId`, `product__variants`.`customFieldsBulkdiscountenabled` AS `product__variants_customFieldsBulkdiscountenabled`, `product__variants`.`customFieldsMinimumorderquantity` AS `product__variants_customFieldsMinimumorderquantity`, `product__variants_productVariantPrices`.`createdAt` AS `product__variants_productVariantPrices_createdAt`, `product__variants_productVariantPrices`.`updatedAt` AS `product__variants_productVariantPrices_updatedAt`, `product__variants_productVariantPrices`.`price` AS `product__variants_productVariantPrices_price`, `product__variants_productVariantPrices`.`id` AS `product__variants_productVariantPrices_id`, `product__variants_productVariantPrices`.`channelId` AS `product__variants_productVariantPrices_channelId`, `product__variants_productVariantPrices`.`variantId` AS `product__variants_productVariantPrices_variantId`, `product__variants_translations`.`createdAt` AS `product__variants_translations_createdAt`, `product__variants_translations`.`updatedAt` AS `product__variants_translations_updatedAt`, `product__variants_translations`.`languageCode` AS `product__variants_translations_languageCode`, `product__variants_translations`.`name` AS `product__variants_translations_name`, `product__variants_translations`.`id` AS `product__variants_translations_id`, `product__variants_translations`.`baseId` AS `product__variants_translations_baseId`, `product__variants__options`.`createdAt` AS `product__variants__options_createdAt`, `product__variants__options`.`updatedAt` AS `product__variants__options_updatedAt`, `product__variants__options`.`code` AS `product__variants__options_code`, `product__variants__options`.`id` AS `product__variants__options_id`, `product__variants__options`.`groupId` AS `product__variants__options_groupId`, `product__variants__options_translations`.`createdAt` AS `product__variants__options_translations_createdAt`, `product__variants__options_translations`.`updatedAt` AS `product__variants__options_translations_updatedAt`, `product__variants__options_translations`.`languageCode` AS `product__variants__options_translations_languageCode`, `product__variants__options_translations`.`name` AS `product__variants__options_translations_name`, `product__variants__options_translations`.`id` AS `product__variants__options_translations_id`, `product__variants__options_translations`.`baseId` AS `product__variants__options_translations_baseId`, `product_translations`.`createdAt` AS `product_translations_createdAt`, `product_translations`.`updatedAt` AS `product_translations_updatedAt`, `product_translations`.`languageCode` AS `product_translations_languageCode`, `product_translations`.`name` AS `product_translations_name`, `product_translations`.`slug` AS `product_translations_slug`, `product_translations`.`description` AS `product_translations_description`, `product_translations`.`id` AS `product_translations_id`, `product_translations`.`baseId` AS `product_translations_baseId`
FROM `product` `product`
LEFT JOIN `product_option_group` `product__optionGroups`
    ON `product__optionGroups`.`productId`=`product`.`id`
LEFT JOIN `product_option_group_translation` `product__optionGroups_translations`
    ON `product__optionGroups_translations`.`baseId`=`product__optionGroups`.`id` 
LEFT JOIN `product_option` `product__optionGroups__options`
    ON `product__optionGroups__options`.`groupId`=`product__optionGroups`.`id` 
LEFT JOIN `product_option_translation` `product__optionGroups__options_translations`
    ON `product__optionGroups__options_translations`.`baseId`=`product__optionGroups__options`.`id`  
LEFT JOIN `product_variant` `product__variants`
    ON `product__variants`.`productId`=`product`.`id` 
LEFT JOIN `product_variant_price` `product__variants_productVariantPrices`
    ON `product__variants_productVariantPrices`.`variantId`=`product__variants`.`id`
LEFT JOIN `product_variant_translation` `product__variants_translations`
    ON `product__variants_translations`.`baseId`=`product__variants`.`id` 
LEFT JOIN `product_variant_options_product_option` `product__variants_product__variants__options`
    ON `product__variants_product__variants__options`.`productVariantId`=`product__variants`.`id`
LEFT JOIN `product_option` `product__variants__options`
    ON `product__variants__options`.`id`=`product__variants_product__variants__options`.`productOptionId`
LEFT JOIN `product_option_translation` `product__variants__options_translations`
    ON `product__variants__options_translations`.`baseId`=`product__variants__options`.`id`  
LEFT JOIN `product_translation` `product_translations`
    ON `product_translations`.`baseId`=`product`.`id` 
LEFT JOIN `product_channels_channel` `product_channel`
    ON `product_channel`.`productId`=`product`.`id`
LEFT JOIN `channel` `channel`
    ON `channel`.`id`=`product_channel`.`channelId`
WHERE `product`.`id` = ?
AND `channel`.`id` = ?

For a product with 10 option groups with a few options each where all options are translated into two languages this query already yields 133'392 rows with 75 columns each which the database can handle without any issues but the node server simply crashes. (Yes I have added some custom fields but they only increase the columns, not the rows)

[run:server] <--- Last few GCs --->
[run:server] 
[run:server] [10341:0x103b10000]    36046 ms: Scavenge 1392.2 (1423.2) -> 1391.3 (1423.7) MB, 31.8 / 0.0 ms  (average mu = 0.147, current mu = 0.061) allocation failure 
[run:server] [10341:0x103b10000]    36076 ms: Scavenge 1392.2 (1423.7) -> 1391.3 (1424.2) MB, 6.1 / 0.0 ms  (average mu = 0.147, current mu = 0.061) allocation failure 
[run:server] [10341:0x103b10000]    36088 ms: Scavenge 1392.3 (1424.2) -> 1391.3 (1424.7) MB, 3.9 / 0.0 ms  (average mu = 0.147, current mu = 0.061) allocation failure 
[run:server] 
[run:server] 
[run:server] <--- JS stacktrace --->
[run:server] 
[run:server] ==== JS stack trace =========================================
[run:server] 
[run:server]     0: ExitFrame [pc: 0x236e291dbe3d]
[run:server]     1: StubFrame [pc: 0x236e291934b0]
[run:server] Security context: 0x22e59a91e6c1 <JSObject>
[run:server]     2: __read [0x22e52a92e269] [/Users/.../node_modules/tslib/tslib.js:~161] [pc=0x236e295cbf79](this=0x22e502982241 <Object map = 0x22e507893921>,o=0x22e5ea4058b9 <JSArray[2]>,n=2)
[run:server]     3: get [0x22e5f2e41ee9] [/Users/.../node/...
[run:server] 
[run:server] FATAL ERROR: Ineffective mark-compacts near heap limit Allocation failed - JavaScript heap out of memory
[run:server]  1: 0x10096d2f0 node::Abort() (.cold.1) [/usr/local/Cellar/node@10/10.20.0/bin/node]
[run:server]  2: 0x10003b24f node_module_register [/usr/local/Cellar/node@10/10.20.0/bin/node]
[run:server]  3: 0x10003b410 node::FatalTryCatch::~FatalTryCatch() [/usr/local/Cellar/node@10/10.20.0/bin/node]
[run:server]  4: 0x1001798ff v8::Utils::ReportOOMFailure(v8::internal::Isolate*, char const*, bool) [/usr/local/Cellar/node@10/10.20.0/bin/node]
[run:server]  5: 0x1001798a1 v8::internal::V8::FatalProcessOutOfMemory(v8::internal::Isolate*, char const*, bool) [/usr/local/Cellar/node@10/10.20.0/bin/node]
[run:server]  6: 0x10042bbe6 v8::internal::Heap::UpdateSurvivalStatistics(int) [/usr/local/Cellar/node@10/10.20.0/bin/node]
[run:server]  7: 0x10042d83d v8::internal::Heap::CheckIneffectiveMarkCompact(unsigned long, double) [/usr/local/Cellar/node@10/10.20.0/bin/node]
[run:server]  8: 0x10042aedc v8::internal::Heap::PerformGarbageCollection(v8::internal::GarbageCollector, v8::GCCallbackFlags) [/usr/local/Cellar/node@10/10.20.0/bin/node]
[run:server]  9: 0x100429d16 v8::internal::Heap::CollectGarbage(v8::internal::AllocationSpace, v8::internal::GarbageCollectionReason, v8::GCCallbackFlags) [/usr/local/Cellar/node@10/10.20.0/bin/node]
[run:server] 10: 0x10043239c v8::internal::Heap::AllocateRawWithLigthRetry(int, v8::internal::AllocationSpace, v8::internal::AllocationAlignment) [/usr/local/Cellar/node@10/10.20.0/bin/node]
[run:server] 11: 0x1004323e8 v8::internal::Heap::AllocateRawWithRetryOrFail(int, v8::internal::AllocationSpace, v8::internal::AllocationAlignment) [/usr/local/Cellar/node@10/10.20.0/bin/node]
[run:server] 12: 0x100412d07 v8::internal::Factory::NewFillerObject(int, bool, v8::internal::AllocationSpace) [/usr/local/Cellar/node@10/10.20.0/bin/node]
[run:server] 13: 0x1005ed150 v8::internal::Runtime_AllocateInNewSpace(int, v8::internal::Object**, v8::internal::Isolate*) [/usr/local/Cellar/node@10/10.20.0/bin/node]
[run:server] 14: 0x236e291dbe3d 
[run:server] yarn run run:server exited with code SIGINT

To Reproduce
Steps to reproduce the behavior:

  1. Create a new product with about 10 group options, option values and translations. You can also use my data for testing. vendure_issue.zip

  2. Try to create a new product variant using the graphql API, i.e. the playground with for example
    mutation CreateProductVariants($input: [CreateProductVariantInput!]!){ createProductVariants(input: $input){ id sku } }
    Query Variables
    {"input": [ { "productId": "56", "translations": [], "sku": "test", "price": 5 } ]}

  3. See error in the vendure console

Expected behavior
No crash, obviously. This means the validation algorithm has to be made more memory efficient.
What conditions have to be checked, i.e. what are the specifications of the function? The I could try to help with the implementation.
If the validation can't be made more efficient without relaxing the specifications it might be possible to do the whole thing in SQL s.t. node doesn't have to load this much data into memory.

Environment (please complete the following information):

  • @vendure/core version: 0.11.1
  • Nodejs version: v10.20.0
  • Database (mysql/postgres etc): mysql
@Tyratox Tyratox added the type: bug 🐛 Something isn't working label May 3, 2020
@michaelbromley
Copy link
Member

Hi, thanks for this exemplary issue - great delineation of the problem.

So, the validateVariantOptionIds function ensures that the optionIds array passed in the input to CreateProductVariantInput:

  1. has a length equal to the number of OptionGroups assigned to that Product
  2. one option ID from each OptionGroup is specified
  3. that no existing ProductVariant of that Product already has that exact combination of optionIds.

I think a simple way to solve the memory issue would be to break up the mega-join into multiple queries. E.g.:

  1. first join only the optionGroups, check condition 1 above
  2. If OK, then get the options for each group and check condition 2
  3. If OK, then get the productVariants & their options and check condition 3.

Since this is a more rarely-used and admin-only code path, the marginal decrease in speed caused by multiple queries would be totally acceptable.

If you'd like to work on this, a PR would be very welcome. If you can think of a better way to do it (the suggestion above was just an off-the-top-of-my-head idea) feel free to do it that way.

michaelbromley pushed a commit that referenced this issue May 11, 2020
These changes reduce the amount of data loaded into memory which hopefully prevents some server crashes. #328.

Also added the eager option for allowing performance gains in certain queries.
@Tyratox Tyratox closed this as completed May 11, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
type: bug 🐛 Something isn't working
Projects
None yet
Development

No branches or pull requests

2 participants