-
-
Notifications
You must be signed in to change notification settings - Fork 1.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Cache strategy for channels in ChannelService #988
Comments
Yes this needs to be solved. Let's consider the approaches:
I personally like the simplicity of option 2, but in some circumstances I can imagine even a short amount of inconsistency might not be acceptable. Investigating further, it can be seen that a similar in-memory caching technique is used in PromotionService.activePromotions, and ZoneService.updateZonesCache() which has the exact same issues as with the ChannelService. Therefore it might make more sense to define a generalized caching strategy, which would encompass Sessions, Channels, Promotions, Zones and anything else needed in the future. This could be a single config field, which would deprecate and succeed the SessionCacheStrategy. I'm not exactly sure of what the API of a general cache strategy would look like, but to me that's quite an elegant solution. Thoughts? |
Hi Michael, sorry to get back to you on this so late. |
Hi Michael. If at all possible I would like to bump the priority on this issue as it is really causing us trouble in our serverless setup. Multiple instances of Vendure exist in parallel, but they need to be restarted every time a Channel is added to or removed from the system. |
Yep, let's get it in the next minor 👍 |
Also handle this line as part of the solution: vendure/packages/core/src/service/services/product-variant.service.ts Lines 273 to 274 in 6f7cc34
This is making a db query for every product variant where the stock level is selected. |
I've been doing some exploratory work on this issue and here's a problem with the configurable cache strategy option: Several of the internal caches are storing deeply-nested class instances. For example, A generic caching mechanism would need to be able to serialize this in order to store it in eg a database or Redis. The problem comes when we deserialize these data structures, the resulting objects will be POJOs and no longer class instances. Thus any downstream code which either relies on an One solution to this would be to build-in some logic which allows us to reconstruct object instances based on the deserialized data. This rehydration process will however add time to the I'm going to do a bit more exploration here and it might even be that with more use of the |
Hi Michael, this is a very valid point you are making here. We have been using A cache with a (configurable) TTL could make sense as well, although this might give very unexpected results in multi-server environment. Multiple parallel requests by a client are not necessarily processed by the same server instance. As a consequence, multiple requests might give different results. |
Yes I looked at class-transformer, which would solve the serialization issue but the computation time is the unknown, plus I'm not keen on adding another layer of annotations onto the TypeORM entities. I'm doing some benchmarking as I explore this, and will posts update this thread with results. To summarize, here is a list of all the points where data is currently cached in-memory:
|
You might have overlooked |
@hendrik-advantitge thanks, that's right I missed that one. So here is the direction I am going in so far: I have decided to go with the second option mentioned in the above comment, namely keep caching in memory, but add a TTL to expire the cache, mainly due to the issues mentioned above with deserializing class instances. I have created a function which allows the affected services to still work in the same way as presently, but adds a TTL to the caching. I found that it was important to keep the cache.get operation synchronous. I first of all attempted to make them async, but the infectious nature of async functions quickly started requiring a lot more changes than I wanted in the call stacks depending on these cached values. So I am working on a design which will allow the cache.get operation to always be sync, while still triggering refreshes of the value async in the background: /**
* @description
* A cache which always returns a value, and automatically refreshes itself if the value is found to be stale.
*/
export interface SelfRefreshingCache<V> {
/**
* @description
* The current value of the cache. If the value is stale, the stale value is still returned,
* but then will be asynchronously refreshed in the background.
*/
value: V;
/**
* @description
* Force a refresh of the value, e.g. when it is known that the value has changed such as after
* an update operation to the source data in the database.
*/
refresh(): Promise<void>;
}
export interface SelfRefreshingCacheConfig<V> {
name: string;
ttl: number;
refreshFn: () => Promise<V>;
}
/**
* @description
* Creates a {@link SelfRefreshingCache} object, which is used to cache a single frequently-accessed value. In this type
* of cache, the function used to populate the value (`refreshFn`) is defined during the creation of the cache, and
* it is immediately used to populate the initial value.
*
* From there, when the `.value` property is accessed, it will _always_ return a value synchronously. If the
* value has expired, it will still be returned and the `refreshFn` will be triggered to update the value in the
* background.
*/
export async function createSelfRefreshingCache<V>(
config: SelfRefreshingCacheConfig<V>,
): Promise<SelfRefreshingCache<V>> {
const { ttl, name, refreshFn } = config;
const initialValue = await refreshFn();
let value = initialValue;
let expires = new Date().getTime() + ttl;
let refreshing = false;
const refreshValue = () => {
if (refreshing) {
return Promise.resolve();
}
refreshing = true;
Logger.debug(`Refreshing the SelfRefreshingCache "${name}"`);
return refreshFn()
.then(newValue => {
value = newValue;
expires = new Date().getTime() + ttl;
})
.catch(err => {
Logger.error(
`Failed to update SelfRefreshingCache "${name}": ${err.message}`,
undefined,
err.stack,
);
})
.finally(() => (refreshing = false));
};
return {
get value() {
const now = new Date().getTime();
if (expires < now) {
refreshValue();
}
return value;
},
refresh: refreshValue,
};
} In the example of the ChannelService, it would be used like this: @Injectable()
export class ChannelService {
private allChannels: SelfRefreshingCache<Channel[]>;
/**
* When the app is bootstrapped, ensure a default Channel exists and populate the
* channel lookup array.
*/
async initChannels() {
await this.ensureDefaultChannelExists();
this.allChannels = await createSelfRefreshingCache({
name: 'ChannelService.allChannels',
ttl: 10000,
refreshFn: () => this.findAll(RequestContext.empty()),
});
}
/**
* Returns the default Channel.
*/
getDefaultChannel(): Channel {
const defaultChannel = this.allChannels.value.find(channel => channel.code === DEFAULT_CHANNEL_CODE);
if (!defaultChannel) {
throw new InternalServerError(`error.default-channel-not-found`);
}
return defaultChannel;
}
async update(
ctx: RequestContext,
input: UpdateChannelInput,
): Promise<ErrorResultUnion<UpdateChannelResult, Channel>> {
// omitted the usual update logic
// here we refresh the cached Channels array
await this.allChannels.refresh();
return assertFound(this.findOne(ctx, channel.id));
} In that example, the TTL is set to 10 seconds. This kind of timeframe seems reasonable, but we could also add some new config options to VendureConfig to make this configurable. I'd rather not have to do that, and if possible just find a value for each use-case which strikes a practical balance between avoiding staleness and avoiding hundreds of unneccesarry DB calls. |
Hi Michael TTL avoids all this mess and keeps things much simpler. The idea of returning a value even when it is stale seems unintuitive at first sight. I totally understand the reasoning though and doubt it will give any issues in practice. But in theory, with a TTL of 10 seconds, you could still retrieve a value that is hours old if the value has not been queried in that time. It not being an issue only holds if we stick to the entities that are currently cached. If we decide to keep What are your thoughts? |
Does it happen that with a multi-instance setup you could have once instance that does not get traffic for such a long time? I would have thought that e.g. with serverless the instance might be cleaned up, and with a long-running instance with load balancer in front, each instance would get a similar frequency of requests. But I have not practical experience here so I'm interested to know if you ran into something like this so far. |
In a serverless environment, unused instances indeed do get cleaned up after some time under normal circumstances. However Google, and possibly other providers as well, have introduced features that can keep a number instances 'warm'. In this scenario, the old cache is used when spinning up the instance. Another scenario, common for us, is that the API instance is kept running by an importer service that is updating products. This is a long running process, that will not trigger updates of the shippingMethod cache for example. |
OK, that makes sense. I took another go at making it async and I think it is doable. Rather get it correct than hope that edge cases won't appear - they always do! |
100% agree on this! |
Yes, I have a bias against adding config if at all possible. Don't want to turn VendureConfig into webpack! I'm not dogmatic about it, it is just my default stance until convinced otherwise. |
I think there is nothing wrong with a lot of config options, as long as there are good default values. It gives options to the advanced users without getting in the way of new users. |
Relates to #988. This change resolves stale data issues when running multiple Vendure instances in parallel, e.g. as serverless functions.
Relates to #988. Zone members (countries) were being newly translated for _every_ call to ZoneService.findAll(). In practice this was resulting in thousands of calls to `translateDeep`. This optimization cuts the calls down by a factor of 5-10x in various scenarios.
Hey!
I built a multi-tenant solution for vendure using row level security (RLS) in postgres, so the design is to have on each vendure entity 'tenantId' property, which identifies a tenant. So in my environment ChannelService caches wrong, because on refresh, it executes findAll() which will find all channels from current request's tenant. So if ChannelService will receive multiple requests from a few tenants, cache can return channels from the last request instead on your own. I understand that vendure recommends to use channels as tenants, but here my thoughts:
Currently there is no way to control caching behavior. I think 3rd option "Configurable cache strategy" would be better for custom deployments. I think I can contribute this feature. @michaelbromley What do you think? |
Hi @ashitikov, thank you for your input on this.
|
Is your feature request related to a problem? Please describe.
The ChannelService as-is holds an array of all channels in memory, which is updated whenever a channel is created/updated. This makes the vendure instance stateful and can lead to invalid/out-of-sync data if a project uses multiple vendure instances.
Describe the solution you'd like
Querying from the database on every call would solve this, but since the data is used on every call (specifically
getChannelFromToken
) this would become a bottleneck.Querying the session data presented a similar bottleneck, so I think we need a similar solution.
Similar to the
SessionCacheStrategy
config, I'd propose aChannelCacheStrategy
The text was updated successfully, but these errors were encountered: