Map structure #4

Open
js-choi opened this issue Mar 30, 2022 · 13 comments
Labels
question Further information is requested

Comments

@js-choi
Collaborator

js-choi commented Mar 30, 2022

If we go with a Map-like cache (see policy Maps), how should we structure the cache? For example, we could use a tree of Maps, or we could use argument-tuples as keys in one Map.

There is also the compositeKeys proposal (#1).
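For illustration, the tree-of-Maps option could look something like the following userland sketch (treeGet and treeSet are hypothetical names, not proposed API; one Map level per argument, with the cached result at the leaf):

```javascript
// Sketch of a "tree of Maps" cache: each argument selects the next
// Map level, and a sentinel symbol holds the result at the leaf.
const RESULT = Symbol('result');

function treeSet(root, args, value) {
  let node = root;
  for (const arg of args) {
    if (!node.has(arg)) node.set(arg, new Map());
    node = node.get(arg);
  }
  node.set(RESULT, value);
}

function treeGet(root, args) {
  let node = root;
  for (const arg of args) {
    if (!node.has(arg)) return undefined;
    node = node.get(arg);
  }
  return node.get(RESULT);
}
```

The argument-tuple alternative would instead use the whole argument list as a single key in one flat map.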

js-choi added a commit that referenced this issue Mar 30, 2022
See #2, #3, #4, and #5.
Closes #1 in favor of #3.
@js-choi js-choi added the question Further information is requested label Mar 30, 2022
@michaelficarra
Member

Is it observable?

@js-choi
Collaborator Author

js-choi commented Mar 30, 2022

@michaelficarra: It is observable if we expose the cache to the developer somehow. For example, we may allow the developer to supply a map-like cache argument. (See also proposal-policy-map-set, which I will also present to plenary.)

For example:

const cache = new Map();
function f (x, y) { return x + y; }
const fMemo = f.memo(cache);
fMemo(0, 0); fMemo(1, 1);

If we went with a flat map-like with argument-tuple keys, then cache would be Map(2) { #[undefined, undefined, 0, 0] ⇒ 0, #[undefined, undefined, 1, 1] ⇒ 2 } (where the first two values in each tuple key are the this and new.target of the function call or whatever).

This would allow the developer to cache.clear() or to instead use const cache = new LRUMap(10) or whatever.
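For concreteness, LRUMap is not an existing built-in; a minimal userland sketch of such a bounded, least-recently-used map-like might look like this:

```javascript
// Hypothetical LRUMap: a Map with a fixed capacity that evicts the
// least recently used entry. Relies on Map preserving insertion order.
class LRUMap extends Map {
  constructor(capacity) {
    super();
    this.capacity = capacity;
  }
  get(key) {
    if (!super.has(key)) return undefined;
    // Re-insert the entry to mark it as most recently used.
    const value = super.get(key);
    super.delete(key);
    super.set(key, value);
    return value;
  }
  set(key, value) {
    if (super.has(key)) super.delete(key);
    super.set(key, value);
    if (this.size > this.capacity) {
      // Evict the least recently used entry (the oldest in iteration order).
      const oldestKey = super.keys().next().value;
      super.delete(oldestKey);
    }
    return this;
  }
}
```

Because it has the same has/get/set/delete/clear surface as Map, a memoizing function could use it without knowing it is bounded.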

@michaelficarra
Member

I don't like using a user-provided map. If we want memoised functions to have methods like clear, we can give them a prototype that extends Function.prototype.

@js-choi
Collaborator Author

js-choi commented Mar 30, 2022

@michaelficarra: The more personally important use case for allowing custom caches is to allow the user to specify the memoization’s cache replacement policy. For example, I may want to specify that the cache store up to 100 entries with a least recently used (LRU) policy, like how Python does, so I may wish to use new LRUMap(100).

But, yes, hiding the cache’s structure completely is also an option for this issue.

@js-choi
Collaborator Author

js-choi commented Mar 30, 2022

#5 (comment) reminded me that records/tuples cannot directly contain objects, so that goes out the window.

Looks like we would have to use trees of Map-likes—or compositeKeys—after all. Or maybe there’s another way.

@js-choi js-choi closed this as completed Mar 30, 2022
@js-choi js-choi reopened this Mar 30, 2022
@michaelficarra
Member

Once you choose a tree of map-likes, now you are giving the consumer per-argument-position control over the caching strategy, which I don't think anyone wants.

@zloirock

zloirock commented Mar 30, 2022

Here you are trying to solve a problem that compositeKey was intended to solve directly. My polyfill of compositeKey / compositeSymbol represents such a tree of WeakMaps and Maps.
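A minimal sketch of that tree-of-WeakMaps-and-Maps approach (greatly simplified from a real compositeKey polyfill, and it does not enforce the proposal's at-least-one-object rule):

```javascript
// Each tree node branches through a Map for primitive components and a
// WeakMap for object/function components; the leaf caches a stable key.
const root = { primitives: new Map(), objects: new WeakMap(), key: null };

function nextNode(node, component) {
  const isObjectLike =
    (typeof component === 'object' && component !== null) ||
    typeof component === 'function';
  const branch = isObjectLike ? node.objects : node.primitives;
  if (!branch.has(component)) {
    branch.set(component, { primitives: new Map(), objects: new WeakMap(), key: null });
  }
  return branch.get(component);
}

function compositeKey(...components) {
  const leaf = components.reduce(nextNode, root);
  // The same component sequence always reaches the same leaf,
  // so the same frozen key object is returned every time.
  if (leaf.key === null) leaf.key = Object.freeze({ __proto__: null });
  return leaf.key;
}
```

The returned key objects can then be used as ordinary keys in one flat Map.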

@js-choi
Collaborator Author

js-choi commented Mar 30, 2022

@michaelficarra: I agree. I would much prefer a flat one-level cache data structure with complex keys.

I suppose we could still:

  1. Use argument tuples as keys in the developer-supplied map-like cache. (The tuples would be of the form #[thisVal, newTargetVal, ...args].)
  2. Replace objects in the tuples with symbol placeholders, before calling cache.has, cache.get, or cache.set.
  3. Store those symbols as values in a separate object→symbol WeakMap, which might or might not be exposed to the developer.

For example:

const cache = new LRUMap(256);
const f = (function f (arg0) { return this.x + arg0; }).memo(cache);
const o0 = { x: 'a' }, o1 = { x: 'b' };
f.call(o0, 0); // Returns 'a0'.
f.call(o1, 1); // Returns 'b1'.

Now cache would be LRUMap(2) { #[s0, undefined, 0] ⇒ 'a0', #[s1, undefined, 1] ⇒ 'b1' }, where s0 and s1 are unique symbols, and f’s closure internally closes over a WeakMap { o0 ⇒ s0, o1 ⇒ s1 }. (Maybe the WeakMap would also be accessible from an f.symbolMap property.)
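Since Record & Tuple syntax is not available today, here is a rough userland sketch of the placeholder scheme, with a JSON string standing in for the #[thisVal, newTargetVal, ...args] tuple key; the memo helper and placeholder format are hypothetical:

```javascript
// Objects and functions are swapped for stable placeholder strings held
// in a WeakMap (standing in for the unique symbols described above), so
// the composite key can live in one flat, map-like cache.
const objectIds = new WeakMap();
let nextId = 0;

function placeholderFor(value) {
  const isObjectLike =
    (typeof value === 'object' && value !== null) || typeof value === 'function';
  if (!isObjectLike) return value;
  if (!objectIds.has(value)) objectIds.set(value, `@@object-${nextId++}`);
  return objectIds.get(value);
}

function memo(fn, cache = new Map()) {
  return function (...args) {
    // Stand-in for the #[thisVal, newTargetVal, ...args] tuple key.
    const key = JSON.stringify([this, new.target, ...args].map(placeholderFor));
    if (!cache.has(key)) cache.set(key, fn.apply(this, args));
    return cache.get(key);
  };
}
```

Because the placeholders are held weakly, entries keyed by garbage-collected objects would still linger in the cache; a real design would need to address that.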


@zloirock: Although composite keys would probably also be good for this use case, composite keys have been stuck at Stage 1 with no progress for two years, so I am concerned that it will be a long time before they get added to the language (if ever).

At plenary, I will present compositeKey as an alternative possibility to using tuple keys containing symbols in a WeakMap. We will discuss both approaches’ trade-offs.

@zloirock

zloirock commented Mar 30, 2022

> Although compositeKey would be great for this, it has been stuck at Stage 1 with no progress for two years, so I am concerned that it will be a long time before it gets added to the language (if ever).

This proposal is a good chance to advance compositeKey. Maybe someone will start work on it.

js-choi added a commit that referenced this issue Mar 31, 2022
@js-choi
Collaborator Author

js-choi commented Jul 9, 2022

According to composite keys’ explainer, “at least one component must be a valid key that can be placed in a WeakMap”. Unless I am mistaken, this restriction would exclude composite keys from being usable for argument-list keys, since both compositeKey() (for f()) and compositeKey(1) (for f(1)) would be invalid. (In contrast, something like compositeKey([]) – for f([]) – would be valid.)

We still have the “replace objects with symbols from a WeakMap” solution, though.

@zloirock

zloirock commented Jul 9, 2022

@js-choi the first component can be, for example, a function: compositeKey(baseFunction, this, ...arguments). This is the approach I originally had in mind.

@theScottyJam

theScottyJam commented Aug 21, 2022

A Map tree would be difficult to manage if you ever want to touch it by hand in userland, and it may prevent a built-in LRU map from working, unless that map is also built to support this tree shape.

I think an ideal default option would be to use a tuple for the cache key (update: near the bottom I discuss how I think this is still possible despite the earlier concerns). I also feel that this should be completely customizable, as I don't think it's possible to provide an algorithm that would satisfy everyone, and requiring people to customize the algorithm on a custom Map instance itself would be tedious and would make many use cases of function memoization overly difficult.

An option I've used previously is to require the user to generate a "cache key" from the provided parameters. So, something like this:

const add = (x, y) => x + y
const memoizedAdd = memoize(add, {
  argsToKey: (...args) => JSON.stringify(args)
})

In this example, if you call memoizedAdd with 2 and 3, the engine will call argsToKey with those parameters to determine what should be used as the map key. In this case, the key will be '[2,3]'.
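A userland sketch of how such a memoize helper could use argsToKey (the memoize function itself is hypothetical; only the shape of the options bag comes from the example above):

```javascript
// Hypothetical memoize: the user-supplied argsToKey turns the argument
// list into a single map key, so the cache itself stays one flat Map.
function memoize(fn, { argsToKey }) {
  const cache = new Map();
  return (...args) => {
    const key = argsToKey(...args);
    if (!cache.has(key)) cache.set(key, fn(...args));
    return cache.get(key);
  };
}
```

With argsToKey set to (...args) => JSON.stringify(args), calling the memoized function with 2 and 3 stores the result under the key '[2,3]'.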

This puts a ton of power into the hands of the user - power which is often needed. It can solve scenarios such as these:

  • What if the function expects an options bag as an argument? There are no reasonable defaults for peering into objects and comparing their values (there are too many issues related to prototypes, circular references, private fields, etc.). An argsToKey function allows the user to hand-pluck important properties from an options bag and generate a key from them.
  • What if you want to treat two types of values as equivalent? E.g. perhaps the second parameter of your function will always behave the same whether it's set to null or undefined, is missing, or is explicitly given the value it would have used as the default anyway. Or you can imagine a function that takes an angle's degrees as one of its parameters, and perhaps you want to normalize the numbers so they're in the range 0-360 before you use them as a cache key (e.g. if 361 is passed in, you don't use 361 as the cache key, you just use 1).
  • You can ignore arguments that don't need to be part of the cache key. For example, imagine memoizing a function that takes an Express request instance as a parameter. The function being memoized will pluck some query parameters and headers off of the request instance, but that's it, it doesn't care what endpoint was hit to trigger this request or anything else that could also be found on this instance. With a custom argsToKey function, you have the power to pluck what you want off of the req object and incorporate it as part of the cache key, then ignore everything else that's irrelevant to the function being memoized.
  • We can use the Tuple constructor as the default for the argsToKey function once it's available (I think that would make the most sense as a default); however, we don't have to tie this proposal to the Tuple proposal. To start with, we could make it an error to use the memoize function without providing an argsToKey function. A follow-on proposal can change the default to something else when we're ready for it. (Update: sorry, I only quickly skimmed this conversation at first, but reading through it again, I noticed there was concern about the fact that objects can't go into tuples. We could choose to auto-box objects so they can go into tuples, or we could choose to throw an error if an object is provided: if the user doing the memoization wants to support objects, they have to supply their own argsToKey function.)

@js-choi
Collaborator Author

js-choi commented Aug 21, 2022

It is a good point that key customization is important, and that the tuple-based approach in #4 (comment) would not address even common use cases like option objects.

It is also a good point that we can punt this problem to the future by making a keying function required. It’s not a big deal to write something like fn.memo(memoCache, ([ key ]) => key). This would decouple this proposal from tuples (and from compositeKeys). We could always add a default keying function in the future.
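As a sketch, a memo helper with a required, positional keying function needs neither tuples nor compositeKey (the memo name and signature are hypothetical):

```javascript
// Hypothetical memo helper: the caller must supply both a map-like
// cache and a keying function that maps the argument list to a key.
function memo(fn, cache, keyFor) {
  return (...args) => {
    const key = keyFor(args);
    if (!cache.has(key)) cache.set(key, fn(...args));
    return cache.get(key);
  };
}

// A function of one primitive argument can just key on that argument.
const square = memo((x) => x * x, new Map(), ([key]) => key);
```

If tuples (or composite keys) land later, a default keyFor could be added without changing this shape.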
