Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Default memoize hasher returns invalid key for multiple arguments #575

Closed
lxe opened this issue Jul 7, 2014 · 7 comments
Closed

Default memoize hasher returns invalid key for multiple arguments #575

lxe opened this issue Jul 7, 2014 · 7 comments
Labels

Comments

@lxe
Copy link

lxe commented Jul 7, 2014

Currently, the default memoize hasher takes only the first argument to create the memo key:

https://github.com/caolan/async/blob/master/lib/async.js#L1000-L1002

This will cause memoized functions that take multiple arguments to call back with invalid results:

var fn = async.memoize(function (a, b, c, cb) {
  cb(a, b, c);
});

fn(1, 2, 3, function (a, b, c) {
  console.log(a, b, c); 
  // 1, 2, 3 - correct
});

fn(1, 10, 20, function (a, b, c) {
  console.log(a, b, c); 
  // 1, 2, 3 - incorrect -- should be "1, 10, 20"
});
@lxe
Copy link
Author

lxe commented Jul 7, 2014

I'm testing performance of an alternative hasher that would just do

    function(x) {
        if (arguments.length <= 1) {
            return x;
        } else {
            return Array.prototype.slice.call(arguments);
        }
    };

with hope that the check for arguments.length will limit the exposure to the slow(?) Array.prototype.slice.call invocation.

@lxe
Copy link
Author

lxe commented Jul 7, 2014

http://jsperf.com/async-js-memoize-hashers

doing #575 (comment) seems to be 5% slower on chrome. I think the correctness justifies the performance hit

@aearly
Copy link
Collaborator

aearly commented Jul 7, 2014

When using things that are easily stringified, using all the args won't be that much slower. The real performance hits when you have to stringify large object arguments.

It is also extremely common in other memoize implementations (such as Underscore/Lodash) to only use the first argument by default. That's what the custom hasher is for.

@aearly
Copy link
Collaborator

aearly commented Jul 7, 2014

There is also tons of existing code that relies on async.memoize only using the first arg. This would cause a huge regression. It would have to be a 1.0.0 type of change.

@lxe
Copy link
Author

lxe commented Jul 7, 2014

@aearly I've been solving this by passing a custom hasher, but I think it's important that it's at least documented that the hash key is the first argument only. (as it is in lodash)

@owenallenaz
Copy link

@lxe We just ran into this same problem. One issue with your solution is that effectively what it's doing is a toString() on the args array, that's what occurs when you set data[key] and key = ["arg1", "arg2"]. This opens up the door for some more unintended consequences, in example func("foo", "bar", cb) and func("foo,bar", cb) will result in the same hash. You could count the args, but it could have similar drawbacks such as ["foo,bar", "baz", "qux"] would hash to the same as ["foo", "bar,baz", "qux"].

What about if we just did a JSON.stringify, falling back to Array.prototype.call(arguments).toString() when JSON isn't available (if we have to worry about IE6,7 support).

As proof, I altered @lxe JSperf to show an example of what I mean http://jsperf.com/async-js-memoize-hashers/2. I also changed some of the loops because they previously had some minor async errors. And really, is the performance of the hasher really that important? The entire intent of async.memoize is to save the cost of an expensive async operation, the cost of toString() or JSON.stringify() is moot in comparison right?

To @aearly can we really expect anyone out there to have written code such that they want func("foo", "bar", cb) to return the same value as func("foo", "baz", cb) for all calls but the first? In example if func("foo", "baz", cb) was made first it would forever return that result, but if func("foo", "bar", cb) was called first, it would forever return that result. That feels like programming on top of a bug and then not wanting that bug to get fixed, is that a realistic scenario?

@aearly aearly added the docs label May 19, 2015
@aearly
Copy link
Collaborator

aearly commented May 19, 2015

Fixed the docs through #631

@aearly aearly closed this as completed May 19, 2015
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

3 participants