Only denormalize results that have an id #1300
We would still have to piece the query back together when reading from the store no matter how we save the data, so I’m not convinced that this would improve read performance 😣 Maybe you could put together a quick test with ESBench to show that our reading algorithm would be faster on a denormalized data structure? Remember that we can’t write aliases to the store so we would still need an algorithm that reads back data and gives it the proper alias when we return it to the user. There will always be some read performance hit given that we normalize our data. Perhaps in the future, we could explore a client that doesn’t normalize data at all… As for devtools performance, that is definitely something we should look at. |
I think the slowness is partly because the cache has so many keys, and
partly because many things need to be pieced together that should not have
been denormalized in the first place. In my case, I would have 1 cache
lookup of a single object with embedded arrays instead of about 30 lookups.
That's an order of magnitude…
BTW, if you cache queries outside of the store, you could cache each query
in its entirety, except using proxy objects that point to the full cache
object and only expose the requested fields. Then you can have your cake
and eat it too…
(not an actual Proxy object, IE doesn't support that, but an object with
getters and a link to the cached object)
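The getter-based view described above can be sketched roughly as follows (a hypothetical illustration, not Apollo's actual store code; `makeView` and its alias handling are invented names for this example):

```js
// A query "result" that exposes only the requested fields of a shared
// cache object via getters, instead of copying the data. Aliases are
// applied at read time, so the store never needs to contain them.
const cached = { foo: 1, bar: 2, baz: 3 };

function makeView(cacheObject, fields, aliases = {}) {
  const view = {};
  for (const field of fields) {
    const exposedName = aliases[field] || field;
    Object.defineProperty(view, exposedName, {
      enumerable: true,
      // Reads delegate to the live cache object, so later cache
      // updates are visible without rebuilding the result.
      get() { return cacheObject[field]; },
    });
  }
  return view;
}

const result = makeView(cached, ['foo', 'bar'], { bar: 'myBar' });
cached.foo = 42;
console.log(result.foo, result.myBar); // 42 2 — reads follow the cache
```

Because the view only defines getters for the requested fields, unrequested fields like `baz` stay hidden, which is what makes this IE-compatible without a real `Proxy`.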
|
Yep, I agree with your premises, but we’re hesitant to act on this until we see that an alternative where we don’t normalize would actually show a major perf win. Since we are currently discussing refactoring our store code in a major way, this is a change that we could fit in. If anyone in the community can provide the data and prove that this difference would actually matter for perf 👍 (because currently, it is keeping the implementation simpler) |
Well, I have a working app you can inspect (privately) where simple queries
take disproportionate amounts of time to get from cache. So for sure it's a
problem.
However, I assume you want a benchmark of sorts? Is there already something
similar in the codebase?
|
Here’s the benchmark I would like to see. We want to read the following object out of a cache:

```js
const object = {
  objects: [
    { foo: 1.01, bar: 2.01, baz: 3.01 },
    { foo: 1.02, bar: 2.02, baz: 3.02 },
    { foo: 1.03, bar: 2.03, baz: 3.03 },
    { foo: 1.04, bar: 2.04, baz: 3.04 },
    { foo: 1.05, bar: 2.05, baz: 3.05 },
    // 100+ more objects...
  ],
};
```

The keys and values are arbitrary. Then we want to test against two different cache shapes: one where the objects array is normalized, and one where it is denormalized. However, here is the catch: the objects are stored with different keys.

```js
const cache1 = {
  'root': {
    objects: {
      $type: 'reference',
      ids: [
        'root.objects.0',
        'root.objects.1',
        'root.objects.2',
        'root.objects.3',
        'root.objects.4',
        // 100+ more ids...
      ],
    },
  },
  'root.objects.0': { a: 1.00, b: 2.00, c: 3.00 },
  'root.objects.1': { a: 1.01, b: 2.01, c: 3.01 },
  'root.objects.2': { a: 1.02, b: 2.02, c: 3.02 },
  'root.objects.3': { a: 1.03, b: 2.03, c: 3.03 },
  'root.objects.4': { a: 1.04, b: 2.04, c: 3.04 },
  // 100+ more nodes...
};

const cache2 = {
  'root': {
    objects: [
      { a: 1.00, b: 2.00, c: 3.00 },
      { a: 1.01, b: 2.01, c: 3.01 },
      { a: 1.02, b: 2.02, c: 3.02 },
      { a: 1.03, b: 2.03, c: 3.03 },
      { a: 1.04, b: 2.04, c: 3.04 },
      // 100+ more objects
    ],
  },
};
```

This case isn’t super realistic, given that in a real cache there would be some mix of references and actual arrays. If the difference is minor, then I don’t think this should be our top concern, as there are much more pressing performance issues in the cache that we need to solve; but if the difference is major, say […]. For an example of a benchmark on ESBench, see my immutable data implementations benchmark: https://esbench.com/bench/588a5ee399634800a03476b0 |
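A rough sketch of that comparison (assumptions: plain Node timing rather than ESBench, and the alias mapping from `a`/`b`/`c` back to `foo`/`bar`/`baz` is skipped for brevity; `buildCaches`, `readNormalized`, and `readDenormalized` are invented names):

```js
// Build a normalized cache (flat nodes plus a reference list) and a
// denormalized cache (the array stored in place), then compare reads.
function buildCaches(n) {
  const normalized = { root: { objects: { $type: 'reference', ids: [] } } };
  const denormalized = { root: { objects: [] } };
  for (let i = 0; i < n; i++) {
    const node = { a: 1 + i / 100, b: 2 + i / 100, c: 3 + i / 100 };
    const id = `root.objects.${i}`;
    normalized[id] = node;
    normalized.root.objects.ids.push(id);
    denormalized.root.objects.push({ ...node });
  }
  return { normalized, denormalized };
}

// Normalized read: follow every reference and copy the requested fields.
function readNormalized(cache) {
  return cache.root.objects.ids.map(id => {
    const node = cache[id];
    return { a: node.a, b: node.b, c: node.c };
  });
}

// Denormalized read: the result is already in query shape.
function readDenormalized(cache) {
  return cache.root.objects;
}

const { normalized, denormalized } = buildCaches(100);
console.time('normalized');
for (let i = 0; i < 10000; i++) readNormalized(normalized);
console.timeEnd('normalized');
console.time('denormalized');
for (let i = 0; i < 10000; i++) readDenormalized(denormalized);
console.timeEnd('denormalized');
```

Both reads produce the same data; the question the thread raises is only how much the extra lookups and object construction in the normalized path cost in practice.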
Benchmarking two implementations should actually be pretty easy with the benchmarks we currently have. I think that would probably work better, since it would capture some of the unforeseen differences in perf due to how things within AC are actually implemented. For example, it may be that a de-normalized lookup is faster in raw JS but slower in reality, due to how we actually handle the data within Apollo Client. Agreed that it would probably take a bit more effort than just using a simple raw benchmark, though. |
Aside from the benchmarks, I think a nice way to denormalize the queries is as follows:
So if you get […], the net result is that you can directly read query results, while still sharing storage between queries, so the denormalization works as before. The slowness I experience upon getting queries from cache would be completely gone, since fetching results from cache would be O(1). Notes: […]
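A minimal sketch of the shared-storage idea above (hypothetical code; `baseObjects`, `intern`, and `makeQueryView` are invented names, and real query shapes would be nested):

```js
// Each query result is stored in its final, directly-readable shape,
// but objects with ids are getter views over one shared base object.
// Updating the base once makes the change visible in every query.
const baseObjects = new Map(); // id -> the one "real" cached object

function intern(obj) {
  if (!baseObjects.has(obj.id)) baseObjects.set(obj.id, {});
  const base = baseObjects.get(obj.id);
  Object.assign(base, obj); // merge the latest fields into the base
  return base;
}

function makeQueryView(obj, fields) {
  const base = intern(obj);
  const view = {};
  for (const f of fields) {
    Object.defineProperty(view, f, { enumerable: true, get: () => base[f] });
  }
  return view;
}

// Two different queries end up sharing one underlying object:
const listItem = makeQueryView(
  { id: 'todo1', text: 'buy milk', done: false }, ['id', 'text']);
const detail = makeQueryView(
  { id: 'todo1', text: 'buy milk', done: false, due: 'Fri' },
  ['id', 'text', 'done', 'due']);

// A mutation writes to the base once; both query results see it.
baseObjects.get('todo1').text = 'buy oat milk';
console.log(listItem.text, detail.text); // both read 'buy oat milk'
```

Reading a cached query is then just returning the stored view, with no piecing-together step, while writes still hit a single shared object per id.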
|
Thinking some more on what I propose above:
|
@wmertens This approach does sound interesting, especially when you are dealing with lists of objects returned. Our current approach there is O(n) but this should give us an O(1) read for the list. To clarify, if you have one query that returns a list of, say, to-do items and another query returns an update to a specific to-do item (with a particular id), then we'd only be storing one "real" object between the two but we'd store two different proxy objects which then contain references to the "real" object? |
Yes, that's right
|
@wmertens could you provide an example of a single query and dataset so that we could test out different reading/writing methods? |
A quick note: storing the cache this way allows updating cached objects (CO) by mutating the single base CO, but it doesn't allow removing objects easily. To do that, you could keep a WeakSet per CO of the queries that refer to it, so those queries can be traversed and the CO removed from their lists and references. (On remove, check each known query against the WeakSet; if it's a member, traverse it.) |
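That removal scheme could look something like this (a hypothetical sketch; `referrers`, `recordReference`, and `removeObject` are invented names). Note that a `WeakSet` can't be iterated, so removal walks the set of known queries and checks membership, as the note suggests:

```js
const referrers = new Map();  // id -> WeakSet of query results using it
const allQueries = new Set(); // every live query result

function recordReference(queryResult, id) {
  if (!referrers.has(id)) referrers.set(id, new WeakSet());
  referrers.get(id).add(queryResult);
  allQueries.add(queryResult);
}

function removeObject(id) {
  const refs = referrers.get(id);
  if (!refs) return;
  for (const query of allQueries) {
    if (!refs.has(query)) continue;
    // Traverse this query result and drop the object from its lists.
    // (A real traversal would handle nested fields too.)
    query.items = query.items.filter(item => item.id !== id);
  }
  referrers.delete(id);
}

// Two queries; only q1 references todo1.
const q1 = { items: [{ id: 'todo1' }, { id: 'todo2' }] };
const q2 = { items: [{ id: 'todo2' }] };
recordReference(q1, 'todo1');
recordReference(q1, 'todo2');
recordReference(q2, 'todo2');
removeObject('todo1'); // todo1 vanishes from q1; q2 is untouched
```

The WeakSet keeps the bookkeeping from pinning query results in memory, at the cost of an O(queries) scan per removal.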
Let's track this under #1409 |
In my app, I have a "medium" amount of data (several hundred small-ish objects) coming in from the server, and getting it and handling it with REST+Redux would not be an issue at all.
With apollo though, the results are denormalized and all lists are split out. The result is that I have thousands upon thousands of entries in the apollo cache, so much that it crashes the apollo dev tools when I inspect it.
Furthermore, satisfying a single object query from cache takes about 30 ms, presumably because it needs to be pieced back together.
Would it be possible to change the cache so that only objects that have a non-null `idFromObject` are denormalized? Maybe even via a function that looks at the schema instead?