Normalising state and garbage collection #1824
Comments
Is this a real use case in your app? |
@gaearon the profiles/files thing was a made up example. But let's take something like Twitter or Facebook for example. It's not unreasonable to think a user might browse through tons of stuff before closing the browser window, right? Maybe it's not something I should worry about but it feels to me like it could eventually become an issue. |
Do you suggest I should go with the normalised approach and worry about "garbage collection" if/when it becomes an issue? |
I think avoiding premature optimization is generally good advice, do some stress tests on your app first and see if performance suffers, and if so you can consider optimizations then. |
I think it's worth considering the perspective that denormalization itself can be a premature optimization. In the examples above, @olalonde was potentially concerned about storing files and events in more than one place in the state tree. Storing the same kind of thing in more than one place is not intrinsically a poor choice. If you're just displaying the data (but not performing any operations on it), it's likely fine to store it without denormalizing it. When you add operations against the data (say renaming a file, for example), then you begin to cross the line into duplicating logic in more than one place, at which point you could consider denormalization as a strategy to remove the duplication of logic. If I had a phone number object that was stored inside two different kinds of things (say, a phone number for a main business profile, as well as phone numbers for each employee within an array of employees), then there wouldn't necessarily be any immediate value to denormalizing business phone numbers and employee phone numbers into a combined On the other hand, going back to the file example, if you wanted to rename a file, and you wanted such a feature to operate identically for that file regardless of whether it happened to be residing in the I think you have to look at the big picture of how you intend to use the data on the page before you decide when and where to denormalize. I wouldn't necessarily see your 2nd refactoring as superior to your 1st refactoring. |
@naw solid advice |
Normally, yes. |
(If you’re certain it’ll become an issue you might want to use something like Relay instead which I think handles this for you. Or you can roll your own normalized reducer factory based on schema that would keep track of what’s referenced where—but it sounds complicated.) |
@naw thanks, good advice. @gaearon thanks, I'll have a look at Relay as well, though it looks like it would require a bigger refactor. My app state is mostly "read only" so I think I'll keep going with the denormalised approach and maybe name my keys more explicitly to make it clear which view owns which piece of state (e.g. |
Would be interesting to see different redux state shapes examples... I went through the |
Yeah, that's right along with the kind of stuff I'm hoping to cover in my "Structuring Reducers" recipe ( #1784 ). Which is actually kind of an issue - there are three or four overlapping topics, and I'm not sure how to approach them yet. State shape, normalizing data, various ways to define reducers, ... |
That said, I do have links to a number of real and example apps over at https://github.com/markerikson/redux-ecosystem-links/blob/master/apps-and-examples.md. Looking through those apps might be instructive. |
I have a question that might be related to "normalized state." Let's assume that we're trying to build Reddit (just like in your tutorial example) and the following comes from the API server.
subreddits: [
  {
    title: "food",
    posts: [
      {
        id: "",
        body: "..",
        comments: [{ id: "..", body: ".." }]
      }
      // ..morePosts
    ]
  },
  {
    title: "culture",
    posts: [
      {
        id: "",
        body: "..",
        comments: [{ id: "..", body: ".." }]
      }
      // ..morePosts
    ]
  }
]
What's the conventional way of using reducers for the above response? I am generally following the idea of having a look-up table (e.g. Should it look like this?
subredditByTitle: {
food: {
id: subreddit_1,
title: "food"
posts: [post_1, post_2]
}
culture: {
id: subreddit_2,
title: "culture"
posts: [post_3, post_4]
}
}
postsById: {
post_1: {
body: ".."
comments: [comment_1, comment_2]
},
post_2: {
body: "..",
comments: [comment_3, comment_4]
}
}
commentsById: {
comment_1: {
body: ".."
},
comment_2: {
body: ".."
}
} |
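As a rough illustration of the question above, here is a minimal sketch of flattening the nested response into the three lookup tables. The helper name `normalizeSubreddits` is hypothetical, and the sketch assumes every post and comment arrives with a unique `id`; it is not taken from the thread or from any library.

```javascript
// Hypothetical helper: flattens the nested `subreddits` response into
// lookup tables. Assumes every post and comment carries a unique `id`.
function normalizeSubreddits(subreddits) {
  const state = { subredditByTitle: {}, postsById: {}, commentsById: {} };
  for (const subreddit of subreddits) {
    const posts = subreddit.posts || [];
    state.subredditByTitle[subreddit.title] = {
      title: subreddit.title,
      posts: posts.map((post) => post.id), // keep IDs only, not nested objects
    };
    for (const post of posts) {
      const comments = post.comments || [];
      state.postsById[post.id] = {
        id: post.id,
        body: post.body,
        comments: comments.map((comment) => comment.id),
      };
      for (const comment of comments) {
        state.commentsById[comment.id] = { id: comment.id, body: comment.body };
      }
    }
  }
  return state;
}

// Usage with a made-up response in the shape shown in the question.
const normalized = normalizeSubreddits([
  {
    title: "food",
    posts: [
      { id: "post_1", body: "..", comments: [{ id: "comment_1", body: ".." }] },
    ],
  },
]);
console.log(normalized.subredditByTitle.food.posts); // an array holding only "post_1"
```

Each entity ends up stored once, and parents refer to children by ID, so an update to a comment touches a single entry in `commentsById`.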
First, the typical layout for an app with normalized data would be something like:
{
someAppData1 : { ..... },
someAppData2 : { ..... },
entities : {
EntityType1 : {
byId : {
et1id1 : {},
et1id2 : {},
// etc
},
items : ["et1id1", "et1id2"]
},
EntityType2 : {
byId : {
et2id1 : {},
et2id2 : {},
// etc
},
items : ["et2id1", "et2id2"]
},
}
}
So, put all the normalized data under a parent key, have a key for each type of item, and then those keys contain the lookup tables and ID arrays. The next issue is how to structure the logic for managing those entities. Some of your reducer logic could be fairly generic, like "look up an item based on its type name and its ID, and update its attributes". Some may be specific to a certain entity type. Some may require access to multiple entity keys at once. This is where you need to start thinking outside the box of |
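The "fairly generic" case described above might look something like the following sketch. The action type `ENTITY_UPDATE` and its payload fields (`entityType`, `id`, `attributes`) are assumptions made for illustration; only the `byId` / `items` layout comes from the comment.

```javascript
// Hypothetical action type; the name is an assumption for this sketch.
const ENTITY_UPDATE = "ENTITY_UPDATE";

// Generic reducer for the `entities` slice: looks up an item by its type
// name and ID, then merges new attributes into it without mutating state.
function entitiesReducer(state = {}, action) {
  if (action.type !== ENTITY_UPDATE) return state;
  const { entityType, id, attributes } = action.payload;
  const table = state[entityType];
  if (!table || !table.byId[id]) return state; // unknown item: leave state untouched
  return {
    ...state,
    [entityType]: {
      ...table,
      byId: { ...table.byId, [id]: { ...table.byId[id], ...attributes } },
    },
  };
}

// Usage: rename one item; every other entry keeps its object identity.
const prevState = {
  EntityType1: {
    byId: { et1id1: { name: "a" }, et1id2: { name: "b" } },
    items: ["et1id1", "et1id2"],
  },
};
const nextState = entitiesReducer(prevState, {
  type: ENTITY_UPDATE,
  payload: { entityType: "EntityType1", id: "et1id1", attributes: { name: "renamed" } },
});
```

Logic that is specific to one entity type, or that spans several types, would still need its own handling alongside a generic reducer like this.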
@markerikson Thank you, that's helpful. One more question: where should
But what if I am updating/creating a single entity? For example, when I am fetching |
Haven't actually dealt with that concern myself. I suppose it might depend on whether you're expecting to be fetching multiple entities with multiple requests at once, or only one request out at a time. Could either put it on each entity value if you're expecting to do multiple requests, or having a single You might also want to look at some of the utilities listed in my Redux addons catalog, particularly the libs that try to handle collection CRUD, network requests, and more network requests, and see how they handle things. |
For future reference, I found the new Wordpress to be a good example of a relatively large React/Redux app: https://github.com/Automattic/wp-calypso/tree/master/client |
@markerikson Thanks. I have one last question. Let's say I have the following DB structure where Page
Post
Comment Then the normalized state looks like this (using subdocuments): pagesById: {
page_1: {
pageTitle: "Elmo",
posts: [post_1, post_2]
}
},
postsById: {
post_1: {
body: "First post"
comments: [comment_1, comment_2]
},
post_2: {
body: "Second post"
comments: [comment_3, comment_4]
},
},
commentsById: {
comment_1: {
body: "First comment"
}..
} When I add a post, I have to do two things:
Doing these two actions when I add a new post feels a little awkward because of how On a similar note, I am curious whether the use of pages: {
allPageIds: [page_1],
pagesById: {
page_1: {
pageTitle: "Elmo",
}
},
},
posts: {
allPostIds: [post_1, post_2]
postsById: {
post_1: {
page_id: page_1,
body: "First post"
},
post_2: {
page_id: page_1,
body: "Second post"
},
},
},
comments: {
allCommentIds: [comment_1, comment_2, comment_3, comment_4]
commentsById: {
comment_1: {
post_id: post_1
body: "First comment"
}..
}
} With this new state, only the I'd appreciate any insight 😄 |
Yeah, updating the relational info for a parent item when creating a new child (as one example) is absolutely valid. It's not that, say, your It would also be entirely valid to take a similar-ish but different approach, where the I've never touched Mongoose myself, and have no idea what a "subdocument" is, so can't help you there. |
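One way to picture "updating the relational info for a parent item when creating a new child" is to let two slice reducers respond to the same action. This is only a sketch: the `ADD_POST` action type, its payload fields, and the reducer names are hypothetical, and `rootReducer` is a hand-written stand-in for what Redux's `combineReducers` would normally do.

```javascript
const ADD_POST = "ADD_POST"; // hypothetical action type

// Stores the new post itself.
function postsReducer(state = {}, action) {
  if (action.type !== ADD_POST) return state;
  const { postId, body } = action.payload;
  return { ...state, [postId]: { body, comments: [] } };
}

// Appends the new post's ID to its parent page.
function pagesReducer(state = {}, action) {
  if (action.type !== ADD_POST) return state;
  const { pageId, postId } = action.payload;
  const page = state[pageId];
  if (!page) return state; // unknown page: nothing to link
  return { ...state, [pageId]: { ...page, posts: [...page.posts, postId] } };
}

// Hand-written stand-in for combineReducers, to keep the sketch dependency-free.
function rootReducer(state = {}, action) {
  return {
    pagesById: pagesReducer(state.pagesById, action),
    postsById: postsReducer(state.postsById, action),
  };
}

// Usage: a single dispatched action updates both slices.
const initialState = {
  pagesById: { page_1: { pageTitle: "Elmo", posts: ["post_1"] } },
  postsById: { post_1: { body: "First post", comments: [] } },
};
const withNewPost = rootReducer(initialState, {
  type: ADD_POST,
  payload: { pageId: "page_1", postId: "post_2", body: "Second post" },
});
```

With the alternative shape from the question, where each post stores a `page_id`, only the posts slice would need to handle the action.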
@markerikson awesome. Thank you. @gaearon do you have an opinion on using |
I'm going to close this out in favor of Mark's documentation efforts. I think there's enough here to go on for your original question and it's now down to things more specific to your project's wants and needs. |
I'm interested in ways to garbage collect normalized data in Redux. This has become a problem in the Mastodon web UI, as the users have a habit of leaving it open with the firehose of content on and browsing a lot of stuff (mastodon/mastodon#787). It's unfortunate that OP's concerns were dismissed, as it would have been nice to find a ready solution here. |
We are creating an email app where we persist the state to cache so on reboot it gets hydrated and looks like a normal desktop app. Since we are dealing with an email app, we have to be very careful about what we store in our redux state. Since we persist the state on reboot it is imperative that we properly manage it as it is extremely easy to see our thread and message slices reach 10k+ records. As it stands today we are very aggressive with pruning, but we see a not-so-distant future where this luxury is no longer viable. It's really not an ideal solution but we essentially listen for when a thread gets added to a folder and then kick-off a saga that will scan for any de-referenced threads and then cascade remove any corresponding messages to that thread. It does feel like we are reinventing the garbage collection "wheel" here but I'm not exactly sure what other solutions we have. |
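The "scan for de-referenced threads, then cascade remove their messages" idea can be sketched as a small mark-and-sweep pass. The state shape here (`foldersById`, `threadsById`, `messagesById`, with `threads` and `messages` ID arrays) is an assumption for illustration, not the commenter's actual schema, and the saga wiring is left out.

```javascript
// Keeps only threads referenced by some folder, and only messages
// referenced by a surviving thread. The state shape is assumed.
function pruneUnreferenced(state) {
  // Mark: collect every thread ID that a folder still points at.
  const liveThreadIds = new Set(
    Object.values(state.foldersById).flatMap((folder) => folder.threads)
  );

  // Sweep threads, marking their messages as live along the way.
  const threadsById = {};
  const liveMessageIds = new Set();
  for (const id of liveThreadIds) {
    const thread = state.threadsById[id];
    if (!thread) continue; // folder points at a thread that was never loaded
    threadsById[id] = thread;
    thread.messages.forEach((messageId) => liveMessageIds.add(messageId));
  }

  // Sweep messages.
  const messagesById = {};
  for (const id of liveMessageIds) {
    if (state.messagesById[id]) messagesById[id] = state.messagesById[id];
  }
  return { ...state, threadsById, messagesById };
}

// Usage: thread t2 is in no folder, so it and its messages are dropped.
const mailState = {
  foldersById: { inbox: { threads: ["t1"] } },
  threadsById: { t1: { messages: ["m1"] }, t2: { messages: ["m2", "m3"] } },
  messagesById: { m1: { subject: "kept" }, m2: {}, m3: {} },
};
const pruned = pruneUnreferenced(mailState);
```

Because the function is pure, it could run inside a reducer in response to a dedicated cleanup action rather than being tied to one particular side-effect library.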
Out of curiosity, how would this get handled in any other client-side application/framework? I've noted several times that Redux is (as far as I can see) really no different than any other client-side technology in terms of caching data and memory usage, it's just that all the data is attached to one tree rather than split up into separate "model" instances or something. |
I, too, am curious about how others are doing this sort of thing. I keep thinking that implementing a state path subscription model would work best, but I haven't thought it all the way through... Where to "subscribe" or otherwise signal intent to keep a given state path (selectors)? When to perform clean up (after several "ticks" with no further subscriptions to that path)? How to uniformly perform a cleanup (dispatch some standard action with enough context, and allow reducers to handle it)? Then again.. this could be a terrible idea... |
My approach for garbage collection is reference counting items that are in use by any component. I have a number of HOCs that dispatch actions on mount/unmount which result in inc/dec of counters. Then there's a task that runs in an interval that prunes unused items. |
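A minimal sketch of the reference-counting approach described above, under some assumptions: the action types `RETAIN`, `RELEASE`, and `PRUNE` and the `{ itemsById, refCounts }` state shape are hypothetical names chosen here. In an app, the HOCs would dispatch `RETAIN` on mount and `RELEASE` on unmount, and the interval task would dispatch `PRUNE`.

```javascript
// Hypothetical action types for this sketch.
const RETAIN = "RETAIN";
const RELEASE = "RELEASE";
const PRUNE = "PRUNE";

function gcReducer(state = { itemsById: {}, refCounts: {} }, action) {
  switch (action.type) {
    case RETAIN: {
      const count = (state.refCounts[action.id] || 0) + 1;
      return { ...state, refCounts: { ...state.refCounts, [action.id]: count } };
    }
    case RELEASE: {
      // Never drop below zero, even if RELEASE is dispatched twice.
      const count = Math.max((state.refCounts[action.id] || 0) - 1, 0);
      return { ...state, refCounts: { ...state.refCounts, [action.id]: count } };
    }
    case PRUNE: {
      // Keep only items with a positive count. Note that this also drops
      // items that were never retained at all.
      const itemsById = {};
      const refCounts = {};
      for (const id of Object.keys(state.itemsById)) {
        if ((state.refCounts[id] || 0) > 0) {
          itemsById[id] = state.itemsById[id];
          refCounts[id] = state.refCounts[id];
        }
      }
      return { itemsById, refCounts };
    }
    default:
      return state;
  }
}

// Usage: "a" is still in use; "b" was mounted and then unmounted.
let gcState = { itemsById: { a: { title: "A" }, b: { title: "B" } }, refCounts: {} };
gcState = gcReducer(gcState, { type: RETAIN, id: "a" });
gcState = gcReducer(gcState, { type: RETAIN, id: "b" });
gcState = gcReducer(gcState, { type: RELEASE, id: "b" });
gcState = gcReducer(gcState, { type: PRUNE });
```

Pruning on an interval, rather than on every release, leaves a grace period in which a quickly remounted component can reuse the cached item.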
I dislike the way state is normalized. Nested reducers with nested state makes more sense to me, and it's so easy to garbage collect when you aren't using it any more. |
@mrpmorris : It's entirely up to you how you structure your own app's state. Redux doesn't enforce any particular approach. However, there are very good reasons for normalization. Quoting the Structuring Reducers - Normalizing State Shape docs page:
|
@aikoven That seems similar to what I'm thinking of. I'm curious, though... How are you describing the state paths which should be cleaned up? Some kind of DSL? |
@jpdesigndev The data is organized as follows: {
collection1: {
pk1: item1,
pk2: item2,
...
},
collection2: {
pk1: item1,
pk2: item2,
...
},
} Garbage collector dispatches action with payload {
collectionN: <array of pks to remove>,
collectionM: ...,
} |
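A reducer that consumes a payload of that shape could look roughly like this. The action type `GARBAGE_COLLECT` and the reducer name are hypothetical; only the collection-to-primary-keys payload layout comes from the comment.

```javascript
const GARBAGE_COLLECT = "GARBAGE_COLLECT"; // hypothetical action type

// Removes the listed primary keys from each named collection.
// Collections not mentioned in the payload keep their object identity.
function collectionsReducer(state = {}, action) {
  if (action.type !== GARBAGE_COLLECT) return state;
  const next = { ...state };
  for (const [name, pks] of Object.entries(action.payload)) {
    if (!state[name]) continue; // payload names a collection we don't hold
    const doomed = new Set(pks.map(String)); // object keys are always strings
    next[name] = Object.fromEntries(
      Object.entries(state[name]).filter(([pk]) => !doomed.has(pk))
    );
  }
  return next;
}

// Usage: drop pk2 from collection1 and leave collection2 alone.
const collections = {
  collection1: { pk1: "item1", pk2: "item2" },
  collection2: { pk1: "item1" },
};
const collected = collectionsReducer(collections, {
  type: GARBAGE_COLLECT,
  payload: { collection1: ["pk2"] },
});
```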
The way my state is structured right now is a bit messy.
I have multiple top level keys which are related to the currently logged in user and some keys
which aren't related to the logged in user, e.g.:
It's a bit messy and I was looking for suggestions to refactor.
When logging out I'd have to check for a LOGOUT action in the theme, myFiles and myEvents reducers, which is not really maintainable if I keep adding stuff. Another option would be to put everything related to the logged in user under a me key/reducer and clear me when LOGOUT is dispatched, e.g.:
A bit cleaner... But there will be some duplication in my files and user reducers. Another option would be to normalise the whole thing:
Now this kind of normalisation leads to new problems. How do I garbage collect
the state? e.g. If a user browses thousands of profiles, I don't want to end up
with a state that has thousands of entities. I could issue an { type: 'RESET_PROFILE_DATA', user_id: 'billgates' } action when a user's profile component is unmounted. But how does my reducer know which piece of data can be safely removed (e.g. the logged in user state shouldn't be cleared if the user navigates outside its own profile page). We
could add special cases everywhere but it doesn't feel maintainable. Another problem is that there
could be two separate components which are using the same piece of state and it'd be hard to tell.
I guess we could keep a garbage collection counter for every entity which gets incremented when the data is needed and decremented when the data can be garbage collected, etc. But that can get quite complex.
Also, what if I need to display a user's file list simultaneously in two different components and both have their own pagination. I could have filesForComponentA and filesForComponentB instead of files?
Anyways, I've got more ideas but would be curious how people typically handle this situation.
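For the "put everything under a me key and clear it on LOGOUT" option, one common pattern is a wrapper that resets a slice when a given action arrives, so the individual reducers never have to check for LOGOUT themselves. The sketch below is illustrative only: `resetOnLogout`, `FILES_LOADED`, `THEME_CHANGED`, and the slice shapes are assumptions, not code from this issue.

```javascript
const LOGOUT = "LOGOUT";
const FILES_LOADED = "FILES_LOADED"; // hypothetical action type

// User-specific slice; it knows nothing about LOGOUT.
function me(state = { files: [], events: [] }, action) {
  switch (action.type) {
    case FILES_LOADED:
      return { ...state, files: action.files };
    default:
      return state;
  }
}

// Slice that is not tied to the logged in user.
function theme(state = "light", action) {
  return action.type === "THEME_CHANGED" ? action.theme : state;
}

// Wraps a reducer so LOGOUT makes it start over from its initial state:
// passing `undefined` triggers the reducer's default state argument.
function resetOnLogout(reducer) {
  return (state, action) =>
    reducer(action.type === LOGOUT ? undefined : state, action);
}

const meWithReset = resetOnLogout(me);

function appReducer(state = {}, action) {
  return {
    theme: theme(state.theme, action),
    me: meWithReset(state.me, action),
  };
}

// Usage: logging out clears `me` but keeps the theme.
let appState = appReducer(undefined, { type: "@@sketch/INIT" });
appState = appReducer(appState, { type: "THEME_CHANGED", theme: "dark" });
appState = appReducer(appState, { type: FILES_LOADED, files: ["a.txt"] });
const loggedOut = appReducer(appState, { type: LOGOUT });
```

This handles the logout case only; deciding which normalized entities are still in use is the separate garbage collection question raised above.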