Suggested data modelling approach #1529
(pinging @leplatrem in case he wants to also share his thoughts)

I think your best bet would be to store shelves, decks, and cards as three separate collections, and use a property on each deck and card record that points to its parent record (that is, decks would have a property pointing to their parent shelf, and cards would have a property pointing to their parent deck). This means that when a new shelf/deck/card is created, your client pulls only a single record representing the data unique to the newly created resource (technically it pulls all updated records in that collection).

The most painful part of this setup would be the initial sync. If you have dozens of shelves (~36) with hundreds of decks each (~400), each with thousands of cards (~5,000), your initial sync would be 36 shelf records, 14,400 deck records, and 72,000,000 card records. While I doubt these numbers are accurate, I'll yield to your opinion on that. With a dataset this large, you'd need to tweak a few important Kinto server settings to adjust the maximum collection size and the default page size (since …).

However, you could choose to merge decks and cards into a single resource to reduce the number of records required for storage. In the example above, you'd only have 36 + 14,400 records per user, which is a lot more reasonable than tens of millions of records! The downside is that you'd need to rely heavily on Kinto's JSON Patch operation support, which isn't used for synchronization with the …

If you have more questions, please feel free to reach out again! I'd encourage you to give it a shot using real data so you can get a feel for the performance characteristics. Kinto works fairly well for small and medium-sized datasets, but with a dataset as large as the one you're describing, it would take a bit of tinkering to get it working smoothly.
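For reference, the page-size and fetch-limit knobs live in the Kinto server's `.ini` settings. A sketch with illustrative values only; the setting names below are from memory, so double-check them against the Kinto settings documentation for your server version:

```ini
# kinto.ini — illustrative values, not recommendations.
# paginate_by controls the default page size for record listings;
# storage_max_fetch_size caps how many records a single request may return.
kinto.paginate_by = 1000
kinto.storage_max_fetch_size = 100000
```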
The Kinto server API has all the bits you'd need to get this working; it's just that our JS client is optimized for the more common use cases.
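To make the three-collection layout concrete, here is a small sketch of the record shapes (all ids, titles, and field names are illustrative, not a fixed schema). Each child record carries the id of its parent, so resolving a relationship is just a filter on that pointer:

```javascript
// Three flat collections; child records point at their parent via an id field.
const shelves = [{ id: "shelf-1", title: "Spanish" }];
const decks = [{ id: "deck-1", shelf_id: "shelf-1", title: "Verbs" }];
const cards = [
  { id: "card-1", deck_id: "deck-1", front: "hablar", back: "to speak" },
  { id: "card-2", deck_id: "deck-1", front: "comer", back: "to eat" },
];

// Resolving a deck's cards is a client-side filter on the parent pointer.
function cardsInDeck(cards, deckId) {
  return cards.filter((c) => c.deck_id === deckId);
}

console.log(cardsInDeck(cards, "deck-1").length); // 2
```

Because each entity lives in its own collection, creating one new card only touches the `cards` collection, which keeps incremental syncs small.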
I agree with Dylan's suggestions.

If I understood correctly, the data that you sync is read-only and comes from the server. So if the cost of the initial sync is your only issue, you can also ship "dumps" of the server records as JSON, along with the assets of your application, and load them into your local DB without pulling from the network.

With regard to managing links between different collections, it really depends on how often you would create/delete/update the records behind those links. From my experience, beyond a couple of thousand records that change often, synchronization can be painful. Make sure you test synchronization performance with thousands of records before going too far with coding.
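The "ship a dump" idea can be sketched as below. The record fields are illustrative; with kinto.js you would hand the parsed records to the local collection via its bulk-import API (check the kinto.js docs for the method available in your version) so they are marked as already synced:

```javascript
// A dump shipped with the app: a plain JSON array of server records.
const dumpJson =
  '[{"id":"card-1","last_modified":1000,"front":"hablar"},' +
  '{"id":"card-2","last_modified":1000,"front":"comer"}]';

// Simulated local DB: records keyed by id, populated with no network pull.
const localDb = new Map();
for (const record of JSON.parse(dumpJson)) {
  localDb.set(record.id, record);
}

console.log(localDb.size); // 2
```

Subsequent syncs then only need to fetch records modified after the dump's `last_modified` values, which is what makes this cheaper than a full initial pull.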
Thanks @dstaley, @leplatrem!
The most realistic scenario for a very active app user (like myself) who is learning multiple languages is to have from 10,000 to 50,000 card records overall. Would this number of records be OK for initial sync?
Most updates should be fairly small, except for when the user imports cards into a deck from an external source. I want to allow them to import up to 1,000 cards at once, but I could limit it to a smaller number, and sync after each import.
I'd like to avoid this if at all possible.
I will try and do that.
Would 50,000 records constitute a medium-sized dataset?
Overall I expect the data to be mostly static (most of it should rarely change), but it is not read-only. It is data the user has generated while working with the app: creating new shelves, decks, and cards, and learning those cards.
I expect the most frequent operations to be updating cards' learning statistics (each card has a stage and a due date_time, which change during a learning session) and creating new cards. I would expect a typical user to learn 10 to 100 cards and to create 0 to maybe 50 new cards on a typical day.
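Those per-card updates are tiny, which is where the JSON Patch support mentioned above helps: a learning-session update only needs to replace two fields. A sketch, with illustrative field values and a deliberately naive local applier (a real client would send the patch document to the server with the JSON Patch content type, or use full-record updates):

```javascript
// Illustrative JSON Patch body updating one card's learning state.
const patch = [
  { op: "replace", path: "/stage", value: 3 },
  { op: "replace", path: "/due_date_time", value: "2024-01-15T09:00:00Z" },
];

// Naive local application of top-level "replace" ops only.
function applyReplacePatch(record, patch) {
  const out = { ...record };
  for (const { op, path, value } of patch) {
    if (op === "replace") out[path.slice(1)] = value;
  }
  return out;
}

const card = { id: "card-1", stage: 2, due_date_time: "2024-01-01T09:00:00Z" };
console.log(applyReplacePatch(card, patch).stage); // 3
```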
What do you mean by 'painful'? Slow?
Hello,
I'm evaluating Kinto for use in a language learning web app.
Users can learn any number of languages, they can create shelves of flashcard lists (card decks), and they can learn the card lists using spaced repetition.
How would I model the data for optimal sync?
If I use one collection (languages) and nest everything under it, would sync be very slow for dozens of shelves containing hundreds of decks which contain thousands of cards?
Another approach I can think of is using one Kinto collection per language, but I'm not sure that would speed up sync enough.
What about splitting the data into several collections (one per entity): languages, shelves, decks, cards?
Then I would need to link them together in some manner. That would presumably make sync faster, but wouldn't it cause relationship-management issues?
```
languages
-- shelves
---- decks
------ cards
```