Present log book pages for annotation in date order #95
Comments
This will depend entirely on the subject upload order being correct and not concurrent. I'm not a fan of this. We've got selection strategies in the API to deal with this, and we should support this strategy there. We have plans to provide the ability to select the first page randomly and then provide a prev / next subject link to allow page turning. Also, this will suffer under a user influx / media event, where we serve the start images to a large set of users and oversample before retirement (SGL, etc.). |
Completely understand. Sounds like we're better off waiting for the backend support. In the meantime, you and @eatyourgreens have mentioned priorities on subject sets (or subject set links?); @saschaishikawa suggested manually adding priorities to the existing OW data based on the page number in the metadata - does that sound viable? |
To be honest, the actual page linking feature is really just metadata on each subject, i.e. add prev: subject_id_X, next: subject_id_y links to each one. I can help do this if needed, but we really need that curated linked list to do it. Priority selection strategies work but suffer from the oversampling issue under media events; from memory, we used this for Annotate with small subject sets and it didn't work so well. |
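A minimal sketch of the prev / next linking described above, assuming we already have the subject ids in curated page order. The function name and the use of None at the ends of the list are illustrative choices, not anything the API prescribes.

```python
def link_subjects(ordered_ids):
    """Map each subject id to its prev/next neighbours in page order.

    `ordered_ids` is assumed to be the curated list, already sorted by page.
    The first subject gets prev=None and the last gets next=None.
    """
    links = {}
    last = len(ordered_ids) - 1
    for i, subject_id in enumerate(ordered_ids):
        links[subject_id] = {
            "prev": ordered_ids[i - 1] if i > 0 else None,
            "next": ordered_ids[i + 1] if i < last else None,
        }
    return links
```

Each resulting dict could then be merged into the matching subject's metadata.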
Oki doki! I'll take a look at the subjects and see what metadata we have on each set to infer order |
Sorry, forgot to add that we need to add API support to allow a param to set the selection context (seen_before / retired) to the normal |
If we implement a linked list by adding metadata, does that mean we can only grab one subject at a time (because we won't know where the "next" link will point to)? |
Yes... but you could still get it to construct a queue in the background while you're looking at the current subject |
If we know the subject ids, we can request them in bulk via a URL like /resource_name?ids=1,2,3 (see https://github.com/RestPack/restpack_serializer/blob/master/README.md#by-primary-key). So we can get a random offset using the subject selection service and then allow page turning / URL linking via subject ids. If this doesn't work let me know; it should be supported. |
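For illustration, here is one way to build that kind of bulk request path from a list of ids. The helper name and the default resource name "subjects" are assumptions for this sketch; only the ?ids=1,2,3 shape comes from the comment above.

```python
from urllib.parse import urlencode


def bulk_subjects_path(subject_ids, resource="subjects"):
    """Build a path like /subjects?ids=1,2,3 for a bulk fetch by primary key.

    `resource` is a hypothetical default; pass whatever endpoint applies.
    """
    ids = ",".join(str(subject_id) for subject_id in subject_ids)
    # safe="," keeps the commas literal instead of encoding them as %2C.
    return f"/{resource}?{urlencode({'ids': ids}, safe=',')}"
```

The returned path would still need to be joined to the API host and sent with the usual auth headers.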
Grabbing a list of subjects by ids works fine. @camallen so you're suggesting we fetch the first unseen subject randomly, using cellect, something like this:
And then use the
Is there a way to dump a list of subject ids for a given subject_set? Is it safe to assume sequentially increasing subject ids are in order of increasing page number? I'm trying to figure out the best way to create those linked lists. |
Almost. They will be increasing, but it's a shared tablespace so the PK may not increment sequentially. You'll have to traverse the subjects depending on the metadata (linked list vs array of subject ids). I'm not sure how you can create a page of data in one go (array of subject ids?), but you can get next / prev in one go. As Rog said, you can construct a queue in the background, and the normal mode of load / transcribe would help with this. Page turning may just want to get next on each turn-page event...? |
Cool. This is helpful. Here's a proposed fix that I wrote up, mostly to wrap my head around the problem and bounce ideas to make sure I'm not overcomplicating anything.
Proposed Fix: To curate a linked list of subjects in the proper page order. This requires a traversal through all the subjects in order to determine a correspondence between subject id and page number, followed by a sorting of subject ids by page number. Note that for a given subject, the only reliable source to determine ordering is by the
Use API to get the first subject,
Store
Sort the array by page number. This'll give the proper ordering of subject ids.
For each subject id, we first get the subject hash and add
Lastly, the Old Weather codebase needs to be modified to fetch a random unseen subject and, as a secondary step (with at least one additional API call), the next/prev subjects can be cached.
One idea would be to store the next 5 (or however many) subject ids in metadata instead. Something like
We would just have to take care to handle the near-end cases where we have less array elements. |
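A rough sketch of the sort-then-window idea, to make the near-end handling concrete. The metadata key "page_number", the output key "next_ids", and the shape of the subject dicts are all assumptions for illustration; the real key names depend on what the subject metadata actually contains.

```python
def lookahead_metadata(subjects, window=5, page_key="page_number"):
    """Sort subjects by page number and compute the next `window` ids for each.

    `subjects` is assumed to be a list of dicts shaped like
    {"id": 123, "metadata": {"page_number": "4"}}; `page_key` is hypothetical.
    """
    ordered = sorted(subjects, key=lambda s: int(s["metadata"][page_key]))
    ids = [s["id"] for s in ordered]
    result = {}
    for i, subject_id in enumerate(ids):
        # Slicing past the end simply returns fewer elements, so subjects
        # near the end of the log book get shorter (or empty) lists.
        result[subject_id] = {"next_ids": ids[i + 1:i + 1 + window]}
    return result
```

Because the slice shrinks on its own, the last subject ends up with an empty list, which the frontend could treat as "end of log book".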
This looks good, a few thoughts.
Yep just use the normal subject end point selection here (most likely passing the set_id and sort param).
So this will traverse the list and create the link list and then update the API subject metadata. I think we can just use the subjects export csv file for this instead of querying the API.
Yes!
Re the first point, I don't think so, looks like it currently uses the correct url (except that damn page param)
Could be an idea to test per-one and per-batch (5) response times manually to see how this goes, as one should be able to buffer at least 1 (probably more) during render / after page load, etc. That'll give you a good idea about how you want to do it. |
At the basic level this is probably largely solved by sorting the subject fetch by created_at, eschewing cellect for a manual check on the frontend for whether the user has seen the subject. We could then add some nice UI around showing the current page in the context of the log book and allow navigation between pages. @rogerhutchings says he has some nice designs somewhere :)
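As a sketch of that simpler approach: order the fetched subjects by created_at and pick the first one the user hasn't seen. The function name and the dict shape are assumptions, and the comparison assumes created_at values are ISO 8601 strings in a single consistent format and timezone, so that string order matches time order. The real check would live in the frontend; this only shows the logic.

```python
def first_unseen(subjects, seen_ids):
    """Return the earliest-created subject whose id is not in `seen_ids`.

    Assumes each subject looks like {"id": 1, "created_at": "2016-01-01T00:00:00Z"}.
    Returns None once the user has seen every subject.
    """
    for subject in sorted(subjects, key=lambda s: s["created_at"]):
        if subject["id"] not in seen_ids:
            return subject
    return None
```

Note that this still depends on upload order matching page order, which is the concern raised at the top of the thread.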