IndexedDB putAll #536

Closed
nums11 opened this issue Jul 14, 2020 · 19 comments
Labels
  • Progress: propose closing (we think it should be closed but are waiting on some feedback or consensus)
  • Review type: CG early review (an early review of general direction from a Community Group)
  • Venue: Web Apps WG (W3C Web Applications Working Group)

Comments

nums11 commented Jul 14, 2020

Hello TAG!

I'm requesting a TAG review of IndexedDB putAll.

Add a new endpoint to the IndexedDB API that will provide developers with a way to bulk-insert many records into an IndexedDB object store.
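
To illustrate the intended shape, here is a minimal hypothetical sketch; the exact method name and signature are what this review is about, and the "books" store and records below are made up:

```js
// Hypothetical sketch of the proposed bulk insert; signature not final.
// Assumes `db` is an open IDBDatabase with a "books" store keyed by "isbn".
const tx = db.transaction('books', 'readwrite');
const store = tx.objectStore('books');

const request = store.putAll([
  { isbn: 1, title: 'Quarry Memories', author: 'Fred' },
  { isbn: 2, title: 'Water Buffaloes', author: 'Fred' },
  { isbn: 3, title: 'Bedrock Nights', author: 'Barney' },
]);

request.onsuccess = () => console.log('all records stored');
request.onerror = (event) => console.error('bulk put failed', event.target.error);
```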

Further details:

  • [ ✓ ] I have reviewed the TAG's API Design Principles
  • The group where the incubation/design work on this is being done (or is intended to be done in the future): Google
  • The group where standardization of this work is intended to be done ("unknown" if not known): unknown
  • Existing major pieces of multi-stakeholder review or discussion of this design: Feature Request: Allow batch adding records (from an Array) w3c/IndexedDB#69
  • Major unresolved issues with or opposition to this design: N/A
  • This work is being funded by: N/A

You should also know that...

N/A

We'd prefer the TAG provide feedback as (please delete all but the desired option):

☂️ open a single issue in our GitHub repo for the entire review

nums11 added the Progress: untriaged and Review type: CG early review labels Jul 14, 2020
kenchris self-assigned this Jul 15, 2020
torgo self-assigned this Jul 15, 2020
torgo added this to the 2020-07-20-week milestone Jul 15, 2020
@kenchris

Why is there "All" in the name? You are just doing a bulk insert, which might not be the same as "all" from a developer's point of view.

dmurph commented Jul 15, 2020

I believe it was chosen to match getAll, and its functionality seems to match other instances of putAll in other languages (Java collections' implementations of putAll, "insert all" in SQL, etc.).

What do you think developers will think putAll means, if not to put all of the arguments into the database? I'm not aware of another interpretation, but it could definitely exist :)

cynthia (Member) commented Jul 20, 2020

I don't have a strong opinion on what is better, but aside from putAll(), putMultiple() and putBatch() also seem to exist as potential options in different contexts. Explicitly indicating that the function is intended for multiple entries does make perfect sense, though.

Question 1) The explainer suggests that a later key-value pair clobbering an earlier key-value pair is fine, but this feels like it might obstruct debugging in cases where the clobbering is due to human/code error. (Imagine debugging something where the 3rd entry is clobbered by the 499,844th entry out of 1M inserts, and you are trying to figure out why.) Would it make sense for this to throw? (Or optionally throw?)

Question 2) The examples seem to suggest this is a synchronous API; given that it may be used to batch-insert lots of data that could potentially block, would there be an async path for this feature?

nums11 (Author) commented Jul 20, 2020

Thanks for the input :). Inserting a record with a duplicate key using put overwrites the previous record, so I assumed that putAll would have the same behavior for consistency.

Sorry for the miscommunication, this API is asynchronous and uses the existing 'onsuccess' and 'onerror' event handling present in IndexedDB. I've updated the explainer to be more clear on this.

nums11 (Author) commented Jul 20, 2020

Additionally, IndexedDB already has a separate API endpoint called add() that throws an error on duplicate key insertion, so I tried to stay away from that. In the future considerations section I mentioned an addAll() which could possibly have similar behavior to putAll() except it would throw an error on duplicate insertion like add().
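
For context, this is the existing put/add distinction in today's API that putAll/addAll would mirror; a small sketch assuming a "books" store with out-of-line keys:

```js
// Existing IndexedDB behavior; assumes a "books" store created without a keyPath.
const store = db.transaction('books', 'readwrite').objectStore('books');

// put() overwrites whatever record is already stored under the same key.
store.put({ title: 'Quarry Memories', author: 'Fred' }, 'key_1');
store.put({ title: 'Quarry Memories v2', author: 'Fred' }, 'key_1'); // replaces silently

// add() refuses duplicates: this request fires an 'error' event with a ConstraintError.
const req = store.add({ title: 'Quarry Memories v3', author: 'Fred' }, 'key_1');
req.onerror = (event) => {
  console.log(event.target.error.name); // "ConstraintError"
  event.preventDefault(); // stop the unhandled error from aborting the transaction
};
```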

cynthia (Member) commented Jul 20, 2020

> this API is asynchronous and uses the existing 'onsuccess' and 'onerror' event handling present in IndexedDB. I've updated the explainer to be more clear on this.

Thanks for following up on this. There is an inherent risk of inconsistency in the API surface, but could a non-event-based asynchronous design be considered here? I believe the event-based design of IDB predates idiomatic async patterns in JS, so it made sense back then, but it feels a bit strange to continue with that pattern for a new-ish API.

> In the future considerations section I mentioned an addAll() which could possibly have similar behavior to putAll() except it would throw an error on duplicate insertion like add().

We'll look into this and provide further feedback. Thank you for the info.

dmurph commented Jul 20, 2020

> this API is asynchronous and uses the existing 'onsuccess' and 'onerror' event handling present in IndexedDB. I've updated the explainer to be more clear on this.

> Thanks for following up on this. There is an inherent risk of inconsistency in the API surface, but could a non-event-based asynchronous design be considered here? I believe the event-based design of IDB predates idiomatic async patterns in JS, so it made sense back then, but it feels a bit strange to continue with that pattern for a new-ish API.

I think this is definitely an option. The main hurdle, in my opinion, is event ordering. If this were a promise, how would we order the execution of this promise vs. all of the other events? We have a lot of this defined by spec, and it could get complicated.

@inexorabletash you probably have the best context here. How difficult would it be to make this API return, say, a Promise spec-wise?

I know implementation-wise this would be a little bit rough. We would probably still have to use IDBRequest internally, and have it hold onto the promise resolve, to keep ordering correct with our auto-wrapper.
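
The userland analogue of that wrapping looks roughly like this (a sketch, not the spec mechanism; `promisifyRequest` and the "books" store are just illustrative):

```js
// Resolve/reject a Promise from the request's success/error events.
function promisifyRequest(request) {
  return new Promise((resolve, reject) => {
    request.onsuccess = () => resolve(request.result);
    request.onerror = () => reject(request.error);
  });
}

// Usage with today's put(); a promise-returning putAll would bake this in.
// Assumes a "books" store with out-of-line keys.
async function storeBook(db, book, key) {
  const store = db.transaction('books', 'readwrite').objectStore('books');
  return promisifyRequest(store.put(book, key));
}
```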

> In the future considerations section I mentioned an addAll() which could possibly have similar behavior to putAll() except it would throw an error on duplicate insertion like add().

> We'll look into this and provide further feedback. Thank you for the info.

@inexorabletash

There has been some exploration of a new API built on top of Indexed DB that introduces a Promise-based async API rather than an event-based one, kv-storage, but that incubation has not progressed.

Incorporating Promises into Indexed DB is non-trivial; see for example https://github.com/inexorabletash/indexeddb-promises, which describes some of the complexity around event loop / transaction integration. In short, doing it for a single method would introduce confusion for developers (as it would differ from other methods), and the transaction activation mechanism would need to be redesigned to accommodate it (transaction activation is closely tied to the event loop, not microtasks).
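
As a concrete illustration of that coupling (a sketch only; the "/enrich" endpoint and "books" store are made up), a transaction auto-commits once control returns to the event loop with no pending requests, which is why naively mixing awaits into a transaction breaks today:

```js
async function brokenBulkPut(db, records) {
  const tx = db.transaction('books', 'readwrite');
  const store = tx.objectStore('books'); // assumes out-of-line keys

  for (const record of records) {
    // Awaiting non-IndexedDB work lets the task complete with no pending
    // IDB request, so the transaction commits (becomes inactive) early...
    const enriched = await fetch('/enrich', {
      method: 'POST',
      body: JSON.stringify(record),
    }).then((response) => response.json());

    // ...and this put() then throws a TransactionInactiveError.
    store.put(enriched, record.key);
  }
}
```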

Practically speaking, Indexed DB is predominantly used indirectly: developers instead select libraries, many of which wrap the usage in promises. It's important to make sure that additions to the API surface can be integrated into those libraries, i.e. by following the same transaction model.

annevk (Member) commented Jul 21, 2020

I'm a little worried that the motivation here comes down to a slow binding layer. Is that the case?

Or is it really the primitive of bulk-insert-or-fail? I.e., either all the given records get inserted or none get inserted.

nums11 (Author) commented Jul 21, 2020

To answer your question, the primary motivations are partner requests and customer feedback. We hope for some small performance improvements, but those aren't certain; they would most likely come from amortizing the costs of certain operations, and this will have to be verified through testing.

It would be such that all the records get inserted or none get inserted.

kbsspl commented Jul 28, 2020

Would a putAll that accepts an option like returnDuplicateKeys: true, whose onsuccess returns the set of keys that already existed and were updated, be useful?

dmurph commented Jul 28, 2020

That's a great question. We are actually interested in these two ideas:

  1. putAll returns nothing as a result
  2. putAll returns all keys as a result

The duplicate keys idea could be an add-on to the second option.

What would you expect of the above return values?

kbsspl commented Aug 6, 2020

I would expect 1. to be a complete success, with all data either added or updated.
For returning results, both the updated and failed sets would be helpful.

Apologies for the delay in responding.

dmurph commented Aug 6, 2020

The API is being designed as all-or-nothing, so if one record fails then the whole set fails. Does this match your expectation? There shouldn't be some-failed-some-updated results right now, basically, as the whole API call would fail anyway (and return an error event).

It sounds like you would expect the list of keys, which we can do.
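
Here is a sketch of how that all-or-nothing model already surfaces with today's API (hypothetical "books" store with out-of-line keys): if any request's error event goes unhandled, the transaction aborts and none of the writes land.

```js
const tx = db.transaction('books', 'readwrite');
const store = tx.objectStore('books');

store.add({ title: 'Quarry Memories', author: 'Fred' }, 'key_1');
store.add({ title: 'Water Buffaloes', author: 'Fred' }, 'key_1'); // duplicate key

tx.oncomplete = () => console.log('every record committed');
tx.onabort = () => console.log('nothing committed:', tx.error); // ConstraintError here

// A failing record inside a putAll() batch would surface the same way:
// the request errors and the transaction aborts, rolling back earlier writes.
```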

kbsspl commented Aug 7, 2020

I find both all-or-nothing and partial updates to have genuine use cases, but personally I would expect a set of keys to be returned, with partial commits happening.

Additional thoughts:
At the risk of increasing the complexity of putAll usage early on, I would prefer it to be parameterized.

Parameters:
  • allOrPart = All/Part: the approach to take.
  • returnKeys = Success/Failure/Both: the sets to return.

"All": in this approach, if anything fails, the data itself is not committed.
For "All", "Success" returns all keys updated successfully (which is not very useful for an "All" approach and may be ignored). "Failure" returns the keys which failed to update for whatever reason. "Both" returns whatever keys were successful and whatever keys failed, to help mitigate.

"Part": in this approach, any data which can be successfully updated gets committed.
For "Part", "Success" returns only the keys updated successfully, "Failure" only the failed keys, and "Both" returns both the success and failed key sets.

Another point I can think of is that the definition of "Success" itself may need to be clarified.
Should everything that was written, irrespective of whether it was an addition or an update, be labeled a "Success", or are only "updates" considered a success?

And what is the definition of an "update"?
Is writing ("key_1", {title: "Quarry Memories", author: "Fred"}) to an ObjectStore which previously had ("key_1", {title: "Quarry Memories", author: "Fred"}), i.e. no difference between the incoming and existing record, an update?
vs
updating ("key_1", {title: "Quarry Memories", author: "Fred"}) to ("key_1", {title: "Quarry Memories original value", author: "Fred"}) in the ObjectStore?

I am not sure how well thought out this comment is, so please ignore it, or provide feedback to help me understand and see the context, if I'm not making sense.

dmurph commented Aug 7, 2020

Thanks for the feedback!

I guess the main difficulty with the functionality you're mentioning is that it conflicts with the existing patterns of the IndexedDB API. Currently, all calls to get or modify data in the database follow an asynchronous pattern where the result of the API call is an IDBRequest object, and the developer then listens for a 'success' or 'error' event on that request object.

We don't have any API calls that can 'partially succeed'; they all either succeed totally or fail totally. All requests are associated with a transaction, which in turn either commits atomically or fails and is aborted (and none of the changes end up in the database).

Regarding treating an update as changing vs. not changing the data, this also isn't an existing feature of the database. putAll is trying to model a bulk version of put, so at least initially we want to model what put does. The one restriction we DO provide is put vs. add, where add will error if a record already exists at the given key. It wouldn't be too hard for us to create an addAll as well with our current implementation; I'm not sure about other implementations.

A feature for knowing whether a record has been 'changed' might be valuable, but we haven't received any feedback that people want this. We only have feedback that people want a putAll.

To keep things consistent with the existing API patterns, and to keep this change simple, I think we want putAll* to be all-or-nothing, with no extra options. I think we should probably return the array of keys that were committed; this is useful for people who have auto-increment on (sketched below).

If you have use cases for the behaviors above, definitely let us know. The only one that might be 'faster' if the API supported it would probably be the constraint of telling the user whether the value was updated or not. Allowing some puts to fail is already supported by just using the put API as it exists today.
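
For the auto-increment case mentioned above, a hypothetical sketch of what returning the committed keys could look like (the result shape is not final, and the "logs" store is made up):

```js
// Assumes a "logs" store created with { autoIncrement: true }.
const store = db.transaction('logs', 'readwrite').objectStore('logs');

const request = store.putAll([
  { level: 'info', message: 'started' },
  { level: 'warn', message: 'low disk space' },
]);

request.onsuccess = () => {
  // The generated primary keys would be reported back, e.g. [1, 2],
  // mirroring how a single put() reports the key it wrote.
  console.log('committed keys:', request.result);
};
```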

kbsspl commented Aug 10, 2020

Thank you for such a detailed response. I understand the considerations better now.

cynthia added the Progress: propose closing label Aug 18, 2020
plinss (Member) commented Aug 19, 2020

Overall we're satisfied with this and ready to close. We just wanted to check whether other projects that build on top of IndexedDB, such as PouchDB, have been consulted and had an opportunity to provide feedback.

torgo (Member) commented Sep 22, 2020

Discussed and agreed to close at our f2f.

torgo closed this as completed Sep 22, 2020