Restrict allowed key types #39

pwnall · 2018-10-17T09:00:26Z

IndexedDB supports a fairly large set of key types, and comparing between them isn't very intuitive. How about supporting an explicit subset?

Proposal:

number
string
(maybe) exactly one type of typed array; Uint8Array seems the most intuitive

The ordering has to remain the same as in IndexedDB, if we want the option of async iterators over keys / values.

Dates can be serialized to numbers when passed in -- I think this would be consistent with WebIDL.

The main missing element here is Array objects. This does preclude some nice use cases, but I claim those are more advanced. If I'm wrong, we can always expand the set of supported types, whereas it's harder / impossible to narrow it.

domenic · 2018-10-17T09:07:58Z

Previous discussion in #2. To be clear, you are only talking about restricting keys, not values?

One thing I'm missing is, what are the advantages of restricting?

pwnall · 2018-10-17T09:37:10Z

Sorry, I have no idea how I missed #2. Would you prefer that I close this and comment there?

I haven't opened an issue for values yet because I'm still thinking through that. I think we can mostly discuss keys and values separately.

I'd like to discuss this for two reasons:

Databases (especially Chrome's backing store) perform better with small (<= thousands of bytes) keys. Arrays invite larger keys.
I'd prefer a simpler conceptual model to a more complicated one, and fewer key / value types help there, IMHO

If we adopt all types right away, I'd prefer that we have good use cases. I can see reasons for number / string keys right away. I can see raw buffers used to squeeze more storage out of the DB.

pwnall · 2018-10-17T09:38:23Z

Having written the above, I'm starting to think that I'd rather that we limit key sizes instead of types. What are your thoughts on the two axes?

domenic · 2018-10-17T09:58:06Z

No worries, a new fresh issue seems like a good idea in this case.

I'm not a heavy IDB user, but array keys seem really convenient for "tuple"-keyed storage. I'd love to get more feedback, either from users or from implementers who have worked with IDB users, to either substantiate or refute my intuition here. I agree that we could start out small though and expand upon encountering demand.

Restricting sizes seems pretty reasonable, but is there an easy way to determine key size? Maybe what we want here is a new option for IDB databases which allows them to reject all keys over a certain size? Is key size computation interoperable?

pwnall · 2018-10-17T10:32:22Z

Answering the last question: key serialization is implementation-dependent. To be consistent across browsers, we'd have to define some cost model and use that.

We can have a simple and efficient model as long as we leave Arrays out -- numbers would be in, strings and native arrays would have caps on their lengths / buffer sizes. Once we add Arrays, we'd have to come up with a full cost model, so the cost of an Array would be the sum of its elements + fixed overhead.

This is additional complexity for the implementation and spec. As a high-level API, I think it's worth taking up the extra complexity in order to keep developers on a "happy" path.

vweevers · 2018-10-17T10:49:38Z

I'm not sure how relevant this is, but array keys could be useful for compound indexes. For example [author, publisher] when storing books. Then, through key ranges you can answer queries like "give me all books by Roald Dahl", e.g. greater than ["Roald Dahl", ""].

I haven't used IndexedDB exactly like this, because my background is in Level (and level-js which wraps IndexedDB) where things like this are usually handled on a higher level - IndexedDB only sees string or binary keys. That said, compound keys are very powerful conceptually.

I do agree with:

IndexedDB supports a fairly large set of key types, and comparing between them isn't very intuitive

The approach we took in level-js is: you can use any key type you want. If you want to normalize types (e.g. serialize numbers to strings) or avoid the cost of the structured clone algorithm, then we recommend to wrap level-js with encoding-down. Array keys can then be encoded into binary keys with bytewise for example, which has a more intuitive sorting order. This encoding has a performance cost too, which might be an acceptable trade-off for a given app. So TLDR; we leave the choice of key type to the user.

asutherland · 2018-10-22T05:58:44Z

I think any sizing logic would want to go in the HTML standard with the StructuredSerialize algorithm and friends. This could be quite useful in other specs like the Notifiations API which uses the algorithm for Notification.data but does not impose any size limits. As APIs continue to add more spaces to stash user data, it could be useful to bound the data size for implementation sanity and to limit the impacts of broken or malicious sites.

If we did this, perhaps it would make sense to also spec the key size limit for IndexedDB as an immutable limitation from when a database is first opened. This would avoid edge-cases related to direct manipulation of the backing store, and potentially let IndexedDB implementations do some optimizations. Without this, we'd want to spec whether "illegal" key/value pairs are hidden from the async-local-storage API, etc.

Edit: Note that I had made the above comments previously as two separate comments that it looked like Github had eaten due to bad hotel wi-fi. They just appeared when I submitted this comment, and so I've deleted them since their thoughts have been superseded.

tophf · 2019-04-25T05:13:14Z

A use case for Uint8Array keys: encoding string keys in UTF8 manually via TextEncoder/TextDecoder to reduce the disk size of the db because at least in Chrome the string keys are always stored in UCS2 (two bytes per character), unlike values which get automatically compressed into one byte ASCII when possible.

asutherland · 2019-04-25T15:44:50Z

While it's unfortunately not unusual to optimize to a specific implementation, altering API design due to a single browser's implementation decisions runs counter to the point of web standards and requiring multiple implementations.

tophf · 2019-04-25T16:16:57Z

I wasn't suggesting anything, nothing needs to be changed AFAICT, it was just an informational side note confirming the proposed inclusion of Uint8Array.

domenic mentioned this issue Mar 13, 2019

Looking for feedback on async local storage in the browser localForage/localForage#791

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Restrict allowed key types #39

Restrict allowed key types #39

pwnall commented Oct 17, 2018

domenic commented Oct 17, 2018

pwnall commented Oct 17, 2018

pwnall commented Oct 17, 2018

domenic commented Oct 17, 2018

pwnall commented Oct 17, 2018

vweevers commented Oct 17, 2018

asutherland commented Oct 22, 2018 •

edited

Loading

tophf commented Apr 25, 2019

asutherland commented Apr 25, 2019

tophf commented Apr 25, 2019

Restrict allowed key types #39

Restrict allowed key types #39

Comments

pwnall commented Oct 17, 2018

domenic commented Oct 17, 2018

pwnall commented Oct 17, 2018

pwnall commented Oct 17, 2018

domenic commented Oct 17, 2018

pwnall commented Oct 17, 2018

vweevers commented Oct 17, 2018

asutherland commented Oct 22, 2018 • edited Loading

tophf commented Apr 25, 2019

asutherland commented Apr 25, 2019

tophf commented Apr 25, 2019

asutherland commented Oct 22, 2018 •

edited

Loading