Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

getAll() with both key and value (or index key and primary key) #206

Open
inexorabletash opened this issue Jun 5, 2017 · 20 comments
Open
Labels
feature request has-pr queries TPAC 2019 Discuss at Fukuoka TPAC2024 Topic for discussion at TPAC 2024
Milestone

Comments

@inexorabletash
Copy link
Member

As a cursor replacement, getAll()/getAllKeys() has the issue that you need to pick - you can't get both the key and the value. For object stores this isn't so bad if inline keys are used, but still awkward. For indexes you can't even get the primary key.

Here's an example where getAll() could help but needs index key and primary key

https://stackoverflow.com/questions/44349168/speeding-up-indexeddb-search-with-multiple-workers

@inexorabletash
Copy link
Member Author

A good name surfaced over in WICG/kv-storage#63 -

getAllEntries()

... which sort-of aligns with ECMAScript's array iterators: values(), keys(), and entries()

It would produce an array of two-element arrays.

@inexorabletash inexorabletash added the TPAC 2019 Discuss at Fukuoka label Jun 5, 2019
@brettz9
Copy link
Contributor

brettz9 commented Jun 5, 2019

Which could nicely be fed into Object.fromEntries, passed into a Map, etc.

@inexorabletash
Copy link
Member Author

Object stores are easy:

  • getAllEntries() → [[primaryKey, value], ...]

Indexes are less straightforward:

  • ??? → [[primaryKey, value], ...]
  • ??? → [[indexKey, primaryKey], ...]
  • ??? → [[indexKey, primaryKey, value], ...]

Having getAllEntries() on an Index produce [[primaryKey, value], ...] is closest to how getAllXXX() work on Indexes today, but there are use cases for the other permutations as well.

@inexorabletash
Copy link
Member Author

I put a PR up for a straightforward version of getAllEntries() but see above re: index use cases. No consensus on this yet, I was mostly playing with refactoring the methods.

@inexorabletash
Copy link
Member Author

inexorabletash commented Sep 20, 2019

TPAC 2019 Web Apps breakout:

  • Consensus that we should solve this; calling getAll() and getAllKeys() and correlating is unfortunately common
  • Object stores: getAllEntries() looks good
  • Indexes: no good intuition

Follow-up: refine proposal for indexes

@inexorabletash
Copy link
Member Author

Staring at this a bit more, I'd be tempted to have indexes produce:

  • [[primaryKey, value, indexKey], ... ]

i.e. if you only pay attention to the first two elements, this looks just like getAllEntries() and can be used with Map constructor. If you need the index key it's there.

@inexorabletash
Copy link
Member Author

inexorabletash commented Dec 16, 2019

Sketching this out a bit more:

store.getAllEntries({
   query: key_or_range,
   count: limit,
   direction: next_or_prev
});
// yields: [ [ key1, value1 ], [ key2, value2 ], ... ]

index.getAllEntries({
   query: key_or_range,
   count: limit,
   direction: next_or_prev
}); 
// yields: [ [ primaryKey1, value1, indexKey1 ], [ primaryKey2, value2, indexKey2 ], ... ]

Note that this still doesn't allow trivial paging of results i.e. with a "skip".

  • All options are optional
  • The direction option would be IDBCursorDirection limited to "prev" or "next" (throwing otherwise).
  • Open question: when direction is "prev" are the results in forward or reverse order?

@masterkidan
Copy link

By default IndexedDb seems to store by ascending order, so I guess prev would be descending or reverse order. This should be fairly straightforward to do with leveldb, I have the changes locally (albeit against an earlier version of chromium) if this change has traction, then am willing to go ahead and open a PR for chromium to add it in, need to see what should be done for FF/Webkit browsers though.

On a related note, this is something that would be extremely useful with messaging apps, by default we store messages in ascending time order (next cursor) but we often need to show them in descending time order in the UI (latest message first or by using the prev cursor).

@stefanhaustein
Copy link

Indexes are less straightforward:

  • ??? → [[primaryKey, value], ...]
  • ??? → [[indexKey, primaryKey], ...]
  • ??? → [[indexKey, primaryKey, value], ...]

For the index case, in my opinion, option 2 (indexKey, primaryKey) would be best, as this option would allow us to obtain the index keys for a range without the overhead of fetching full values.

I am not sure if this is covered by "key" value for the "query" option? Would be great if this could be clarified!

@SteveBeckerMSFT
Copy link

TPAC 2024: We discussed the getAllEntries() proposal referenced above, which we agreed is a good addition that should not be complex for browsers to implement. We discussed how this proposal addresses some of the limitations reported by developers in this issue, including:

  • Getting primary keys, values and indexes at the same time.
  • Adding a direction option to support reverse iteration.
  • Can reduce event loop overhead to potentially improve performance.

I've published an explainer that goes into more detail.

@evanstade
Copy link
Contributor

the spec consistently uses the term "record" rather than "entry". @inexorabletash wdyt of getAllRecords()?

@evanstade
Copy link
Contributor

Also, why: getAllEntries() → [[primaryKey, value], ...] instead of getAllEntries() → [{primaryKey: value}, ...]?

@inexorabletash
Copy link
Member Author

For both of the above comments (naming, structure) I was heavily influenced by ECMAScript's Map.prototype.entries(), which was designed to play well with the (at the time) newly introduced destructuring syntax i.e.: for (const [k, v] of map.entries()) { console.log(k, v); }

... and if I recall correctly, this was before JS supported destructuring of objects, i.e. for (const {k, v} of ...) ...

Revisiting the proposed API to improve ergonomics is fine! Consistency with JS should be a factor, but not foolish consistency.

@abhishek-shanthkumar
Copy link
Contributor

+1 about "record" being a better term for getAll...() when applied to regular object stores. But for indexes, IIUC, "record" => [indexKey, primaryKey]. If we're agreed on returning [primaryKey, value, indexKey] from IDBIndex.getAll...(), then the contract of the new API isn't exactly about returning the records as-is from the underlying (regular/index) object stores, but about returning a collection of attributes that we hope to be useful for both types of stores, and it just so happens to be the same as the record for regular object stores. From that perspective, it maybe better to use a new term, and "entry" could serve that purpose.

@evanstade
Copy link
Contributor

Fair points by @abhishek-shanthkumar

On naming: IDBIndex's getAllKeys() already uses "key" to refer to the object store record's key (i.e. primary key), not the index's key. Likewise getAll() returns referenced values i.e. the value half of an object store record. So the established getAllXXX terminology is relative to the referenced object store. (Well technically getAll() should probably be called getAllValues() and then we could just call the new thing getAll(), but anyway...)

On form:

If we're agreed on returning [primaryKey, value, indexKey]

I think we are aligned on that function but perhaps not the form. It seems to me a bit unclean thing to return an array of a particular length where you just have to know which element is which thing, as opposed to an object such as

{`primaryKey`: primaryKey, `referencedValue`: value, `indexKey`: indexKey}

or

{`indexKey`: indexKey, 'record': {key: value}}

@SteveBeckerMSFT
Copy link

SteveBeckerMSFT commented Oct 22, 2024

A few comments:

  1. About the form { key: value }, I don't think this will work for all types of IndexedDB keys. For example, binary keys and array keys may cause issues.

  2. For consistency, should we add a record property to IDBCursorWithValue? This property could return the same type as the equivalent value in the getAllRecords() result array?

Here are a few options we could consider for the form of the results:

  1. Continuing with getAllEntries(), update IDBIndex::getAllEntries() to return an array of nested entries:

[[ indexKey1, [ primaryKey1, value1]], [ indexKey2, [ primaryKey2, value2]], ... ]

Advantages:

  • Consistency: getAllEntries() returns an array of entries for both IDBObjectStore and IDBIndex.
  • Compatible with existing ECMAScript entries. Like @inexorabletash pointed out above, developers may directly use the results to construct a Map or Object.

Disadvantages:

  • Duplicate keys exist in the entries array for indexes with non-unique keys.
  • Unusual ergonomics. Like @evanstade pointed out above; to access the value of the second record, developers would use results[1][1][1], which is not very intuitive.
  1. Synthesizing the comments above with the existing IDBCursor interface, I propose the following form:
[  
{  'key': key1, 'primaryKey': primaryKey1, 'value': value1 },
{  'key': key2, 'primaryKey': primaryKey2, 'value': value2}, 
... 
]

Like an IDBCursor, for IDBObjectStore, key and primaryKey return the same value, which is the record's primary key. For IDBIndex, key returns the index key while primaryKey returns the record's primary key.

Continuing the example above, developers would use the following to access the value of the second record: results[1].value.

Like @inexorabletash pointed out above, developers have a lot of flexibility when using Array.map() with destructuring. For example, developers can construct a Map:

// Transform the results from `getAllRecords()` into an array of entries: 
// [ [ primaryKey1, { 'value': value1, 'indexKey': indexKey1 }], [ primaryKey2, { 'value': value2, ... ], ... ]
const map = new Map(get_all_records_results.map(
    ({primaryKey, value, key}) => [primaryKey, { value, 'indexKey': key}]));

// Use the map to access record values and index keys.
const value = map.get(primaryKey).value; 
const indexKey = map.get(primaryKey).indexKey; 

I plan to update the explainer and prototype to use this format, but feedback is very welcome! I'm happy to continue to iterate.

@evanstade
Copy link
Contributor

Conclusions above SGTM.

For consistency, should we add a record property to IDBCursorWithValue? This property could return the same type as the equivalent value in the getAllRecords() result array?

Isn't the IDBCursorWithValue already more or less the record (or a superset of the record)? We could maybe add an asRecord() convenience method, not sure how much demand there is though.

@jyasskin
Copy link
Member

jyasskin commented Nov 4, 2024

Is there something that distinguishes an IDB "record" from the "entries" in Object.entries and Map.entries? If there's not, I suggest using the name the rest of the platform has already used.

@inexorabletash
Copy link
Member Author

Also, while "record" is a spec term, I don't think it has made it into the API yet so there isn't a consistency argument.

@evanstade
Copy link
Contributor

while "record" is a spec term, I don't think it has made it into the API yet so there isn't a consistency argument.

My main argument here is that the spec and MDN (e.g.) both consistently use the term "record". Other spec terms tend to map directly into the API symbol names. To do otherwise would make it harder to reference the spec or MDN as documentation. But you are correct that if only looking at API symbol names, "record" hasn't been used yet.

Is there something that distinguishes an IDB "record" from the "entries" in Object.entries and Map.entries?

Here are a few:

  • Conceptually, records are immutable. You can only read them, delete them, or replace them.
  • Concretely
    • IDB records cannot be accessed synchronously
    • so far we're planning to use a different format for the data returned by getAllRecords as compared to Map.entries or Object.entries.

An IDB record is still conceptually a key/value pair, but "record" I think more strongly evokes exactly what it is, i.e. a "row in a database", than does "entry". TBH, either term work, so it feels like a bit of a bike shed and it's probably not possible to make a wrong decision here, but what is the concrete advantage of using "entry"?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feature request has-pr queries TPAC 2019 Discuss at Fukuoka TPAC2024 Topic for discussion at TPAC 2024
Projects
None yet
Development

No branches or pull requests

8 participants