Cache transactions & fixing addAll
#823
Comments
@wanderview threw this together on a flight, let me know if it's unreadable and I'll see if I can repair it
Can you share some real world use cases where this extra complexity was needed? Cache's simple API is an advantage over IDB, IMO.
Looking through the cache and deleting old items isn't a good enough example? We also need to explain what happens if two puts (or addAlls) happen at the same time.
In terms of simplicity, this doesn't replace regular …
I guess I'm more asking if this is something devs have asked for due to encountering the problem? Or is it all theory?
"Squashing" would need to be defined. Also, we reject addAll() in squashable cases today. Why change to support it?
People have asked for a way to look at the cache and decide what to delete. We don't support that safely right now.
I guess my first take is this is close to a rewrite for our cache impl to support this. I don't want to do that unless it's a real problem for devs.
Yeah, I hand-waved through squashing, and of course it would need to be defined. I'd be happy to reject if there are competing puts within a single transaction, but delete(request) put(request, response) should safely squash to just the put. I hand-waved through that as I didn't think it was make or break for the general idea. I can flesh that bit out if it is.
Can we explain what happens with multiple overlapping addAlls & puts in a simpler way? What does gecko currently do?
Can you explain the safety issue with this? For example, this seems somewhat safe today to me:
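The snippet referred to here was lost in extraction. It was presumably a check-then-delete pass over the cache; a minimal sketch of that pattern, using a Map-backed stand-in for the Cache so it runs anywhere (the `cache` object, its contents, and `isTooOld` are all illustrative assumptions, not the original code):

```javascript
// Map-backed stand-in for a Cache (assumption, for illustration only).
const store = new Map([
  ['/old', { age: 100 }],
  ['/fresh', { age: 1 }],
]);
const cache = {
  async keys() { return [...store.keys()]; },
  async match(request) { return store.get(request); },
  async delete(request) { return store.delete(request); },
};
// Hypothetical staleness check.
const isTooOld = response => response.age > 50;

// With a single thread and no interleaved writes, this looks safe:
async function cleanup() {
  for (const request of await cache.keys()) {
    const response = await cache.match(request);
    if (isTooOld(response)) await cache.delete(request);
  }
}

cleanup().then(async () => {
  console.log(await cache.keys()); // [ '/fresh' ]
});
```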
I agree this could have issues with many threads reading and writing, but right now we mainly have one thread: the service worker.
What gecko does is very simple: …
By the time you get to the .delete(request), the response that gets deleted may not be the response you decided was too old. Another may have been added in the meantime.
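A sketch of that race, with a Map standing in for the cache and the competing `put` made explicit (the names and the interleaving are illustrative assumptions):

```javascript
const store = new Map([['/a', 'old-response']]);
const cache = {
  async match(request) { return store.get(request); },
  async put(request, response) { store.set(request, response); },
  async delete(request) { return store.delete(request); },
};

async function evictIfOld(request) {
  const response = await cache.match(request);
  if (response === 'old-response') {
    // Another context wins the race between our match() and delete():
    await cache.put(request, 'new-response');
    // This now deletes 'new-response', not the response we judged too old:
    await cache.delete(request);
  }
}

evictIfOld('/a').then(() => console.log(store.has('/a'))); // false
```

The fresh entry written by the competing `put` is lost, even though the deletion decision was made about the old one.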
So we have responses of wildly varying size being stored under the same request? What kind of html resources have this characteristic? I'm probably wrong, but it seems to me resources are mostly stable in size over the short term. Also, they could use a vary header to avoid this situation, no? |
Does this mean:

```js
cache.put(request, response1);
cache.put(request, response2);
```

…response1 may overwrite response2 even though it was …
Yes, if response1 has a larger or slower body stream. |
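The overwrite hazard can be sketched by modelling a `put` that commits when its body stream finishes, not when it is called (the delays, the `put` model, and all names are illustrative assumptions):

```javascript
const store = new Map();
const delay = ms => new Promise(resolve => setTimeout(resolve, ms));

// Model: put() commits its entry only once the body stream completes.
async function put(request, response, bodyMs) {
  await delay(bodyMs); // body accumulation time
  store.set(request, response);
}

// response1 is issued first but has the slower body, so it commits last:
Promise.all([
  put('/a', 'response1', 30),
  put('/a', 'response2', 5),
]).then(() => console.log(store.get('/a'))); // 'response1'
```

Commit order follows body completion, so the earlier `put` wins over the later one.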
Does your proposal allow multiple overlapping transactions? I don't see anything that would block that.

Edit: Can you equate what you are asking for to the kind of transaction locking mechanisms supported in sqlite? https://www.sqlite.org/lang_transaction.html#immediate
Oh, I see Request Transactions now. Does each transaction wait for any previous transactions affecting the given request(s) to complete before starting? |
I'm not following this, but I'm pretty jetlagged right now (hence all the spelling and grammar mistakes - more than usual). Taking your example, but making it about age, not size:

```js
caches.open('foo').then(cache => {
  return cache.keys().then(requests => {
    return Promise.all(requests.map(request => {
      return cache.match(request).then(response => {
        return isTooOld(response) ? request : null;
      });
    }));
  }).then(requestsToDelete => {
    return Promise.all(requestsToDelete.map(request => {
      if (!request) return;
      return cache.delete(request);
    }));
  });
});
```

When we …
I guess that makes sense. If transactions wait for one another, and body streams can be js controlled in the future, do you worry about deadlocks?
Yeah, that's 5.i.a of Request transaction. The "wait" is a bit hand-wavey, but the intent is that …
If a transaction fails, are all operations rolled back to the previous state? Maybe I was assuming that as I can't find it in your proposal now. But I'm obviously having trouble reading tonight. |
What would you do to cause a deadlock this way? Deadlocking shouldn't be easy. There's also:

```js
caches.open('foo').then(cache => {
  return caches.transaction('foo').then(tx => {
    tx.waitUntil(cache.put(request, response));
  });
});
```

…which would deadlock. That's one of the reasons I put …
Start a put(), then lazy .match() a response to fill in the body for the response being putted. So a cache copy essentially. |
Hand-waved in Batch Cache Operations. Sorry, my laptop battery was running out when I got to this bit so it's a bit brief. We already handle …
This is a bit harder, but doable I guess. I will look more at the proposal before our meeting on Tuesday. I have to run now. |
I'm hoping squashing can always turn interleaved …
I think this depends on if we allow put(), match(), delete() within a transaction. If you can interleave reads and writes, then you cannot safely reorder writes.

Also keep in mind we have the rules about match()+delete() leaving the Response returned from the match() as a functioning object. Any body file storage needs to exist until the Response is gc'd. How does this work with a match() within a transaction? Are Response and Request objects normally kept alive past the end of the transaction? What about a Response from a rejected/rolled-back transaction?

I think I can accommodate keeping those Response objects alive in the gecko impl because I keep the body in a separate file from the sqlite database. I just need to track when the last use of any particular body file goes away and reap the file at that point. That may not be the case for all implementations, though.
F2F: change … We should work towards under-the-hood transactions to prevent races on writes. Transaction API v2. Talk to Josh.
Maybe it should be fixed here https://github.com/coonsta/cache-polyfill too (for older versions, i.e. before Chrome 50 and Firefox 46)? |
@NekR happy to merge a PR for it |
Make addAll() fail on responses other than OK in Firefox < 46 and Chrome < 50. Also, if addAll() had to be fixed even though native exists--polyfill add() method too. More details at w3c/ServiceWorker#823
PR is here: dominiccooney/cache-polyfill#19 Everyone is welcome to review :-) |
I retract the deleteResponse suggestion. I still think transactions combined with waitUntil support are super-powerful and may be used as a primitive for cool things that are unrelated to the intent of Cache, but transactions do seem like the best option for user code and implementer code. (And waitUntil is the only sane way to allow a transaction to do data-dependent things while holding the transaction open since body reads are async and don't happen directly from the Cache API, making IDB self-closing transaction semantics infeasible or just jerky.) Hooray for transactions! |
Use-case from #867 (comment): Allow a transactional form of …
F2F: need to make sure you can make a transaction across multiple caches. Need to make sure …
F2F: we can do timeouts via …
F2F: be explicit that …
Just got into this case where I may need a transaction: I have 2 hash maps: 1) stored on device from the previous SW; 2) in memory, which came with the new SW. On …
On …
The obvious problem here is that I shouldn't do operations directly on the original cache, because everything could be stopped in the middle of an operation (e.g. a hard power-off). I may be wrong about all of this, so please correct me.
Just to clarify: Imagine 100 small files in SW cache (http2). On each project release 2-3 files are changed. |
- Change add and addAll to reject if any of the responses are not in ok status
- Automatically disallow opaque responses and opaque-redirect responses
Chrome and Firefox differ in the order in which cache keys() are returned. Chrome orders according to when the put()s were issued. Firefox orders by when the body is complete. The test helper prepopulated_cache_test did not guarantee that these matched, leading to the tests being flaky in Firefox. This change tweaks the helper so that the put()s are processed serially so that the order is deterministic for both. Spec issue: w3c/ServiceWorker#823 BUG=655479 Review-Url: https://codereview.chromium.org/2806793002 Cr-Commit-Position: refs/heads/master@{#463195}
Note, I believe chrome is also non-deterministic when it comes to https://cache-api-perf.glitch.me/order.html This puts …

The difference between firefox and chrome is where the body is accumulated. In chrome it's accumulated in the renderer in memory and then added to the operation queue. In firefox the body is accumulated on the disk and then added to the operation queue. So if the body of one resource is slow to come over the network then both firefox and chrome can result in non-deterministic commit order. If the body is very large and slow to write to disk, then firefox may also have non-deterministic commit order.

Given that at least three browsers are non-deterministic, I wonder if we really need to add that kind of determinism to the spec. Particularly if other primitives like WebLocks are coming along which might allow the site to enforce ordering if it needs it.

Edit: It seems only edge results in …
I can't say I've heard of any complaints about ordering here. As you say, weblocks etc can provide it if needed. I'm happy to shelve the whole transactions thing. |
What we need to solve

`cache.put`, `cache.add` and `cache.addAll` have transaction-like properties that aren't well explained by the spec. Eg, what happens when two writes to the same request overlap? Should the second cache operation abort the first? If so what happens to the `pa` promise? Should `pb` wait until `pa` settles? This is not currently defined.

`cache.add` and `cache.addAll` are atomic, but you cannot ensure responses are ok, making it a bit of a footgun. Developers are surprised to hear this, despite it being consistent with `fetch()`. We should change `add` and `addAll` to reject if any of the responses are `!response.ok`, or if ok-ness cannot be determined (opaque response) - maybe include an option to revert to current behaviour. This change is safe, as the number of opaque responses being added via `add` & `addAll` rounds to 0.00%.

Use-cases
Adding multiple resources to the cache atomically. Similar to:
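The code sample here was lost in extraction. A sketch of its presumable shape, with a stubbed `fetchUrl` and a Map-backed cache so the non-atomicity is observable anywhere (all names are assumptions; the "/hello" and "/world" URLs come from the sentence below):

```javascript
// Stand-ins (assumptions): '/world' fails, '/hello' succeeds.
const fetchUrl = url =>
  url === '/world'
    ? Promise.reject(new Error('network failure'))
    : Promise.resolve('response for ' + url);
const store = new Map();
const cache = { async put(url, response) { store.set(url, response); } };

// Each URL is fetched and written independently - no atomicity:
Promise.all(
  ['/hello', '/world'].map(url =>
    fetchUrl(url).then(response => cache.put(url, response))
  )
).catch(() => {
  console.log([...store.keys()]); // [ '/hello' ] - cached despite overall failure
});
```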
The above isn't atomic, "/hello" can succeed & be written to the cache, but "/world" can fail.
Say I wanted to look into the cache and decide whether to delete entries based on their response:
The above is racy. The response we remove from the cache may not be `response`, as a `cache.put(request, differentResponse)` may have happened in a different thread.

Proposal: cache transactions
Aims:
Examples
Multiple `.put`s atomically (from the use-cases):

Look into the cache and decide whether to delete entries based on their response (from the use-cases):
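The example code for both lead-ins above was lost in extraction. Sketches of what they might look like under the proposed (unimplemented) API - `caches.transaction`, `tx.waitUntil`, `tx.put`, `tx.keys`, `tx.match` and `tx.delete` follow the shapes described under Detail below, and `isTooOld` is a hypothetical helper; none of this runs in any current browser:

```js
// Proposed API only - a sketch, not runnable today.

// Multiple puts, atomically:
caches.transaction('foo').then(tx => {
  tx.waitUntil(Promise.all(
    ['/hello', '/world'].map(url =>
      fetch(url).then(response => tx.put(url, response))
    )
  ));
  // If any fetch or put rejects, the transaction aborts and nothing commits.
});

// Inspect the cache and delete stale entries without racing other writers:
caches.transaction('foo').then(tx => {
  tx.waitUntil(tx.keys().then(requests => Promise.all(
    requests.map(request =>
      tx.match(request).then(response => {
        if (isTooOld(response)) return tx.delete(request);
      })
    )
  )));
});
```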
`cache.put` and `cache.delete` become shortcuts to creating transactions, eg: …

And our modified `addAll` can be defined (roughly): …

Detail
Some of this is a little rough. Apologies for lack of strictness, hopefully it gets the idea across.
A cache transaction has:

- …, a sequence of `Request`s, initially null
- …, a list of `CacheBatchOperation` dictionary objects

`caches.transaction(cacheName, options)`:

- … throw a `SecurityError` exception.
- Let `p` be a new promise.
- Resolve `options.limitTo` into requests
- As `caches.open`, aside from creating a `Cache`. Then once we have a request to response map…
- Let `pendingTransaction` be the result of running the Request transaction algorithm, passing in the request to response map and `options.limitTo`
- Let `cacheTransaction` be a …; resolve `p` with a new `CacheTransaction` associated with the cache transaction
- Let `waitFor` be waiting for all of cache transaction's extend lifetime promises … `pendingTransaction`'s closed flag … otherwise let `waitFor` be a promise resolved with no value
- Wait for `waitFor` to settle
- Set `pendingTransaction`'s closed flag
- If `pendingTransaction`'s aborted flag is set … remove `pendingTransaction` from request to response map's pending transactions … set `pendingTransaction`'s settled flag … reject `p` with a `TypeError`
- Otherwise resolve `p` with the result of running Batch Cache Operations, passing the `pendingTransaction`'s operations as the argument … `p`.

`cacheTransaction.match` & `cacheTransaction.matchAll` behave as they currently do, except:

- … `TypeError`
- … `TypeError` (TODO: or do we allow it, but developer should be aware it's racy?)

`cacheTransaction.keys` behaves as it currently does, except:

- … `TypeError`
- … `TypeError` (TODO: or… same question as above)

`cacheTransaction.put(request, response)`:

- … `TypeError`
- … `TypeError`
- As `.put` steps 1-11, but also abort the transaction before rejecting
- Add `o` to operations

`cacheTransaction.delete(request)`:

- … `TypeError`
- … `TypeError`
- As `.delete` steps 1-9
- Add `o` to operations

Request transaction:
Input

- `requestResponseMap`, a request to response map
- `lockedRequests`, a sequence of requests

Outputs

- `pendingTransaction`, a pending transaction

1. Let `currentPendingTransactions` be a copy of the pending transactions associated with `requestResponseMap`
2. Let `pendingTransaction` be a cache transaction
3. Set `pendingTransaction`'s locked requests to `lockedRequests`
4. Add `pendingTransaction` to `requestResponseMap`'s pending transactions
5. For each `pendingTransaction` in `currentPendingTransactions`:
   i. If `lockedRequests` is undefined, or if any of `lockedRequests` have the same url as one or more of the requests in `pendingTransaction`'s locked requests:
      a. Wait for `pendingTransaction`'s settled flag to become set
6. Return `pendingTransaction`
Abort transaction:

Input

- `cacheTransaction`, a cache transaction

Outputs none

1. Set `cacheTransaction`'s aborted flag
2. … `cacheTransaction`'s extend lifetime promises

Batch Cache Operations

Similar to how it is now, except:

- … (as `.put` is doing in step 14)

Questions
Creating a transaction: …

`cacheTransaction.put` could return synchronously; is it better to return a promise for consistency?

Should read operations like `cacheTransaction.match` consider pending tasks (operations) in their return values?

I guess `cacheTransaction.abort()` is worth adding (and easy).
is worth adding (and easy).Apologies for both the length and hand-waving in this proposal, but I'd rather not spend any more time on it if we think it's a no-go.