Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Treat responses to collapsed requests as fresh #1927

Conversation

rousskov
Copy link
Contributor

Squid transactions involving collapsed requests receive their responses
as Store cache hits. Cache hits undergo mandatory freshness validation
checks and, if those checks detect a stale response, may trigger
revalidation (e.g., an If-Modified-Since request to the origin server).
This logic results in a still-being-delivered-to-Squid response
triggering its own revalidation if that response is deemed stale on
arrival by collapsed request (e.g., has a Cache-Control: max-age=0
header field).

HTTP RFC 9111 Section 4.7 briefly mentions collapsed requests but is
ambiguous with regard to validation of responses to collapsed requests.
IETF HTTP Working Group chair believes that specs are unclear, and that
these responses should not be treated as cache hits (in primary cases):
https://lists.w3.org/Archives/Public/ietf-http-wg/2024JanMar/0095.html

This change follows the above advice and prevents arguably excessive
freshness checks for responses to collapsed requests. This change is
collapsed-forwarding specific: It does not prevent freshness checks for
responses that were, at the time of a hit request, either fully cached
or still receiving response body bytes.

After this change, clientReplyContext can no longer collapse twice, once
after initial Store lookup and then again during refresh, because the
first collapse now precludes refresh.

Squid transactions involving collapsed requests receive their responses
as Store cache hits. Cache hits undergo mandatory freshness validation
checks and, if those checks detect a stale response, may trigger
revalidation (e.g., an If-Modified-Since request to the origin server).
This logic results in still-being-delivered-to-Squid response triggering
its own revalidation if that response is deemed stale on arrival (e.g.,
has a Cache-Control: max-age=0 field).

HTTP RFC 9111 Section 4.7 briefly mentions collapsed requests but is
ambiguous with regard to validation of responses to collapsed requests.
IETF HTTP Working Group chair believes that specs are unclear, and that
these responses should not be treated as cache hits (in primary cases):
https://lists.w3.org/Archives/Public/ietf-http-wg/2024JanMar/0095.html

This change follows the above advice and prevents arguably excessive
freshness checks for responses to collapsed requests. This change does
not prevent freshness checks for responses that were still receiving
body bytes at the time of the hit request.

TODO: Can we reduce the risk of forgetting to check didCollapse when
calling refreshCheckHTTP(), especially if we add more refreshCheckHTTP()
callers?

TODO: Current code changes only cover a refreshCheckHTTP() call, but
what about other refreshCheck() callers (e.g., refreshCheckICP())?
XXX: A single clientReplyContext::didCollapse member cannot distinguish
which request (initial or refresh) got collapsed. Code "works" because
the new member is ignored by refresh transactions -- they do not consult
refreshCheckHTTP() -- but this ambiguity is a design defect. Fixing this
requires merging store_client class into StoreClient class while
refactoring clientReplyContext to move initial and refresh fetching code
into two distinct/dedicated classes, each derived from the merged
StoreClient class (instead of current complex/error-prone saveState()
and restoreState() hacks that use _same_ classes for both fetches).
Requires squid-daft commit 12ae6fdd or better.
Copy link
Contributor Author

@rousskov rousskov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comments in this review annotate this PR without requesting any changes.

@@ -58,6 +59,9 @@ class StoreClient: public Acl::ChecklistFiller
bool mayInitiateCollapsing() const { return onCollapsingPath(); }
/// whether Squid configuration allows collapsing for this transaction
bool onCollapsingPath() const;

/// whether startCollapsingOn() was called and returned true
mutable bool didCollapse = false;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A single clientReplyContext::didCollapse member cannot distinguish which request (initial or refresh) got collapsed. Code "works" because the new member is ignored by refresh transactions -- they do not consult refreshCheckHTTP() -- but this ambiguity is a design defect.

Fixing this requires merging store_client class into StoreClient class (I added a TODO about that) while refactoring clientReplyContext to move initial and refresh fetching code into two distinct/dedicated classes, each derived from the merged StoreClient class (instead of current complex/error-prone saveState() and restoreState() hacks that use the same classes for both fetches). That complex refactoring is outside this small PR scope, but I hope to be able to do it in the foreseeable future.

Comment on lines -984 to +987
} else if (refreshCheckHTCP(e, checkHitRequest.getRaw())) {
debugs(31, 3, "NO; cached response is stale");
} else if (e->hittingRequiresCollapsing() && !startCollapsingOn(*e, false)) {
debugs(31, 3, "NO; prohibited CF hit: " << *e);
} else if (!didCollapse && refreshCheckHTCP(e, checkHitRequest.getRaw())) {
debugs(31, 3, "NO; cached response is stale");
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Git diff does not make this obvious, but this PR reorders HTCP and ICP freshness check steps to use didCollapse member (set during startCollapsingOn() step) for avoiding refreshCheck*() step. This reordering should not have a significant effect in most cases because hittingRequiresCollapsing() quickly returns false for most transactions. In rare/exceptional collapsed forwarding cases, the new computation order may be slower (both checks may perform configuration-dependent ACL or ACL-like computations), but it is a fair price to pay for avoiding revalidation transactions in those cases.

@@ -627,7 +627,7 @@ clientReplyContext::cacheHit(const StoreIOBuffer result)
http->updateLoggingTags(LOG_TCP_MISS);
processMiss();
return;
} else if (!r->flags.internal && refreshCheckHTTP(e, r)) {
} else if (!r->flags.internal && !didCollapse && refreshCheckHTTP(e, r)) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Current implementation requires this refreshCheckHTTP() and other indirect refreshCheck() callers to remember to check StoreClient::didCollapse member. We should embed that check, so that it cannot be forgotten! Doing so requires adding a pure virtual StoreClient::refreshCheck() method and moving Foo-specific refreshCheckFoo() functions into its overrides. The latter, in turn, requires some refresh.h API refactoring. All that is outside this small PR scope.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

how about adding a TODO or XXX to that effect? PR comments are bound to be forgotten

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

how about adding a TODO or XXX to that effect?

Done in commit 9fc1ed5.

PR comments are bound to be forgotten

Yes, but a much bigger problem, in my experience, is that C++ comments like that often trigger negative votes and associated overheads, especially in areas where the existing code is quite messed up, obscuring underlying problems and proper solutions. C++ comments are also bound to become stale and, hence, misleading. That is why I only added two minimal C++ TODO/XXX comments and kept the rest of the discussion to PR comments. Let's hope it is going to be different this time.

@kinkie
Copy link
Contributor

kinkie commented Oct 31, 2024

LGTM with a nit

@rousskov rousskov added the S-waiting-for-reviewer ready for review: Set this when requesting a (re)review using GitHub PR Reviewers box label Oct 31, 2024
kinkie
kinkie previously approved these changes Oct 31, 2024
@kinkie kinkie added S-waiting-for-more-reviewers needs a reviewer and/or a second opinion and removed S-waiting-for-reviewer ready for review: Set this when requesting a (re)review using GitHub PR Reviewers box labels Oct 31, 2024
squid-anubis pushed a commit that referenced this pull request Nov 3, 2024
Squid transactions involving collapsed requests receive their responses
as Store cache hits. Cache hits undergo mandatory freshness validation
checks and, if those checks detect a stale response, may trigger
revalidation (e.g., an If-Modified-Since request to the origin server).
This logic results in a still-being-delivered-to-Squid response
triggering its own revalidation if that response is deemed stale on
arrival by collapsed request (e.g., has a Cache-Control: max-age=0
header field).

HTTP RFC 9111 Section 4.7 briefly mentions collapsed requests but is
ambiguous with regard to validation of responses to collapsed requests.
IETF HTTP Working Group chair believes that specs are unclear, and that
these responses should not be treated as cache hits (in primary cases):
https://lists.w3.org/Archives/Public/ietf-http-wg/2024JanMar/0095.html

This change follows the above advice and prevents arguably excessive
freshness checks for responses to collapsed requests. This change is
collapsed-forwarding specific: It does not prevent freshness checks for
responses that were, at the time of a hit request, either fully cached
or still receiving response body bytes.

After this change, clientReplyContext can no longer collapse twice, once
after initial Store lookup and then again during refresh, because the
first collapse now precludes refresh.
@squid-anubis squid-anubis added M-waiting-staging-checks https://github.com/measurement-factory/anubis#pull-request-labels M-passed-staging-checks https://github.com/measurement-factory/anubis#pull-request-labels and removed M-waiting-staging-checks https://github.com/measurement-factory/anubis#pull-request-labels M-passed-staging-checks https://github.com/measurement-factory/anubis#pull-request-labels labels Nov 3, 2024
squid-anubis pushed a commit that referenced this pull request Nov 5, 2024
Squid transactions involving collapsed requests receive their responses
as Store cache hits. Cache hits undergo mandatory freshness validation
checks and, if those checks detect a stale response, may trigger
revalidation (e.g., an If-Modified-Since request to the origin server).
This logic results in a still-being-delivered-to-Squid response
triggering its own revalidation if that response is deemed stale on
arrival by collapsed request (e.g., has a Cache-Control: max-age=0
header field).

HTTP RFC 9111 Section 4.7 briefly mentions collapsed requests but is
ambiguous with regard to validation of responses to collapsed requests.
IETF HTTP Working Group chair believes that specs are unclear, and that
these responses should not be treated as cache hits (in primary cases):
https://lists.w3.org/Archives/Public/ietf-http-wg/2024JanMar/0095.html

This change follows the above advice and prevents arguably excessive
freshness checks for responses to collapsed requests. This change is
collapsed-forwarding specific: It does not prevent freshness checks for
responses that were, at the time of a hit request, either fully cached
or still receiving response body bytes.

After this change, clientReplyContext can no longer collapse twice, once
after initial Store lookup and then again during refresh, because the
first collapse now precludes refresh.
@squid-anubis squid-anubis added M-waiting-staging-checks https://github.com/measurement-factory/anubis#pull-request-labels and removed M-waiting-staging-checks https://github.com/measurement-factory/anubis#pull-request-labels labels Nov 5, 2024
squid-anubis pushed a commit that referenced this pull request Nov 5, 2024
Squid transactions involving collapsed requests receive their responses
as Store cache hits. Cache hits undergo mandatory freshness validation
checks and, if those checks detect a stale response, may trigger
revalidation (e.g., an If-Modified-Since request to the origin server).
This logic results in a still-being-delivered-to-Squid response
triggering its own revalidation if that response is deemed stale on
arrival by collapsed request (e.g., has a Cache-Control: max-age=0
header field).

HTTP RFC 9111 Section 4.7 briefly mentions collapsed requests but is
ambiguous with regard to validation of responses to collapsed requests.
IETF HTTP Working Group chair believes that specs are unclear, and that
these responses should not be treated as cache hits (in primary cases):
https://lists.w3.org/Archives/Public/ietf-http-wg/2024JanMar/0095.html

This change follows the above advice and prevents arguably excessive
freshness checks for responses to collapsed requests. This change is
collapsed-forwarding specific: It does not prevent freshness checks for
responses that were, at the time of a hit request, either fully cached
or still receiving response body bytes.

After this change, clientReplyContext can no longer collapse twice, once
after initial Store lookup and then again during refresh, because the
first collapse now precludes refresh.
@squid-anubis squid-anubis added M-waiting-staging-checks https://github.com/measurement-factory/anubis#pull-request-labels M-passed-staging-checks https://github.com/measurement-factory/anubis#pull-request-labels and removed M-waiting-staging-checks https://github.com/measurement-factory/anubis#pull-request-labels M-passed-staging-checks https://github.com/measurement-factory/anubis#pull-request-labels labels Nov 5, 2024
squid-anubis pushed a commit that referenced this pull request Nov 5, 2024
Squid transactions involving collapsed requests receive their responses
as Store cache hits. Cache hits undergo mandatory freshness validation
checks and, if those checks detect a stale response, may trigger
revalidation (e.g., an If-Modified-Since request to the origin server).
This logic results in a still-being-delivered-to-Squid response
triggering its own revalidation if that response is deemed stale on
arrival by collapsed request (e.g., has a Cache-Control: max-age=0
header field).

HTTP RFC 9111 Section 4.7 briefly mentions collapsed requests but is
ambiguous with regard to validation of responses to collapsed requests.
IETF HTTP Working Group chair believes that specs are unclear, and that
these responses should not be treated as cache hits (in primary cases):
https://lists.w3.org/Archives/Public/ietf-http-wg/2024JanMar/0095.html

This change follows the above advice and prevents arguably excessive
freshness checks for responses to collapsed requests. This change is
collapsed-forwarding specific: It does not prevent freshness checks for
responses that were, at the time of a hit request, either fully cached
or still receiving response body bytes.

After this change, clientReplyContext can no longer collapse twice, once
after initial Store lookup and then again during refresh, because the
first collapse now precludes refresh.
@squid-anubis squid-anubis added M-passed-staging-checks https://github.com/measurement-factory/anubis#pull-request-labels and removed M-passed-staging-checks https://github.com/measurement-factory/anubis#pull-request-labels labels Nov 14, 2024
squid-anubis pushed a commit that referenced this pull request Nov 16, 2024
Squid transactions involving collapsed requests receive their responses
as Store cache hits. Cache hits undergo mandatory freshness validation
checks and, if those checks detect a stale response, may trigger
revalidation (e.g., an If-Modified-Since request to the origin server).
This logic results in a still-being-delivered-to-Squid response
triggering its own revalidation if that response is deemed stale on
arrival by collapsed request (e.g., has a Cache-Control: max-age=0
header field).

HTTP RFC 9111 Section 4.7 briefly mentions collapsed requests but is
ambiguous with regard to validation of responses to collapsed requests.
IETF HTTP Working Group chair believes that specs are unclear, and that
these responses should not be treated as cache hits (in primary cases):
https://lists.w3.org/Archives/Public/ietf-http-wg/2024JanMar/0095.html

This change follows the above advice and prevents arguably excessive
freshness checks for responses to collapsed requests. This change is
collapsed-forwarding specific: It does not prevent freshness checks for
responses that were, at the time of a hit request, either fully cached
or still receiving response body bytes.

After this change, clientReplyContext can no longer collapse twice, once
after initial Store lookup and then again during refresh, because the
first collapse now precludes refresh.
@squid-anubis squid-anubis added M-waiting-staging-checks https://github.com/measurement-factory/anubis#pull-request-labels M-passed-staging-checks https://github.com/measurement-factory/anubis#pull-request-labels and removed M-waiting-staging-checks https://github.com/measurement-factory/anubis#pull-request-labels M-passed-staging-checks https://github.com/measurement-factory/anubis#pull-request-labels labels Nov 16, 2024
squid-anubis pushed a commit that referenced this pull request Nov 22, 2024
Squid transactions involving collapsed requests receive their responses
as Store cache hits. Cache hits undergo mandatory freshness validation
checks and, if those checks detect a stale response, may trigger
revalidation (e.g., an If-Modified-Since request to the origin server).
This logic results in a still-being-delivered-to-Squid response
triggering its own revalidation if that response is deemed stale on
arrival by collapsed request (e.g., has a Cache-Control: max-age=0
header field).

HTTP RFC 9111 Section 4.7 briefly mentions collapsed requests but is
ambiguous with regard to validation of responses to collapsed requests.
IETF HTTP Working Group chair believes that specs are unclear, and that
these responses should not be treated as cache hits (in primary cases):
https://lists.w3.org/Archives/Public/ietf-http-wg/2024JanMar/0095.html

This change follows the above advice and prevents arguably excessive
freshness checks for responses to collapsed requests. This change is
collapsed-forwarding specific: It does not prevent freshness checks for
responses that were, at the time of a hit request, either fully cached
or still receiving response body bytes.

After this change, clientReplyContext can no longer collapse twice, once
after initial Store lookup and then again during refresh, because the
first collapse now precludes refresh.
@squid-anubis squid-anubis added M-waiting-staging-checks https://github.com/measurement-factory/anubis#pull-request-labels M-passed-staging-checks https://github.com/measurement-factory/anubis#pull-request-labels and removed M-waiting-staging-checks https://github.com/measurement-factory/anubis#pull-request-labels M-passed-staging-checks https://github.com/measurement-factory/anubis#pull-request-labels labels Nov 22, 2024
squid-anubis pushed a commit that referenced this pull request Nov 23, 2024
Squid transactions involving collapsed requests receive their responses
as Store cache hits. Cache hits undergo mandatory freshness validation
checks and, if those checks detect a stale response, may trigger
revalidation (e.g., an If-Modified-Since request to the origin server).
This logic results in a still-being-delivered-to-Squid response
triggering its own revalidation if that response is deemed stale on
arrival by collapsed request (e.g., has a Cache-Control: max-age=0
header field).

HTTP RFC 9111 Section 4.7 briefly mentions collapsed requests but is
ambiguous with regard to validation of responses to collapsed requests.
IETF HTTP Working Group chair believes that specs are unclear, and that
these responses should not be treated as cache hits (in primary cases):
https://lists.w3.org/Archives/Public/ietf-http-wg/2024JanMar/0095.html

This change follows the above advice and prevents arguably excessive
freshness checks for responses to collapsed requests. This change is
collapsed-forwarding specific: It does not prevent freshness checks for
responses that were, at the time of a hit request, either fully cached
or still receiving response body bytes.

After this change, clientReplyContext can no longer collapse twice, once
after initial Store lookup and then again during refresh, because the
first collapse now precludes refresh.
@squid-anubis squid-anubis added M-waiting-staging-checks https://github.com/measurement-factory/anubis#pull-request-labels M-passed-staging-checks https://github.com/measurement-factory/anubis#pull-request-labels and removed M-waiting-staging-checks https://github.com/measurement-factory/anubis#pull-request-labels M-passed-staging-checks https://github.com/measurement-factory/anubis#pull-request-labels labels Nov 23, 2024
@rousskov
Copy link
Contributor Author

I plan to clear this previously approved PR for merging on November 27, 2024.

squid-anubis pushed a commit that referenced this pull request Nov 27, 2024
Squid transactions involving collapsed requests receive their responses
as Store cache hits. Cache hits undergo mandatory freshness validation
checks and, if those checks detect a stale response, may trigger
revalidation (e.g., an If-Modified-Since request to the origin server).
This logic results in a still-being-delivered-to-Squid response
triggering its own revalidation if that response is deemed stale on
arrival by collapsed request (e.g., has a Cache-Control: max-age=0
header field).

HTTP RFC 9111 Section 4.7 briefly mentions collapsed requests but is
ambiguous with regard to validation of responses to collapsed requests.
IETF HTTP Working Group chair believes that specs are unclear, and that
these responses should not be treated as cache hits (in primary cases):
https://lists.w3.org/Archives/Public/ietf-http-wg/2024JanMar/0095.html

This change follows the above advice and prevents arguably excessive
freshness checks for responses to collapsed requests. This change is
collapsed-forwarding specific: It does not prevent freshness checks for
responses that were, at the time of a hit request, either fully cached
or still receiving response body bytes.

After this change, clientReplyContext can no longer collapse twice, once
after initial Store lookup and then again during refresh, because the
first collapse now precludes refresh.
@squid-anubis squid-anubis added M-waiting-staging-checks https://github.com/measurement-factory/anubis#pull-request-labels M-passed-staging-checks https://github.com/measurement-factory/anubis#pull-request-labels and removed M-waiting-staging-checks https://github.com/measurement-factory/anubis#pull-request-labels labels Nov 27, 2024
@rousskov rousskov added M-cleared-for-merge https://github.com/measurement-factory/anubis#pull-request-labels and removed S-waiting-for-more-reviewers needs a reviewer and/or a second opinion labels Nov 30, 2024
@squid-anubis squid-anubis added M-merged https://github.com/measurement-factory/anubis#pull-request-labels and removed M-passed-staging-checks https://github.com/measurement-factory/anubis#pull-request-labels M-cleared-for-merge https://github.com/measurement-factory/anubis#pull-request-labels labels Nov 30, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
M-merged https://github.com/measurement-factory/anubis#pull-request-labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants