Skip to content

Commit

Permalink
Update default cache-control headers
Browse files Browse the repository at this point in the history
The default cache-control instruction will be:

max-age=60, stale-while-revalidate=86400, stale-if-error=300

Each can be overridden individually, leaving the others intact, or each can
be set to `None` to remove it.

Rationale
---

**max-age=60**

Hard-cache for a minimal amount of time so content can be easily
refreshed.

The considerations here are as follows:
- To avoid caching headaches, it's best if when people need to
  refresh a page (e.g. after they've just done a release) they
  can do so by simply refreshing their browser.
- However, it needs to be long enough to protect our
  applications from excessive requests from all the pages being
  generated (a factor)

1 minute seems like a good compromise.

**stale-while-revalidate=86400**

stale-while-revalidate defines a period after the cache has
expired (max-age) during which users will get sent the stale
cache, while the cache updates in the background. This mean
that users will continue to get speedy (potentially stale)
responses even if the refreshing of the cache takes a while,
but the content should still be no more than a few seconds
out of date.

We want this to be pretty long, so users will still get a
quick response (while triggering a background update) for
as long as possible after the cache has expired.

An additional day will hopefully be long enough for most cases.

**stale-if-error=300**

stale-if-error defines a period of time during which a stale
cache will be served back to the client if the cache observes
an error code response (>=500) from the backend. When the cache
receives a request for an erroring page, after serving the stale
page it will ping off a background request to attempt the
revalidate the page, as long as it's not currently waiting for a
response.

We set this value to protect us from transitory errors. The
trade-off here is that we may also be masking errors from
ourselves. While we have Sentry and Greylog to potentially alert
us, we might miss these alerts, which could lead to much more
serious issues (e.g. bringing down the whole site). The most
clear manifestation of these error would be simply an error on
the site.

So we set this to 5 minutes following expiry as a trade-off.
  • Loading branch information
nottrobin committed Mar 19, 2021
1 parent 3ecfb19 commit ebebf4a
Show file tree
Hide file tree
Showing 4 changed files with 153 additions and 22 deletions.
87 changes: 78 additions & 9 deletions canonicalwebteam/flask_base/app.py
Original file line number Diff line number Diff line change
Expand Up @@ -36,18 +36,87 @@ def set_security_headers(response):


def set_cache_control_headers(response):
"""
Default caching rules that should work for most pages
"""

if flask.request.path.startswith("/_status"):
# Our status endpoints need to be uncached
# to report accurate information at all times
response.cache_control.no_store = True
response.cache_control.max_age = 0
elif not response.cache_control and response.status_code == 200:
# response.cache_control does not support stale_while_revalidate so...
response.headers[
"Cache-Control"
] = "public, max-age=300, stale-while-revalidate=360"
elif (
flask.request.path.startswith("/static") and "v" in flask.request.args
):
response.headers["Cache-Control"] = "public, max-age=31536000"

elif response.status_code == 200:
# Normal responses, where the cache-control object hasn't
# been independently modified, should:

if not response.cache_control.max_age:
# Hard-cache for a minimal amount of time so content can be easily
# refreshed.
#
# The considerations here are as follows:
# - To avoid caching headaches, it's best if when people need to
# refresh a page (e.g. after they've just done a release) they
# can do so by simply refreshing their browser.
# - However, it needs to be long enough to protect our
# applications from excessive requests from all the pages being
# generated (a factor)
#
# 1 minute seems like a good compromise.
response.cache_control.max_age = "60"

if not response.cache_control._get_cache_value(
"stale-while-revalidate", False, int
):
# stale-while-revalidate defines a period after the cache has
# expired (max-age) during which users will get sent the stale
# cache, while the cache updates in the background. This mean
# that users will continue to get speedy (potentially stale)
# responses even if the refreshing of the cache takes a while,
# but the content should still be no more than a few seconds
# out of date.
#
# We want this to be pretty long, so users will still get a
# quick response (while triggering a background update) for
# as long as possible after the cache has expired.
#
# An additional day will hopefully be long enough for most cases.
response.cache_control._set_cache_value(
"stale-while-revalidate", "86400", int
)

if not response.cache_control._get_cache_value(
"stale-if-error", False, int
):
# stale-if-error defines a period of time during which a stale
# cache will be served back to the client if the cache observes
# an error code response (>=500) from the backend. When the cache
# receives a request for an erroring page, after serving the stale
# page it will ping off a background request to attempt the
# revalidate the page, as long as it's not currently waiting for a
# response.
#
# We set this value to protect us from transitory errors. The
# trade-off here is that we may also be masking errors from
# ourselves. While we have Sentry and Greylog to potentially alert
# us, we might miss these alerts, which could lead to much more
# serious issues (e.g. bringing down the whole site). The most
# clear manifestation of these error would be simply an error on
# the site.
#
# So we set this to 5 minutes following expiry as a trade-off.
response.cache_control._set_cache_value(
"stale-if-error", "300", int
)

# TODO: Disabling this until we've got to the bottom of the
# cache-poisoning issue for static files
#
# elif (
# flask.request.path.startswith("/static") and
# "v" in flask.request.args
# ):
# response.headers["Cache-Control"] = "public, max-age=31536000"

return response

Expand Down
2 changes: 2 additions & 0 deletions setup.py
Original file line number Diff line number Diff line change
@@ -1,3 +1,5 @@
#! /usr/bin/env python3

from setuptools import setup, find_packages

setup(
Expand Down
37 changes: 34 additions & 3 deletions tests/test_app/webapp/app.py
Original file line number Diff line number Diff line change
Expand Up @@ -34,11 +34,42 @@ def soft_redirect():
flask.session["test"] = True
return "Found", 302

@app.route("/cache")
def cache():
@app.route("/cache/max-age")
def cache_max_age():
response = flask.make_response()
response.cache_control.max_age = 4321

return response, 200

@app.route("/cache/none")
def cache_empty():
response = flask.make_response()
response.cache_control.max_age = None
response.cache_control._set_cache_value(
"stale-while-revalidate", None, int
)
response.cache_control._set_cache_value("stale-if-error", None, int)

return response, 200

@app.route("/cache/stale")
def cache_stale():
response = flask.make_response()
response.cache_control._set_cache_value(
"stale-while-revalidate", 4321, int
)

return response, 200

@app.route("/cache/all")
def cache_all():
response = flask.make_response()
response.cache_control.max_age = 4321
response.cache_control.public = True
response.cache_control.max_age = 1000
response.cache_control._set_cache_value(
"stale-while-revalidate", 4321, int
)
response.cache_control._set_cache_value("stale-if-error", 4321, int)

return response, 200

Expand Down
49 changes: 39 additions & 10 deletions tests/test_flask_base.py
Original file line number Diff line number Diff line change
Expand Up @@ -63,10 +63,11 @@ def test_security_headers(self):
def test_default_cache_headers(self):
with create_test_app().test_client() as client:
cached_response = client.get("page")
self.assertEqual(
cached_response.headers.get("Cache-Control"),
"public, max-age=300, stale-while-revalidate=360",
)
cache = cached_response.headers.get("Cache-Control")
self.assertNotIn("public", cache)
self.assertIn("max-age=60", cache)
self.assertIn("stale-while-revalidate=86400", cache)
self.assertIn("stale-if-error=300", cache)

def test_redirects_have_no_cache_headers(self):
with create_test_app().test_client() as client:
Expand All @@ -82,13 +83,41 @@ def test_vary_cookie_when_session(self):
cached_response_with_session.headers.get("Vary"), "Cookie"
)

def test_cache_does_not_overide(self):
def test_cache_override(self):
"""
Check each part of the cache-control instruction can be overridden
individually, or nullified
"""

with create_test_app().test_client() as client:
cached_response = client.get("cache")
self.assertEqual(
cached_response.headers.get("Cache-Control"),
"public, max-age=1000",
)
# all 3 values are overriden, and "public" is added
all_response = client.get("cache/all")
all_cache = all_response.headers.get("Cache-Control")
self.assertIn("public", all_cache)
self.assertIn("max-age=4321", all_cache)
self.assertIn("stale-while-revalidate=4321", all_cache)
self.assertIn("stale-if-error=4321", all_cache)

# all values are nullified, leading no cache-control header
none_response = client.get("cache/none")
none_cache = none_response.headers.get("Cache-Control")
self.assertIsNone(none_cache)

# only max-age is overridden, so the "stale" instructions remain
max_age_response = client.get("cache/max-age")
max_age_cache = max_age_response.headers.get("Cache-Control")
self.assertNotIn("public", max_age_cache)
self.assertIn("max-age=4321", max_age_cache)
self.assertIn("stale-while-revalidate=86400", max_age_cache)
self.assertIn("stale-if-error=300", max_age_cache)

# only "stale-while-revalidate" is overridden
stale_response = client.get("cache/stale")
stale_cache = stale_response.headers.get("Cache-Control")
self.assertNotIn("public", stale_cache)
self.assertIn("max-age=60", stale_cache)
self.assertIn("stale-while-revalidate=4321", stale_cache)
self.assertIn("stale-if-error=300", stale_cache)

def test_status_endpoints(self):
with create_test_app().test_client() as client:
Expand Down

0 comments on commit ebebf4a

Please sign in to comment.