-
Notifications
You must be signed in to change notification settings - Fork 50
Extend valid link cache time to 30 days and make cache expiration times minimally configurable #878
Extend valid link cache time to 30 days and make cache expiration times minimally configurable #878
Conversation
API Developer Docs Preview: Ready https://wordpress.github.io/openverse-api/_preview/878 Please note that GitHub pages takes a little time to deploy newly pushed code, if the links above don't work or you see old versions, wait 5 minutes and try again. You can check the GitHub pages deployment action list to see the current status of the deployments. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is great! All I have are bikeshedding comments.
@@ -18,6 +19,10 @@ def _get_cached_statuses(redis, image_urls): | |||
return [int(b.decode("utf-8")) if b is not None else None for b in cached_statuses] | |||
|
|||
|
|||
def _get_expiry(status, default): | |||
return config(f"LINK_VALIDATION_CACHE_EXPIRY__{status}", default=default, cast=int) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This might be bikeshedding but the __
in the name looks weird.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's a not-uncommon separator for environment variables: https://github.com/WordPress/openverse-catalog/blob/f5de3e2a5f9c9da62f7e47e9c86b566e3e345123/env.template#L11
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's new to the API. But it's cool 👍. Could you move this to settings
(so that almost all env reading happens in one place) and also add it to the env.template
file?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Move the _get_expiry
function?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, my bad. I didn't realise that the default=
argument to config
was set by the caller to _get_expiry
. Please ignore my suggestion.
@@ -69,14 +74,17 @@ def validate_images(query_hash, start_slice, results, image_urls): | |||
# Cache successful links for a day, and broken links for 120 days. | |||
if status == 200: | |||
logger.debug("healthy link " f"key={key} ") | |||
pipe.expire(key, twenty_four_hours_seconds) | |||
expiry = _get_expiry(200, twenty_four_hours_seconds * 30) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think a variable called thirty_days_seconds
might make this clearer.
@@ -18,6 +19,10 @@ def _get_cached_statuses(redis, image_urls): | |||
return [int(b.decode("utf-8")) if b is not None else None for b in cached_statuses] | |||
|
|||
|
|||
def _get_expiry(status, default): | |||
return config(f"LINK_VALIDATION_CACHE_EXPIRY__{status}", default=default, cast=int) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's new to the API. But it's cool 👍. Could you move this to settings
(so that almost all env reading happens in one place) and also add it to the env.template
file?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great! Thank you for the thorough testing instructions as well, everything worked as expected when I tried locally 🚀
Fixes
Fixes #867 by @obulat
Description
Makes the link validation cache expiration times minimally configurable and increases the valid link cache time to 30 days.
I have another patch that introduces more flexible configuration by status code, in case that is something we want to explore.
Testing Instructions
Checkout this branch and
just down && just up
to ensure you get the latest code. Smoke test the app around link validation and confirm that things are getting cached. You can do so by observing the log output. For example, a request to/v1/images/
will output a bunch of warning logs about making insecure requests to staticflickr the first time but should not output any on the second request.If you want to confirm the cache times more explicitly, you'll need to upgrade redis locally to version 7 so that you can use the
EXPIRETIME
command. To do this, update thecache
image tag from4.x.x
to7
:Then run
redis-cli
locally and try the following:expiretime returns the Unix timestamp from origin so you'll have to do some math to confirm that it matches your expectation based on when the key was inserted. To clear your local Redis, run
redis-cli flushall
. This is useful for trying this process a couple times to confirm it's working as expected.Checklist
Update index.md
).main
) or a parent feature branch.Developer Certificate of Origin
Developer Certificate of Origin