Add thumbnail response time runbooks #3053

Merged
merged 3 commits into from
Sep 26, 2023

Conversation

stacimc
Collaborator

@stacimc stacimc commented Sep 21, 2023

Fixes

Related to #2502 by @sarayourfriend
Related to https://github.com/WordPress/openverse-infrastructure/pull/619

Description

Adds runbooks for thumbnail response time alarms.

Question for reviewers: the content for all of these is identical. I could make a single runbook and update the links in the alarms. The reason I didn't is that it seems possible (although perhaps unlikely) that false positives could occur separately for each of them 🤔

Testing Instructions

Make sure they look alright in the preview!

Checklist

  • My pull request has a descriptive title (not a vague title like Update index.md).
  • My pull request targets the default branch of the repository (main) or a parent feature branch.
  • My commit messages follow best practices.
  • My code follows the established code style of the repository.
  • I added or updated tests for the changes I made (if applicable).
  • I added or updated documentation (if applicable).
  • I tried running the project locally and verified that there are no visible errors.
  • I ran the DAG documentation generator (if applicable).

Developer Certificate of Origin

Developer Certificate of Origin
Version 1.1

Copyright (C) 2004, 2006 The Linux Foundation and its contributors.
1 Letterman Drive
Suite D4700
San Francisco, CA, 94129

Everyone is permitted to copy and distribute verbatim copies of this
license document, but changing it is not allowed.


Developer's Certificate of Origin 1.1

By making a contribution to this project, I certify that:

(a) The contribution was created in whole or in part by me and I
    have the right to submit it under the open source license
    indicated in the file; or

(b) The contribution is based upon previous work that, to the best
    of my knowledge, is covered under an appropriate open source
    license and I have the right under that license to submit that
    work with modifications, whether created in whole or in part
    by me, under the same open source license (unless I am
    permitted to submit under a different license), as indicated
    in the file; or

(c) The contribution was provided directly to me by some other
    person who certified (a), (b) or (c) and I have not modified
    it.

(d) I understand and agree that this project and the contribution
    are public and that a record of the contribution (including all
    personal information I submit with it, including my sign-off) is
    maintained indefinitely and may be redistributed consistent with
    this project or the open source license(s) involved.

@stacimc stacimc added 🟧 priority: high Stalls work on the project or its dependents 🌟 goal: addition Addition of new feature 📄 aspect: text Concerns the textual material in the repository 🧱 stack: documentation Related to Sphinx documentation labels Sep 21, 2023
@stacimc stacimc self-assigned this Sep 21, 2023
@stacimc stacimc force-pushed the add/api-thumb-response-runbooks branch from 00214f7 to 3058cde Compare September 21, 2023 22:43
Collaborator

@sarayourfriend sarayourfriend left a comment

LGTM. Regarding sharing them: I think we could share the anomaly ones, maybe, but I do think it's good to have separate instructions for debugging average vs p99. For example, for average response time, you probably don't need to do anything very special to find "the slow requests", because overall things are slow. For p99, however, as long as the average is still okay, it is revealing some kind of edge case where we'd need to query the nginx logs for target response time. I don't know whether that needs to be in the runbook, because it is a difference in principle that applies to all average and p99 response time debugging, but it is a notable difference between the two. But yeah, not sure whether that's a difference in the runbook or a difference in some other resource.
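
A minimal sketch of the average vs p99 distinction, assuming made-up response times. The `p99` helper is hypothetical and uses the nearest-rank method; none of this comes from the runbooks or from Openverse tooling. It only shows how a small share of slow requests can leave the average looking healthy while the p99 is not.

```python
import math
from statistics import mean


def p99(samples):
    """Return the 99th percentile of samples using the nearest-rank method."""
    ordered = sorted(samples)
    rank = math.ceil(99 * len(ordered) / 100)  # 1-based nearest rank
    return ordered[rank - 1]


# Hypothetical response times in seconds: 980 fast requests and 20 slow ones.
response_times = [0.2] * 980 + [8.0] * 20

average = mean(response_times)
tail = p99(response_times)

# Requests at or above the p99 value are the edge cases one would go
# looking for in the logs; the average alone does not point to them.
slow_requests = [t for t in response_times if t >= tail]

print(f"average={average:.3f}s p99={tail:.1f}s slow={len(slow_requests)}")
```

Here the average stays well under one second even though the slowest requests take eight, which is why the two alarms call for different debugging steps.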

The anomaly and threshold alarms also have a difference, I think. Like the threshold alarm runbooks could probably have an instruction to say "check whether it is a one-off spike or persistent" and guide severity along those lines. If we spike to 10 seconds average response time, that's not good, but it also doesn't mean we need to drop everything to debug it if things have gone back down.
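
One way the "one-off spike or persistent" check could be sketched, assuming a list of recent average response times in seconds. The function name, the threshold, and the `recent` window are all hypothetical and are not values from the alarms themselves.

```python
def classify_breach(datapoints, threshold, recent=3):
    """Label a metric series as "ok", "one-off spike", or "persistent".

    A series is "persistent" only when the most recent `recent` datapoints
    are all above the threshold; any other breach is treated as a spike
    that has since recovered.
    """
    breaches = [value > threshold for value in datapoints]
    if not any(breaches):
        return "ok"
    if len(breaches) >= recent and all(breaches[-recent:]):
        return "persistent"
    return "one-off spike"


# Hypothetical series: a single jump to 10 seconds that went back down.
spike = [0.4, 0.5, 10.0, 0.5, 0.4]
# Hypothetical series: response times that rose and stayed high.
sustained = [0.4, 3.0, 4.0, 5.0]

print(classify_breach(spike, threshold=2.0))
print(classify_breach(sustained, threshold=2.0))
```

A result of "one-off spike" would suggest looking into the cause without urgency, while "persistent" would suggest a higher severity.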

That's also just a general principle that could apply to almost every alarm, so I don't know whether that needs to be in the runbooks or, like above, in some other document that they link to. I think this relates somewhat to the level of confidence that something persistently bad is happening based on the alarm, which does also relate to the false alarms you mentioned. Ideally anomaly alarms would give higher confidence that something bad is persistently happening, once we've tuned them. Threshold alarms are kind of more up in the air: they could be really bad, or could just be a one-off spike that we should look into, but not necessarily with urgency.

Anyway, all of that is just my thoughts on this. I don't have strong recommendations either way, but whatever we come up with, ideally we can apply it consistently between the alarms. It looks like there are some important differences but also important similarities in the principles of these alarms and what they can indicate. I just don't know what the correct or most flexible way of sharing and differentiating between those is.

@stacimc stacimc marked this pull request as ready for review September 22, 2023 17:53
@stacimc stacimc requested a review from a team as a code owner September 22, 2023 17:53
@stacimc stacimc requested review from fcoveram and obulat September 22, 2023 17:53
@krysal krysal mentioned this pull request Sep 22, 2023
1 task
@stacimc stacimc removed the request for review from fcoveram September 25, 2023 18:31
@stacimc stacimc force-pushed the add/api-thumb-response-runbooks branch from cd41a35 to febaf89 Compare September 26, 2023 17:05
@stacimc stacimc merged commit e66d07e into main Sep 26, 2023
41 checks passed
@stacimc stacimc deleted the add/api-thumb-response-runbooks branch September 26, 2023 17:51
ngken0995 added a commit to ngken0995/openverse that referenced this pull request Oct 4, 2023
…WordPress#3055)

* change deprecated ES search body

* trim white space

Add thumbnail response time runbooks (WordPress#3053)

* Add thumb response time runbooks

* Add files to index

* Threshold alarms are low severity if not anomalous

generate-dag-docs recipe move DAGs.md to documentation folder (WordPress#3061)

Co-authored-by: Madison Swain-Bowden <[email protected]>

Upgrade psycopg to version 3 in the API (WordPress#3064)

Update VSourcesTable.vue (WordPress#3026)

Co-authored-by: sarayourfriend <[email protected]>
Co-authored-by: Olga Bulat <[email protected]>
Co-authored-by: Krystle Salazar <[email protected]>

Transfer UUID validation inside serializer (WordPress#3068)

* Transfer UUID validation inside serializer

* Add test case

Publish changelog for api-2023.09.28.00.26.34 (WordPress#3070)

Co-authored-by: AetherUnbound <[email protected]>

Use fully qualified docker image names (WordPress#3071)

Publish changelog for ingestion_server-2023.09.29.17.40.50 (WordPress#3082)

Co-authored-by: stacimc <[email protected]>

Update `_AIRFLOW_DB_UPGRADE` to `_AIRFLOW_DB_MIGRATE`

Increase the API sources cache TTL from 20 minutes to 4 hours (WordPress#3083)

Publish changelog for api-2023.09.30.00.15.32 (WordPress#3084)

Co-authored-by: AetherUnbound <[email protected]>

Bump ipython from 8.14.0 to 8.16.0 in /automations/python (WordPress#3099)

Bumps [ipython](https://github.com/ipython/ipython) from 8.14.0 to 8.16.0.
- [Release notes](https://github.com/ipython/ipython/releases)
- [Commits](ipython/ipython@8.14.0...8.16.0)

---
updated-dependencies:
- dependency-name: ipython
  dependency-type: direct:development
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

Remove boto3 dependency (WordPress#3073)

Bump docker/login-action from 2 to 3 (WordPress#3089)

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

Bump elasticsearch from 8.8.2 to 8.10.0 in /api (WordPress#3103)

Bumps [elasticsearch](https://github.com/elastic/elasticsearch-py) from 8.8.2 to 8.10.0.
- [Release notes](https://github.com/elastic/elasticsearch-py/releases)
- [Commits](elastic/elasticsearch-py@v8.8.2...v8.10.0)

---
updated-dependencies:
- dependency-name: elasticsearch
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

Bump psycopg2 from 2.9.7 to 2.9.8 in /ingestion_server (WordPress#3097)

Bumps [psycopg2](https://github.com/psycopg/psycopg2) from 2.9.7 to 2.9.8.
- [Changelog](https://github.com/psycopg/psycopg2/blob/master/NEWS)
- [Commits](psycopg/psycopg2@2.9.7...2.9.8)

---
updated-dependencies:
- dependency-name: psycopg2
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

Bump elasticsearch from 8.8.2 to 8.10.0 in /ingestion_server (WordPress#3095)

Bumps [elasticsearch](https://github.com/elastic/elasticsearch-py) from 8.8.2 to 8.10.0.
- [Release notes](https://github.com/elastic/elasticsearch-py/releases)
- [Commits](elastic/elasticsearch-py@v8.8.2...v8.10.0)

---
updated-dependencies:
- dependency-name: elasticsearch
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

Bump pygithub from 1.59.1 to 2.1.1 in /automations/python (WordPress#3102)

Bumps [pygithub](https://github.com/pygithub/pygithub) from 1.59.1 to 2.1.1.
- [Release notes](https://github.com/pygithub/pygithub/releases)
- [Changelog](https://github.com/PyGithub/PyGithub/blob/main/doc/changes.rst)
- [Commits](PyGithub/PyGithub@v1.59.1...v2.1.1)

---
updated-dependencies:
- dependency-name: pygithub
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

Bump furo from 2023.8.19 to 2023.9.10 in /documentation (WordPress#3092)

Bumps [furo](https://github.com/pradyunsg/furo) from 2023.8.19 to 2023.9.10.
- [Release notes](https://github.com/pradyunsg/furo/releases)
- [Changelog](https://github.com/pradyunsg/furo/blob/main/docs/changelog.md)
- [Commits](pradyunsg/furo@2023.08.19...2023.09.10)

---
updated-dependencies:
- dependency-name: furo
  dependency-type: direct:development
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

Bump docker/build-push-action from 4 to 5 (WordPress#3090)

Bumps [docker/build-push-action](https://github.com/docker/build-push-action) from 4 to 5.
- [Release notes](https://github.com/docker/build-push-action/releases)
- [Commits](docker/build-push-action@v4...v5)

---
updated-dependencies:
- dependency-name: docker/build-push-action
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

Bump actions/checkout from 3 to 4 (WordPress#3088)

Bump docker/setup-buildx-action from 2 to 3 (WordPress#3087)

Bumps [docker/setup-buildx-action](https://github.com/docker/setup-buildx-action) from 2 to 3.
- [Release notes](https://github.com/docker/setup-buildx-action/releases)
- [Commits](docker/setup-buildx-action@v2...v3)

---
updated-dependencies:
- dependency-name: docker/setup-buildx-action
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

Bump fakeredis from 2.18.0 to 2.19.0 in /api (WordPress#3100)

Bumps [fakeredis](https://github.com/cunla/fakeredis-py) from 2.18.0 to 2.19.0.
- [Release notes](https://github.com/cunla/fakeredis-py/releases)
- [Commits](cunla/fakeredis-py@v2.18.0...v2.19.0)

---
updated-dependencies:
- dependency-name: fakeredis
  dependency-type: direct:development
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

Show timeout errors on the frontend (WordPress#2838)

* Show timeout errors on the frontend

* Use FetchingError in all stores

* Fix error

* Show client-side errors on single result pages

* Set 500 as a non-retriable error

* Add changes from code review

* Use local base64 image for thumbnail

* Fix footer

* Fix image-cell test

* Extract common error checking functionality

* Update unit tests

Add runbooks for API Thumbnails 2XX/5XX responses and Request Count alarms (WordPress#3076)

Bump jsonschema from 4.19.0 to 4.19.1 in /ingestion_server (WordPress#3098)

Bumps [jsonschema](https://github.com/python-jsonschema/jsonschema) from 4.19.0 to 4.19.1.
- [Release notes](https://github.com/python-jsonschema/jsonschema/releases)
- [Changelog](https://github.com/python-jsonschema/jsonschema/blob/main/CHANGELOG.rst)
- [Commits](python-jsonschema/jsonschema@v4.19.0...v4.19.1)

---
updated-dependencies:
- dependency-name: jsonschema
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

Bump ipython from 8.15.0 to 8.16.1 in /ingestion_server (WordPress#3110)

Bumps [ipython](https://github.com/ipython/ipython) from 8.15.0 to 8.16.1.
- [Release notes](https://github.com/ipython/ipython/releases)
- [Commits](https://github.com/ipython/ipython/commits)

---
updated-dependencies:
- dependency-name: ipython
  dependency-type: direct:development
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

Bump sphinx from 7.2.5 to 7.2.6 in /documentation (WordPress#3091)

Bumps [sphinx](https://github.com/sphinx-doc/sphinx) from 7.2.5 to 7.2.6.
- [Release notes](https://github.com/sphinx-doc/sphinx/releases)
- [Changelog](https://github.com/sphinx-doc/sphinx/blob/master/CHANGES.rst)
- [Commits](sphinx-doc/sphinx@v7.2.5...v7.2.6)

---
updated-dependencies:
- dependency-name: sphinx
  dependency-type: direct:development
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

Bump tldextract from 3.4.4 to 3.6.0 in /catalog (WordPress#3093)

Bumps [tldextract](https://github.com/john-kurkowski/tldextract) from 3.4.4 to 3.6.0.
- [Release notes](https://github.com/john-kurkowski/tldextract/releases)
- [Changelog](https://github.com/john-kurkowski/tldextract/blob/master/CHANGELOG.md)
- [Commits](john-kurkowski/tldextract@3.4.4...3.6.0)

---
updated-dependencies:
- dependency-name: tldextract
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

Bump urllib3 from 2.0.5 to 2.0.6 in /documentation (WordPress#3120)

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

Bump urllib3 from 1.26.16 to 1.26.17 in /api (WordPress#3123)

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

Bump urllib3 from 2.0.5 to 2.0.6 in /automations/python (WordPress#3119)

Bumps [urllib3](https://github.com/urllib3/urllib3) from 2.0.5 to 2.0.6.
- [Release notes](https://github.com/urllib3/urllib3/releases)
- [Changelog](https://github.com/urllib3/urllib3/blob/main/CHANGES.rst)
- [Commits](urllib3/urllib3@v2.0.5...2.0.6)

---
updated-dependencies:
- dependency-name: urllib3
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

Bump urllib3 from 1.26.16 to 1.26.17 in /ingestion_server (WordPress#3124)

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

implement vale

determine files

Microsoft sentence format

change alert level

change alert level

change alert level

change alert level

change alert level

change alert level

change alert level

Empty-Commit
Labels
📄 aspect: text Concerns the textual material in the repository 🌟 goal: addition Addition of new feature 🟧 priority: high Stalls work on the project or its dependents 🧱 stack: documentation Related to Sphinx documentation
3 participants