Skip to content
This repository has been archived by the owner on Feb 22, 2023. It is now read-only.

Tally provider occurrences in results #1088

Merged
merged 7 commits into from
Jan 25, 2023
Merged

Conversation

sarayourfriend
Copy link
Contributor

@sarayourfriend sarayourfriend commented Jan 14, 2023

Fixes

Fixes #1084 by @sarayourfriend
Also fixes #1072 by @AetherUnbound

Description

Tallies provider occurrences in Redis. Disambiguates by week (via a Monday date-stamp shared for all days of a given week), index, and provider.

Only tallies the first four pages.

Testing Instructions

Check out the new unit tests.

For manual testing: run the application with the instructions from the README (you will need to rebuild web if you already have it for the new test dependency, btw).

Then, make a search. Open up a redis connection (redis-cli or use a Redis connection from django_redis in just ipython) and execute SCAN 0 'provider*'. You should see the new tallies appear. You can use the keys that you get back from SCAN to view the individual results.

Checklist

  • My pull request has a descriptive title (not a vague title like
    Update index.md).
  • My pull request targets the default branch of the repository (main) or
    a parent feature branch.
  • My commit messages follow best practices.
  • My code follows the established code style of the repository.
  • I added or updated tests for the changes I made (if applicable).
  • I added or updated documentation (if applicable).
  • I tried running the project locally and verified that there are no visible
    errors.

Developer Certificate of Origin

Developer Certificate of Origin
Developer Certificate of Origin
Version 1.1

Copyright (C) 2004, 2006 The Linux Foundation and its contributors.
1 Letterman Drive
Suite D4700
San Francisco, CA, 94129

Everyone is permitted to copy and distribute verbatim copies of this
license document, but changing it is not allowed.


Developer's Certificate of Origin 1.1

By making a contribution to this project, I certify that:

(a) The contribution was created in whole or in part by me and I
    have the right to submit it under the open source license
    indicated in the file; or

(b) The contribution is based upon previous work that, to the best
    of my knowledge, is covered under an appropriate open source
    license and I have the right under that license to submit that
    work with modifications, whether created in whole or in part
    by me, under the same open source license (unless I am
    permitted to submit under a different license), as indicated
    in the file; or

(c) The contribution was provided directly to me by some other
    person who certified (a), (b) or (c) and I have not modified
    it.

(d) I understand and agree that this project and the contribution
    are public and that a record of the contribution (including all
    personal information I submit with it, including my sign-off) is
    maintained indefinitely and may be redistributed consistent with
    this project or the open source license(s) involved.

@openverse-bot openverse-bot added ✨ goal: improvement Improvement to an existing user-facing feature 💻 aspect: code Concerns the software code in the repository 🟧 priority: high Stalls work on the project or its dependents labels Jan 14, 2023
@github-actions
Copy link

github-actions bot commented Jan 14, 2023

API Developer Docs Preview: Updating
(This comment will be automatically updated with the preview URL once it is ready)

@github-actions
Copy link

This PR has migrations. Please rebase it before merging to ensure that conflicting migrations are not introduced.

@github-actions github-actions bot added the migrations Modifications to Django migrations label Jan 17, 2023
@sarayourfriend sarayourfriend removed the migrations Modifications to Django migrations label Jan 18, 2023
@sarayourfriend sarayourfriend marked this pull request as ready for review January 18, 2023 22:43
@sarayourfriend sarayourfriend requested a review from a team as a code owner January 18, 2023 22:43
Copy link
Contributor

@AetherUnbound AetherUnbound left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks great! I was able to test this locally and see that keys were added for different queries, but weren't added when I requested page=5.

== b"1"
)


Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did we want to add a test (or would it be easily possible to test) that no results are incremented for page queries > 4?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is tested by the test that verifies that the function isn't called by search_controller.search if the page is 5. Is there a different approach you would like to take?

@stacimc
Copy link
Contributor

stacimc commented Jan 23, 2023

@sarayourfriend Can you expand on your instructions for manual testing?

@sarayourfriend
Copy link
Contributor Author

sarayourfriend commented Jan 23, 2023

@stacimc I'm happy to, but can you share which part of the manual testing instructions you're stuck on? I'm not sure what to expand on 😕

Copy link
Contributor

@stacimc stacimc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This LGTM. I was able to figure out how to test and verified that the occurrences and appeared_in_searches tallies were increased as expected, and not for pages > 4.

For what it's worth, I wasn't clear how to test in Redis. I looked for other PRs involving similar testing and first tried exec'ing into the container with docker-compose exec cache redis-cli, but then the given SCAN command threw syntax errors. I'm not sure if I was missing steps or if you meant a different command.

To get it working in just ipython I did:

> import django_redis
> tallies = django_redis.get_redis_connection("tallies")

# Get the keys
> tallies.keys("provider*")

# Probably an easier way to do this, but individually check the value of relevant keys after making searches at http://localhost:50280/v1/images
> tallies.get('provider_appeared_in_searches:image:2023-01-23:flickr')
(and so on)

@sarayourfriend
Copy link
Contributor Author

sarayourfriend commented Jan 24, 2023

Sorry, I left out the MATCH keyword for the SCAN argument (I just wrote it from memory). The Redis documentation is pretty good about these things though, if you run into similar issues in the future: https://redis.io/commands/scan/. FWIW, we have Redis mapped (kind of annoyingly, actually, because it conflicts with my actual personal redis on my machine) to the default Redis port out of docker-compose. If you run redis-cli on your local machine, outside of Docker, with the API running, it will connect to Redis directly, no need to exec into any containers. I didn't realise this wasn't well-known on the team, my apologies for not including the detailed instructions.

Testing using ipython is easier and as you did, can follow the code as an example.

@stacimc
Copy link
Contributor

stacimc commented Jan 24, 2023

No worries. I looked at that exact documentation when debugging the syntax error 😅 I needed to add match and remove the quotation marks around the pattern. However even once I did that, I wasn't able to see the tallies. The only key I could see at all was filtered_providers. I was not sure if maybe I was missing a step to connect to the right cache?

Before that FWIW I also initially tried running redis-cli outside of Docker with the API running and it did not work for me -- I got connection errors when trying to connect. I did not spend much time trying to debug that part in particular, though 🤷‍♀️

@sarayourfriend
Copy link
Contributor Author

I was not sure if maybe I was missing a step to connect to the right cache?

Ah, yes, that would be the issue.

Later I will open an issue to add more documentation for how to read from redis in development 👍

@sarayourfriend sarayourfriend force-pushed the add/provider-query-count branch from fb77e44 to 1bfb7a5 Compare January 24, 2023 23:25
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
💻 aspect: code Concerns the software code in the repository ✨ goal: improvement Improvement to an existing user-facing feature 🟧 priority: high Stalls work on the project or its dependents
Projects
None yet
4 participants