Surfacing mcf required filters in the backend #327

Merged
THOR300 merged 25 commits into main from feature/pla-146-surface-sdk-changes-to-backend
Sep 17, 2024

Conversation

THOR300

@THOR300 THOR300 commented Sep 11, 2024

Description

This pull request surfaces changes made to the cpr_sdk to support filtering for the MCF corpus data.

  • Rename a legacy data access naming convention.
  • Add family geographies to the SearchResponseFamily object.
  • Change the default field we filter on in Vespa from geography to geographies.
  • Bump the cpr_sdk version to contain the new queries.
  • Update the test setup so we have the option to use deterministic metadata. This is to enable the metadata test.
  • Update tests.
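
As a rough illustration of the geography changes listed above, here is a minimal sketch. The names `FamilyResult` and `build_filters`, and the plain-dict filter shape, are assumptions made for this example; they are not taken from cpr_sdk or the backend code.

```python
from dataclasses import dataclass, field


@dataclass
class FamilyResult:
    """Hypothetical stand-in for SearchResponseFamily (fields are assumed)."""

    family_slug: str
    # A family can carry more than one geography, so this is a list.
    family_geographies: list[str] = field(default_factory=list)


def build_filters(geographies: list[str]) -> dict[str, list[str]]:
    """Build a filter mapping keyed on the plural field name.

    Assumption: the search layer accepts a plain dict of field -> values.
    """
    if not geographies:
        return {}
    # De-duplicate and sort so the same input always yields the same filter.
    return {"geographies": sorted(set(geographies))}
```

Keying the filter on `geographies` rather than `geography` is the only behaviour this sketch is meant to show; the real query construction lives in the SDK.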

Note:

We can't test for corpus_type_name or corpus_import_id in the search response, as neither field is included in the response object.

Do we want this as part of this PR or as part of future work?

  • This would mean accessing it in the hit from vespa and adding it to the search response object.
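
If we did take this on, the mapping could look roughly like the sketch below. It assumes the Vespa hit can be treated as a dict carrying `corpus_type_name` and `corpus_import_id` keys; `FamilySummary` and `family_from_hit` are hypothetical names, not existing backend objects.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class FamilySummary:
    """Hypothetical response object; the real schema may differ."""

    family_slug: str
    corpus_type_name: Optional[str] = None
    corpus_import_id: Optional[str] = None


def family_from_hit(hit: dict[str, Any]) -> FamilySummary:
    """Copy the corpus fields from a hit onto the response object.

    Assumption: missing keys should become None rather than raise.
    """
    return FamilySummary(
        family_slug=hit["family_slug"],
        corpus_type_name=hit.get("corpus_type_name"),
        corpus_import_id=hit.get("corpus_import_id"),
    )
```

With the fields on the response object, the search tests could assert on them directly.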

Proposed version

Please select the most relevant option from the list below. This
will be used to generate the next tag version name during auto-tagging.

  • Skip auto-tagging
  • Patch
  • Minor version
  • Major version

Visit the Semver website to understand the
difference between MAJOR, MINOR, and PATCH versions.

Notes:

  • If none of these options are selected, auto-tagging will fail
  • Where multiple options are selected, the most senior option ticked will be
    used -- e.g. Major > Minor > Patch
  • If you are selecting the version in the list above using the textbox, make
    sure your selected option is marked [x] with no spaces in between the
    brackets and the x

Type of change

Please select the option(s) below that are most relevant:

  • Bug fix
  • New feature
  • Breaking change
  • GitHub workflow update
  • Documentation update
  • Refactor legacy code
  • Dependency update

How Has This Been Tested?

Please describe the tests that you added to verify your changes.

Reviewer Checklist

  • DB_CLIENT DEPENDENCY IS ON THE LATEST VERSION
  • The PR represents a single feature (small driveby fixes are also ok)
  • The PR includes tests that are sufficient for the level of risk
  • The code is sufficiently commented, particularly in hard-to-understand areas
  • Any required documentation updates have been made
  • Any TODOs added are captured in future tickets
  • No FIXMEs remain


linear bot commented Sep 11, 2024

@THOR300 THOR300 marked this pull request as ready for review September 12, 2024 15:18
@THOR300 THOR300 requested a review from a team as a code owner September 12, 2024 15:18

@jesse-c jesse-c left a comment

Looks reasonable to me. I don't think I have enough understanding of this service though!

app/api/api_v1/schemas/search.py (resolved)
app/core/search.py (resolved)

@olaughter olaughter left a comment

Functionally this looks great to me, and I think the tests you've added look good. I am wondering if we might need one more test in the search tests that makes sure we get metadata back? Like how we are currently doing for geographies. This would give us good confidence in this all fitting together correctly. This is good to go from my view otherwise!


THOR300 commented Sep 12, 2024

A thought - add multiple geographies to test data to ensure we're ready for that as I think they're all just a list of the single geo currently.
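
A minimal sketch of what that test data could look like. The family slugs, the `TEST_FAMILIES` mapping, and the any-match semantics in `family_matches` are all assumptions for illustration, not the existing fixtures or the Vespa behaviour.

```python
# Hypothetical test data: one single-geography family, one multi-geography family.
TEST_FAMILIES = {
    "family-single": ["GBR"],
    "family-multi": ["GBR", "FRA"],
}


def family_matches(family_geographies: list[str], requested: list[str]) -> bool:
    """Return True if the family has at least one requested geography.

    Assumption: a geography filter matches on any overlap.
    """
    return bool(set(family_geographies) & set(requested))


def filter_families(families: dict[str, list[str]], requested: list[str]) -> list[str]:
    """Return the slugs of matching families in sorted order."""
    return sorted(
        slug for slug, geos in families.items() if family_matches(geos, requested)
    )
```

Filtering on `["FRA"]` should then return only the multi-geography family, which a list holding a single geography can never exercise.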


THOR300 commented Sep 12, 2024

> Functionally this looks great to me, and I think the tests you've added look good. I am wondering if we might need one more test in the search tests that makes sure we get metadata back? Like how we are currently doing for geographies. This would give us good confidence in this all fitting together correctly. This is good to go from my view otherwise!

Definitely! I've added some more assertions for the other fields to the test_create_vespa_search_params test.

Mark and others added 2 commits September 16, 2024 17:41
* Working commit.

* Clean up.

* Making populate db families deterministic.

* Adding distinct deterministic and random fixtures for family metadata.

---------

Co-authored-by: Mark <[email protected]>
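
For context on the deterministic and random metadata fixtures mentioned in the commits above, a rough sketch of the idea follows. The keys, values, and function names are assumptions made up for this example; the real fixtures in the test setup may be shaped differently, and in practice these helpers would likely be wrapped in pytest fixtures.

```python
import random
from typing import Optional

# Hypothetical metadata vocabulary, for illustration only.
KEYS = ("sector", "topic")
VALUES = ("Energy", "Transport", "Adaptation")


def random_metadata(rng: Optional[random.Random] = None) -> dict[str, list[str]]:
    """Pick an arbitrary value per key; useful when the content is irrelevant."""
    rng = rng or random.Random()
    return {key: [rng.choice(VALUES)] for key in KEYS}


def deterministic_metadata(index: int) -> dict[str, list[str]]:
    """Derive values from the family index so tests can assert on them."""
    return {
        key: [VALUES[(index + offset) % len(VALUES)]]
        for offset, key in enumerate(KEYS)
    }
```

A metadata test can then populate families with `deterministic_metadata(i)` and compare the search response against the same call, while other tests keep using the random variant.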

@olaughter olaughter left a comment

Looking good! Some minor points to make decisions on below. Nothing blocking though, good work! 👍 👍

tests/search/setup_search_tests.py (outdated, resolved)
tests/search/setup_search_tests.py (outdated, resolved)
@THOR300 THOR300 force-pushed the feature/pla-146-surface-sdk-changes-to-backend branch 2 times, most recently from dea39b7 to cc3a5f8 Compare September 17, 2024 13:46
THOR300 and others added 2 commits September 17, 2024 15:41
* Move vespa search tests & the search_fixtures they use under dedicated sub-folder  (#334)

* Move vespa search tests under dedicated vespa folder

* Move /search_fixtures under vespa search folder & rename to fixtures

* Bump to 1.14.20

* Move vespa search result order tests into a separate file (#335)

* Move vespa search tests under dedicated vespa folder

* Move /search_fixtures under vespa search folder & rename to fixtures

* Bump to 1.14.20

* Move vespa search result order tests to separate file

* Bump to 1.14.19

* Move continuation token vespa search tests to separate file (#336)

* Move vespa search tests under dedicated vespa folder

* Move /search_fixtures under vespa search folder & rename to fixtures

* Bump to 1.14.20

* Move vespa search result order tests to separate file

* Bump to 1.14.19

* Move vespa search continuation token tests to separate file

* Group pagination and continuation token tests

* Move keyword and range vespa search tests into separate file (#337)

* Move vespa search tests under dedicated vespa folder

* Move /search_fixtures under vespa search folder & rename to fixtures

* Bump to 1.14.20

* Move vespa search result order tests to separate file

* Bump to 1.14.19

* Move vespa search continuation token tests to separate file

* Move keyword and range vespa search tests into separate file

* Delete test_vespa_search_cont_tokens.py

* Move _make_search_request into vespa search setup

* Move vespa search tests for ignoring special chars & case to separate file (#338)

* Move data download tests into parent folder

* Move query insensitivity & special chars ignoring tests out

* Rename from test_vespasearch

* Bump to 1.14.20

* Removing refactored file.

* Adding back in the changes from the test_vespasearch.

---------

Co-authored-by: Katy Baulch <[email protected]>
Co-authored-by: Mark <[email protected]>
@THOR300 THOR300 merged commit a577fa9 into main Sep 17, 2024
12 checks passed
@THOR300 THOR300 deleted the feature/pla-146-surface-sdk-changes-to-backend branch September 17, 2024 14:53
4 participants