fix: Change databuilder search data extractors to publish name in user document. #2274

Open
wants to merge 2 commits into
base: main

glipR

@glipR glipR commented Oct 23, 2024

Description

Changes the databuilder search data extractors to extract a name field for publishing to Elasticsearch.

Currently, the user document schema in Elasticsearch expects a name keyword field, which is the primary field for searching users. Both the databuilder publisher and the search service expect this field.
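
As a rough illustration of the change described above, the sketch below adds a combined name value to a user record before it is published. The helper `build_user_document` and the field names `first_name` and `last_name` are assumptions made for this example, not taken from the databuilder code. The actual change may instead map an existing full-name property to `name` in the extractor query.

```python
from typing import Any, Dict


def build_user_document(user: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of a user record with a combined `name` field added.

    Hypothetical helper: the field names are assumed, not taken from databuilder.
    """
    first = (user.get("first_name") or "").strip()
    last = (user.get("last_name") or "").strip()
    # Join only the non-empty parts so a missing last name does not leave
    # a trailing space in the published value.
    name = " ".join(part for part in (first, last) if part)
    doc = dict(user)  # shallow copy so the caller's record is left untouched
    doc["name"] = name
    return doc
```

With this sketch, a record holding a first name of "Jane" and a last name of "Doe" would be published with both the separate fields and a single name value that a full-name search can match.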

Motivation and Context

This change is required because the Elasticsearch query currently does not use the name field to improve matches in search results, and is instead likely relying only on the user's first name, last name, and key. This works when searching for just a first or last name, but results are poor when searching for both together.

This is already fixed in the example query (PR) for publishing from Neo4j to Elasticsearch, but that query is never used.
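
To make the matching problem more concrete, here is a minimal sketch of an Elasticsearch query body that weights a combined name field above the separate first and last name fields. It is not the query the search service actually issues. The function `build_user_query`, the field names `first_name` and `last_name`, and the boost value are all assumptions for illustration.

```python
from typing import Any, Dict


def build_user_query(term: str) -> Dict[str, Any]:
    """Build a query body that favours the combined name field.

    Hypothetical sketch: field names and the boost value are assumptions.
    """
    return {
        "query": {
            "multi_match": {
                "query": term,
                # `^3` boosts matches on the combined field above the parts.
                "fields": ["name^3", "first_name", "last_name"],
            }
        }
    }
```

A query like this only helps if the indexed user documents actually carry a populated name value, which is what the extractor change is meant to provide.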

How Has This Been Tested?

This has only been tested with the neo4j_search_data_extractor, and not the other extractors. Testing was done by:

  1. Loading from databuilder via the search_data_extractor script
  2. Navigating to the index on localhost:9200 and inspecting document values
  3. Attempting a search in the frontend and comparing results
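
The manual inspection in step 2 could be scripted along these lines. This is a sketch under assumptions: the index name is left as a parameter because it is not stated here, and both function names are hypothetical. It relies only on the standard shape of an Elasticsearch `_search` response, where documents appear under `hits.hits[]._source`.

```python
import json
from typing import Any, Dict, List
from urllib.request import urlopen


def fetch_search_response(index: str, base_url: str = "http://localhost:9200") -> Dict[str, Any]:
    """Fetch the default first page of documents from an index via _search."""
    with urlopen(f"{base_url}/{index}/_search") as resp:
        return json.load(resp)


def ids_missing_name(search_response: Dict[str, Any]) -> List[str]:
    """Return the _id of every hit whose _source has no non-empty `name`."""
    hits = search_response.get("hits", {}).get("hits", [])
    return [hit["_id"] for hit in hits if not hit.get("_source", {}).get("name")]
```

Calling `ids_missing_name(fetch_search_response(index))` against the local instance, with `index` set to the user index, should return an empty list once the extractor publishes the name field. Note that `_search` returns only the first page of hits by default, so this checks a sample rather than the whole index.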

Documentation

No change in documentation

CheckList

  • PR title addresses the issue accurately and concisely

I believe this fix has no need for:

  • N/A: Updates Documentation and Docstrings
  • N/A: Adds tests
  • N/A: Adds instrumentation (logs, or UI events)

@glipR glipR requested a review from a team as a code owner October 23, 2024 00:17
@boring-cyborg boring-cyborg bot added the area:databuilder From databuilder folder label Oct 23, 2024

boring-cyborg bot commented Oct 23, 2024

Congratulations on your first Pull Request and welcome to the Amundsen community! If you have any issues or are unsure about anything, please check our Contribution Guide (https://github.com/amundsen-io/amundsen/blob/main/CONTRIBUTING.md)

@kristenarmes
Contributor

@glipR thanks for your contribution! I triggered the CI tests and see a couple of minor test failures. If you can resolve them, I'd be happy to approve.

Signed-off-by: Jackson Goerner <[email protected]>
@glipR
Author

glipR commented Oct 25, 2024

Thanks @kristenarmes! I've edited the two failing tests to reflect the new return values / expected arguments.

Apologies, I'm having trouble running the tests locally due to some version conflicts, so the CI may need to be retriggered to confirm that this has fixed everything 🙏
