Switch searches to use vespa by default (#205)
* Switch searches to use vespa by default

Minimal change switchover to make it easy to switch back if needed, this
just makes vespa the default over opensearch. The api can still query
opensearch by setting `use_vespa` to False. We can delete the opensearch
code when we're confident we won't need it as a fallback

* Keep opensearch tests working

This distinguishes opensearch tests by giving them their own pytest mark,
it also adds a parameter to the api request to cause the endpoint to still
use opensearch for these tests. Finally we rename the make task for
opensearch specific tests

* Setup for vespa in CI

Adds test db setup, fixtures and commands to run vespa in ci

* Direct most opensearch tests towards vespa

Some have been made obsolete, either over time or because vespa handles
something differently. But this adapts as many of the existing opensearch
tests as possible to point at the vespa route instead

* Tidy up make file commands

* remove redundant test

No longer needed as we store the slug in vespa

* increase disk space following feed errors in github actions

* Remove filter keyword tests until issue resolved

These tests are currently failing due to an issue in the Data access
library. Removing them until the issue can be resolved

* Re add filter tests following DAL fix

Context: climatepolicyradar/data-access#118

* remove test override used to confirm dal changes

* add type hints to vespa test setup

* Move search test setup to their own file

* Tidy up makefile variable
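
The marker split described under "Keep opensearch tests working" can be illustrated with a small stand-in for the `-m` expressions used in the makefile. This is only a sketch: it is not pytest's own matcher, it approximates selection by evaluating the expression against each test's set of marks, and the test names are hypothetical.

```python
import re

# Hypothetical tests and the marks they carry; the names are illustrative only.
TESTS = {
    "test_vespa_simple_search": {"search"},
    "test_opensearch_simple_search": {"opensearch"},
    "test_read_documents": set(),
}


def selected(expression: str, marks: set) -> bool:
    """Approximate `pytest -m EXPRESSION` for one test's set of marks."""

    def substitute(match):
        word = match.group(0)
        if word in {"and", "or", "not"}:
            return word  # keep the boolean operators as they are
        return str(word in marks)  # marker name -> "True" or "False"

    boolean_expr = re.sub(r"[A-Za-z_]\w*", substitute, expression)
    # For the expressions used here only True/False/and/or/not remain.
    return eval(boolean_expr, {"__builtins__": {}}, {})


def select(expression: str) -> list:
    """Return the names of the tests that the expression would select."""
    return sorted(name for name, marks in TESTS.items() if selected(expression, marks))


print(select("search"))                         # the vespa-backed search tests
print(select("opensearch"))                     # the opensearch-only tests
print(select("not opensearch and not search"))  # everything else
```

With separate marks, each make task can pick exactly one group, and the general `test` task can exclude both search groups.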
olaughter authored Jan 11, 2024
1 parent 9436fa9 commit 8dfc8e1
Showing 17 changed files with 8,380 additions and 623 deletions.
23 changes: 19 additions & 4 deletions .github/workflows/ci.yml
@@ -29,6 +29,15 @@ jobs:
- name: Get python Container
run: docker pull python:3.9

- name: Install latest Vespa CLI
env:
VESPA_CLI_VERSION: "8.250.43"
run: |
mkdir scripts/vespa-cli
curl -fsSL https://github.com/vespa-engine/vespa/releases/download/v${VESPA_CLI_VERSION}/vespa-cli_${VESPA_CLI_VERSION}_linux_amd64.tar.gz | \
tar -zxf - -C scripts/vespa-cli --strip-component=1
echo "scripts/vespa-cli/bin" >> $GITHUB_PATH
- name: Build
run: |
docker-compose build
@@ -49,14 +58,20 @@ jobs:
docker-compose ps
docker ps -a
ls -la
- name: Setup vespa for search
run: make vespa_setup

- name: Run backend search tests
- name: Run backend search tests for vespa
run: make test_search

- name: Browse Benchmark - response times in ms

- name: Run backend search tests for opensearch
run: make test_opensearch

- name: Browse Benchmark opensearch - response times in ms
run: cat benchmark_browse.txt

- name: Search Benchmark - response times in ms
- name: Search Benchmark opensearch - response times in ms
run: cat benchmark_search.txt

- name: Log Dump
6 changes: 3 additions & 3 deletions app/api/api_v1/routers/search.py
@@ -51,7 +51,7 @@


def _search_request(
db: Session, search_body: SearchRequestBody, use_vespa: bool = False
db: Session, search_body: SearchRequestBody, use_vespa: bool = True
) -> SearchResponse:
if search_body.keyword_filters is not None and use_vespa is False:
search_body.keyword_filters = process_search_keyword_filters(
@@ -99,7 +99,7 @@ def search_documents(
request: Request,
search_body: SearchRequestBody,
db=Depends(get_db),
use_vespa: bool = False,
use_vespa: bool = True,
) -> SearchResponse:
"""Search for documents matching the search criteria."""
_LOGGER.info(
@@ -122,7 +122,7 @@ def download_search_documents(
request: Request,
search_body: SearchRequestBody,
db=Depends(get_db),
use_vespa: bool = False,
use_vespa: bool = True,
) -> StreamingResponse:
"""Download a CSV containing details of documents matching the search criteria."""
_LOGGER.info(
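
The effect of flipping the `use_vespa` default can be sketched in isolation. The example below is a minimal illustration, not the project's code: the two backend functions are hypothetical stand-ins, and only the `use_vespa` parameter name and its new default come from the diff above.

```python
from typing import List


def _vespa_search(query: str) -> List[str]:
    """Hypothetical stand-in for the Vespa-backed search."""
    return [f"vespa:{query}"]


def _opensearch_search(query: str) -> List[str]:
    """Hypothetical stand-in for the OpenSearch-backed search."""
    return [f"opensearch:{query}"]


def search(query: str, use_vespa: bool = True) -> List[str]:
    """Route to Vespa unless the caller explicitly opts out."""
    backend = _vespa_search if use_vespa else _opensearch_search
    return backend(query)


print(search("climate"))                   # default path: Vespa
print(search("climate", use_vespa=False))  # explicit fallback: OpenSearch
```

Because only the default changes, callers that pass `use_vespa=False` keep reaching OpenSearch, which is what keeps the switch easy to reverse.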
46 changes: 42 additions & 4 deletions makefile-docker.defs
@@ -5,7 +5,7 @@
# ----------------------------------
start_containers:
# Build and run containers
docker-compose -f docker-compose.yml -f docker-compose.dev.yml up -d
docker-compose -f docker-compose.yml -f docker-compose.dev.yml up -d --remove-orphans

start_backendonly:
# Build and run containers
@@ -72,12 +72,50 @@ build:
test_bashscripts: build_bats
docker run --rm -v "${PWD}/.github:/code" bats-with-helpers:latest /code/tests/test_retag_and_push.bats

vespa_confirm_cli_installed:
@if [ ! $$(which vespa) ]; then \
echo 'ERROR: The vespa cli is not installed, please install and try again:' ; \
echo 'https://docs.vespa.ai/en/vespa-cli.html'; \
exit 1; \
fi

vespa_healthy:
@if [ ! $$(curl -f -s 'http://localhost:19071/status.html') ]; then \
echo 'ERROR: Bad response from local vespa cluster, is it running?'; \
exit 1; \
fi

.ONESHELL:
vespa_deploy_schema:
vespa config set target local
@vespa deploy tests/search_fixtures/vespa_test_schema --wait 300

.ONESHELL:
vespa_load_data:
vespa config set target local
vespa feed --progress=3 tests/search_fixtures/vespa_search_weights.json
vespa feed --progress=3 tests/search_fixtures/vespa_family_document.json
vespa feed --progress=3 tests/search_fixtures/vespa_document_passage.json

vespa_setup: vespa_confirm_cli_installed vespa_healthy vespa_deploy_schema vespa_load_data

.ONESHELL:
test_search:
docker-compose \
-f docker-compose.yml \
-f docker-compose.dev.yml \
run --rm --name search_test \
-v "${PWD}/data:/data" \
backend pytest \
-vvv tests/routes/test_vespasearch.py \
-m 'search'

setup_test_search_index:
docker-compose -f docker-compose.yml -f docker-compose.dev.yml run --rm backend curl -XDELETE -u "${OPENSEARCH_USER}:${OPENSEARCH_PASSWORD}" ${OPENSEARCH_URL}/${OPENSEARCH_INDEX_PREFIX}* --insecure
docker-compose -f docker-compose.yml -f docker-compose.dev.yml run --rm opensearch-test-loader multielasticdump --direction=load --input=/cpr-backend/tests/data/ --output=${OPENSEARCH_URL} --ignoreType=template

test_search: setup_test_search_index
docker-compose -f docker-compose.yml -f docker-compose.dev.yml run --name search_test -v "${PWD}/data:/data" backend pytest -vvv -m 'search'
test_opensearch: setup_test_search_index
docker-compose -f docker-compose.yml -f docker-compose.dev.yml run --name search_test -v "${PWD}/data:/data" backend pytest -vvv -m 'opensearch'
docker cp search_test:/data/benchmark_browse.txt .
docker cp search_test:/data/benchmark_search.txt .
docker rm search_test
@@ -89,7 +127,7 @@ test_unit:
docker-compose -f docker-compose.yml -f docker-compose.dev.yml run --rm backend pytest -vvv tests/unit

test:
docker-compose -f docker-compose.yml -f docker-compose.dev.yml run --rm backend pytest -vvv --test-alembic -m 'not search'
docker-compose -f docker-compose.yml -f docker-compose.dev.yml run --rm backend pytest -vvv --test-alembic -m 'not opensearch and not search'

# ----------------------------------
# tasks
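
The `vespa_healthy` target above gates setup on a response from the local config server. The same idea can be sketched in Python with the standard library. This is an assumption-laden illustration rather than part of the commit: the function name is hypothetical, and it checks for an HTTP 200 status, which is slightly stricter than the makefile's test for non-empty curl output.

```python
import urllib.error
import urllib.request

# URL taken from the vespa_healthy make target.
VESPA_STATUS_URL = "http://localhost:19071/status.html"


def vespa_is_healthy(url: str = VESPA_STATUS_URL, timeout: float = 5.0) -> bool:
    """Return True only if the status page answers with HTTP 200."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, DNS failure, or an HTTP error status.
        return False
```

A setup script could call `vespa_is_healthy()` before deploying the schema and feeding fixtures, and stop early with the same error message the make target prints.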