
Cymbal Search

Video Walkthrough


Play the video at 1.5x speed for the best experience.

Disclaimer: This is NOT an official Google project.

Created by elroylbj@

Table of Contents

  • Software Architecture Diagram
  • Pre-requisites
  • Deployment Instructions
  • Local Development
  • Getting Started with Create React App and Redux
  • Author

Software Architecture Diagram


Pre-requisites

You need to set up a Google Cloud project with a Vertex AI Search app (unstructured data store) and a GCS bucket. With these in place, you can define the environment variables and build and run the Dockerfile anywhere to deploy the app.
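For reference, once the prerequisites exist you can build and run the container anywhere Docker is available. This is a minimal sketch, assuming the container listens on port 8080; the environment variable names are the ones used throughout this README:

docker build -t cymbalsearch .
# Note: the container still needs Google Cloud credentials (e.g. ADC or a service account) to call the APIs.
docker run -p 8080:8080 \
    -e PROJECT_ID="PROJECT_ID" \
    -e BUCKET_FOR_UPLOAD="BUCKET_NAME" \
    -e ENGINE_1="VERTEX_AI_SEARCH_DATASTORE_ID" \
    -e MODEL_1="text-bison" \
    -e MODEL_2="text-bison-32k" \
    cymbalsearch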

1. Google Cloud Project

  1. Create a Google Cloud project if you don't already have one.

2. Vertex AI Search App

Create a Vertex AI Search app backed by an unstructured data store with defined metadata.

  1. Go to the Search & Conversation page and enable the API if prompted
  2. Select Search
  3. At Configuration,
    1. Enable all features
    2. Set your App Name
    3. Set location as global
    4. Click CONTINUE
  4. Create a new data store - select Cloud Storage
  5. At Data import, we will load a set of public PDFs using a JSON file:
    1. Select FILE
    2. In the gs:// textbox, input cymbalsearch-alphabet-public/metadata/metadata.json
    3. Select JSON for unstructured documents with metadata
    4. Click CONTINUE
  6. Give your data store a name and click CREATE (wait a few seconds for the data store to be created)
  7. Select your newly created data store and click CREATE to create the app
  8. After around 10 minutes, the import status should show "Import completed".
  9. Congratulations! Play around with the app in the Preview and Configurations tabs in the console.
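Optionally, you can also query the new data store from the command line. This is a sketch assuming the Discovery Engine v1 REST API and the default_search serving config; substitute your own project ID, data store ID, and query:

export PROJECT_ID="PROJECT_ID"
export ENGINE_1="VERTEX_AI_SEARCH_DATASTORE_ID"

curl -X POST \
    -H "Authorization: Bearer $(gcloud auth print-access-token)" \
    -H "Content-Type: application/json" \
    "https://discoveryengine.googleapis.com/v1/projects/$PROJECT_ID/locations/global/collections/default_collection/dataStores/$ENGINE_1/servingConfigs/default_search:search" \
    -d '{"query": "Alphabet revenue", "pageSize": 5}'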

3. Google Cloud Storage

Create a bucket that will be used for uploading new documents through the UI.

  1. Create a GCS bucket either in the console or by running this gcloud command in Cloud Shell (replace BUCKET_NAME with your bucket name):
    export BUCKET_FOR_UPLOAD="BUCKET_NAME"
    
    gcloud storage buckets create gs://$BUCKET_FOR_UPLOAD
    
  2. Configure CORS for your bucket.
    echo '[{"origin": ["*"], "method": ["GET"], "responseHeader": ["Content-Type"],"maxAgeSeconds": 3600}]' > cors.json
    
    gsutil cors set cors.json gs://$BUCKET_FOR_UPLOAD
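    To confirm the policy was applied, read it back:
    gsutil cors get gs://$BUCKET_FOR_UPLOAD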
    

Deployment Instructions

These instructions are intended to be run in your GCP project's Cloud Shell.
If you would prefer to run them locally, set up Application Default Credentials (ADC) first.

1. Clone this project repository

Follow this guide to create your personal access token. Then substitute the <token_name> and <token_value> below and run the command to clone the repository.

git clone https://<token_name>:<token_value>@gitlab.com/google-cloud-ce/googlers/elroylbj/cymbal-search.git

2. Set environment variables

Please replace these with your project details:

  • Add ENGINE_2 and ENGINE_3 if you want more apps.
  • Leave MODEL_1 and MODEL_2 values unchanged if there are no changes to the PaLM models.
export PROJECT_ID="PROJECT_ID"
export BUCKET_FOR_UPLOAD="BUCKET_NAME"
export ENGINE_1="VERTEX_AI_SEARCH_DATASTORE_ID"
export MODEL_1="text-bison"
export MODEL_2="text-bison-32k"

Set deployment details (you can keep the default values):

export APP_NAME=cymbalsearch
export SERVICE_ACC=cymbalsearch-sa
export REGION=asia-southeast1
export REPOSITORY=my-repo
export IMAGE=my-image
export TAG=1.0
  • REGION is the regional or multi-regional location of the repository.
  • REPOSITORY is the name of the repository where the image is stored.
  • IMAGE is the name of the image in the repository.
  • TAG is the tag of the image version that you want to pull.
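Together, these values form the full Artifact Registry image path used in the following steps, for example:

echo "$REGION-docker.pkg.dev/$PROJECT_ID/$REPOSITORY/$IMAGE:$TAG"
# e.g. asia-southeast1-docker.pkg.dev/PROJECT_ID/my-repo/my-image:1.0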

3. Enable APIs, create a Service Account, and push the Docker image to Artifact Registry

Firstly, ensure you are in the root directory of the repository.

cd cymbal-search/

Next, run the script below:

chmod +x deploy.sh
./deploy.sh

The deploy.sh script does the following:

  1. Enables the required APIs (requires the roles/servicemanagement.serviceConsumer role).
  2. Creates a Service Account with the required permissions (requires the roles/iam.serviceAccountCreator role).
  3. Creates an Artifact Registry repository (requires the roles/artifactregistry.admin role).
  4. Builds the image and pushes it to Artifact Registry.

The build takes around 10 minutes to complete. Once it is done, visit the Artifact Registry console to view your Docker image.
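For reference, the script is roughly equivalent to commands along the following lines. This is a simplified sketch, not the actual contents of deploy.sh; the exact API and role lists may differ:

# Enable commonly required APIs (exact list may differ from deploy.sh)
gcloud services enable run.googleapis.com artifactregistry.googleapis.com \
    cloudbuild.googleapis.com discoveryengine.googleapis.com aiplatform.googleapis.com

# Create the service account and grant it a role the app needs (repeat per role)
gcloud iam service-accounts create $SERVICE_ACC
gcloud projects add-iam-policy-binding $PROJECT_ID \
    --member="serviceAccount:$SERVICE_ACC@$PROJECT_ID.iam.gserviceaccount.com" \
    --role="roles/discoveryengine.admin"

# Create the Artifact Registry repository, then build and push the image with Cloud Build
gcloud artifacts repositories create $REPOSITORY \
    --repository-format=docker --location=$REGION
gcloud builds submit --tag $REGION-docker.pkg.dev/$PROJECT_ID/$REPOSITORY/$IMAGE:$TAG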

4. Deploy your app on either Cloud Run or App Engine.

Option 1: Cloud Run

  1. If you are running in an Argolis project, override the Organization Policy as seen here to allow unauthenticated invocations.
  2. Deploy to Cloud Run using the command below. Add ENGINE_2 and ENGINE_3 to the env vars if you are using them.
    gcloud run deploy $APP_NAME \
        --image $REGION-docker.pkg.dev/$PROJECT_ID/$REPOSITORY/$IMAGE:$TAG \
        --region $REGION \
        --service-account=$SERVICE_ACC@$PROJECT_ID.iam.gserviceaccount.com \
        --allow-unauthenticated \
        --set-env-vars=PROJECT_ID=$PROJECT_ID,BUCKET_FOR_UPLOAD=$BUCKET_FOR_UPLOAD,ENGINE_1=$ENGINE_1,MODEL_1=$MODEL_1,MODEL_2=$MODEL_2
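
    Once the deployment finishes, you can print the service URL with:
    gcloud run services describe $APP_NAME --region $REGION --format='value(status.url)'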
    

Option 2: App Engine

  1. Create an app.yaml file at the project root where the Dockerfile is located. See the app.yaml reference for more configuration options.
    echo "runtime: custom
    env: flex
    env_variables:
        PROJECT_ID: $PROJECT_ID
        BUCKET_FOR_UPLOAD: $BUCKET_FOR_UPLOAD
        ENGINE_1: $ENGINE_1
        MODEL_1: $MODEL_1
        MODEL_2: $MODEL_2
    " > app.yaml
    
    Check that your env variables are reflected in the app.yaml:
    cat app.yaml
    
  2. To deploy the app, run the following command from the directory where your app.yaml and Dockerfile are located:
    gcloud app deploy --image-url=$REGION-docker.pkg.dev/$PROJECT_ID/$REPOSITORY/$IMAGE:$TAG
    
  3. To see your app running at https://PROJECT_ID.REGION_ID.r.appspot.com, run the following command to launch your browser:
    gcloud app browse
    
  • If there are any permission errors, check which Service Account your App Engine app is using (likely the App Engine default service account with the Editor role).
    Go to the IAM page in the console to ensure the Service Account has the following roles (for convenience, the Editor role alone will also solve the issue):
    1. Discovery Engine Admin
    2. Logs Writer
    3. Storage Object User
    4. Vertex AI User
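    For example, these roles can be granted from the command line. This is a sketch assuming the App Engine default service account (PROJECT_ID@appspot.gserviceaccount.com); APPENGINE_SA is just a helper variable for this sketch, and the role IDs correspond to the role names above:
    export APPENGINE_SA="$PROJECT_ID@appspot.gserviceaccount.com"
    for ROLE in roles/discoveryengine.admin roles/logging.logWriter roles/storage.objectUser roles/aiplatform.user; do
      gcloud projects add-iam-policy-binding $PROJECT_ID \
          --member="serviceAccount:$APPENGINE_SA" --role="$ROLE"
    done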

Local Development

All commands should be run from the project root directory.

  1. Download the required software (Python, Node.js with npm, and the gcloud CLI) and add it to your PATH.

  2. Set your GCP Project environment variables.

    export PROJECT_ID="Project ID"
    export BUCKET_FOR_UPLOAD="GCS Bucket name for document upload"
    export ENGINE_1="Vertex AI Search Engine ID"
    export ENGINE_2="Vertex AI Search Engine ID of a 2nd app (if any)"
    export ENGINE_3="Vertex AI Search Engine ID of a 3rd app (if any)"
    export MODEL_1="text-bison"
    export MODEL_2="text-bison-32k"
    
  3. Set up Application Default Credentials (ADC) for your local development environment.

    1. Install and initialize the gcloud CLI.
    2. Log in with your Google Cloud user credentials and point ADC at your project:
      gcloud auth application-default login
      gcloud auth application-default set-quota-project <PROJECT_ID>
      gcloud config set project <PROJECT_ID>
      
  4. Install Python dependencies.

    python -m venv env
    source env/bin/activate
    pip install -r backend/requirements.txt
    
  5. Run Python backend.

    python backend/app.py
    
  6. Open another terminal and install the React dependencies.

    npm install
    
  7. Run the React app.

    npm start
    

    Open http://localhost:3000 to access the app; hot reload lets you see your changes as you develop.

Getting Started with Create React App and Redux

This project was bootstrapped with Create React App, using the Redux and Redux Toolkit template.

You can learn more in the Create React App documentation.

To learn React, check out the React documentation.

Author

Created by elroylbj@

Disclaimer: This is NOT an official Google project.
