This repo containerizes BART into a serving container using FastAPI. Both CPU and GPU inference are supported.

The model license can be found here.
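For context, a Vertex AI-compatible serving container exposes health and predict routes whose paths and port arrive via `AIP_*` environment variables. The sketch below shows what such a FastAPI app can look like; the checkpoint, request schema, and file layout are assumptions for illustration, not necessarily this repo's exact code.

```python
# Minimal sketch of a Vertex AI-compatible FastAPI server for BART.
# Launch with e.g.: uvicorn app:app --host 0.0.0.0 --port $AIP_HTTP_PORT
import os

from fastapi import FastAPI, Request
from transformers import pipeline

# Vertex AI injects these variables into the container at runtime.
HEALTH_ROUTE = os.environ.get("AIP_HEALTH_ROUTE", "/health")
PREDICT_ROUTE = os.environ.get("AIP_PREDICT_ROUTE", "/predict")

app = FastAPI()

# facebook/bart-large-cnn is an assumed checkpoint; substitute whichever
# BART variant the repo actually serves.
summarizer = pipeline("summarization", model="facebook/bart-large-cnn")

@app.get(HEALTH_ROUTE)
def health():
    return {"status": "healthy"}

@app.post(PREDICT_ROUTE)
async def predict(request: Request):
    body = await request.json()
    # Vertex AI sends {"instances": [...]} and expects {"predictions": [...]}.
    texts = [instance["text"] for instance in body["instances"]]
    return {"predictions": summarizer(texts)}
```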
- Clone the repo if you haven't, then navigate to the `bart` folder.
- Build the container. Don't forget to change `project_id` to yours.

```bash
docker build . -t gcr.io/{project_id}/bart:latest
```
- Run the container. No GPU is needed for this model.

```bash
docker run --rm -p 80:8080 -e AIP_HEALTH_ROUTE=/health -e AIP_HTTP_PORT=8080 -e AIP_PREDICT_ROUTE=/predict gcr.io/{project_id}/bart:latest
```
- Make predictions against the local container:

```bash
python test_container.py
```

For the remaining steps you'll need to have enabled Vertex AI and authenticated with a service account that has the Vertex AI Admin or Editor role.
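If you'd rather poke the container by hand, a request along these lines should work (a sketch assuming the Vertex AI `instances` schema; the real `test_container.py` may differ):

```python
# Hypothetical smoke test against the locally running container.
# The docker run command above maps host port 80 to the container's 8080.
import requests

payload = {"instances": [{"text": "Long article text to summarize..."}]}
response = requests.post("http://localhost:80/predict", json=payload)
response.raise_for_status()
print(response.json())
```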
- Push the image:

```bash
gcloud auth configure-docker
docker push gcr.io/{project_id}/bart:latest
```
- Deploy to a Vertex AI endpoint:

```bash
python ../gcp_deploy.py --image-uri gcr.io/{project_id}/bart:latest --accelerator-count 0 --model-name bart --endpoint-name bart-endpoint --endpoint-deployed-name bart-deployed-name
```
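Under the hood, a deploy script like this typically uploads the image as a Vertex AI Model and deploys it to an Endpoint. A rough SDK equivalent is sketched below (region and machine type are assumptions; `gcp_deploy.py` may do this differently):

```python
# Illustrative upload-and-deploy flow with the Vertex AI SDK.
from google.cloud import aiplatform

aiplatform.init(project="{project_id}", location="us-central1")  # region assumed

model = aiplatform.Model.upload(
    display_name="bart",
    serving_container_image_uri="gcr.io/{project_id}/bart:latest",
    serving_container_predict_route="/predict",
    serving_container_health_route="/health",
    serving_container_ports=[8080],
)

endpoint = aiplatform.Endpoint.create(display_name="bart-endpoint")

# --accelerator-count 0 corresponds to a CPU-only deployment.
model.deploy(
    endpoint=endpoint,
    deployed_model_display_name="bart-deployed-name",
    machine_type="n1-standard-4",  # assumed; size to your workload
)
```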
- Test the endpoint:

```bash
python generate_request_vertex.py
```
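The request itself presumably looks something like the following (endpoint lookup and payload shape are assumptions):

```python
# Hypothetical online prediction against the deployed endpoint.
from google.cloud import aiplatform

aiplatform.init(project="{project_id}", location="us-central1")  # region assumed

# Find the endpoint created by the deploy step via its display name.
endpoint = aiplatform.Endpoint.list(filter='display_name="bart-endpoint"')[0]

response = endpoint.predict(instances=[{"text": "Long article text to summarize..."}])
print(response.predictions)
```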