GitBook: [master] 5 pages modified
woop authored and Shu Heng committed Feb 13, 2020
1 parent 50a9992 commit c965a77
Showing 5 changed files with 220 additions and 50 deletions.
1 change: 1 addition & 0 deletions docs/SUMMARY.md
@@ -11,6 +11,7 @@
* [Overview](installing-feast/overview.md)
* [Docker Compose](installing-feast/docker-compose.md)
* [Google Kubernetes Engine \(GKE\)](installing-feast/gke.md)
* [Troubleshooting](installing-feast/troubleshooting.md)

## Using Feast

32 changes: 20 additions & 12 deletions docs/installing-feast/docker-compose.md
@@ -46,6 +46,8 @@ cp .env.sample .env

## 2. Docker Compose for Online Serving Only

### 2.1 Start Feast \(without batch retrieval support\)

If you do not require batch serving, then it's possible to simply bring up Feast:

```bash
@@ -56,13 +58,15 @@ A Jupyter Notebook environment is now available to use Feast:

[http://localhost:8888/tree/feast/examples](http://localhost:8888/tree/feast/examples)

## 2. Docker Compose for Online and Batch Serving
## 3. Docker Compose for Online and Batch Serving

{% hint style="info" %}
Batch serving requires Google Cloud Platform \(GCP\) to function, specifically Google Cloud Storage and BigQuery.
{% endhint %}

Create a [service account](https://cloud.google.com/iam/docs/creating-managing-service-accounts) from the GCP console and copy its key file to the `gcp-service-accounts` folder:
### 3.1 Set up Google Cloud Platform

Create a [service account](https://cloud.google.com/iam/docs/creating-managing-service-accounts) from the GCP console and copy its key file to the `infra/docker-compose/gcp-service-accounts` folder:

```bash
cp my-service-account.json ${FEAST_HOME_DIR}/infra/docker-compose/gcp-service-accounts
@@ -74,28 +78,32 @@ Create a Google Cloud Storage bucket. Make sure that your service account above
gsutil mb gs://my-feast-staging-bucket
```

### 2.1 Configure .env
### 3.2 Configure .env

Configure the `.env` file based on your environment. At the very least you have to modify:

* **FEAST\_CORE\_GCP\_SERVICE\_ACCOUNT\_KEY:** This should be your service account file name, for example `key.json`.
* **FEAST\_BATCH\_SERVING\_GCP\_SERVICE\_ACCOUNT\_KEY:** This should be your service account file name, for example `key.json`.
* **FEAST\_JUPYTER\_GCP\_SERVICE\_ACCOUNT\_KEY:** This should be your service account file name, for example `key.json`.
* **FEAST\_JOB\_STAGING\_LOCATION:** Google Cloud Storage bucket that Feast will use to stage data exports and batch retrieval requests, for example `gs://your-gcs-bucket/staging`
| Parameter | Description |
| :--- | :--- |
| FEAST\_CORE\_GCP\_SERVICE\_ACCOUNT\_KEY | This should be your service account file name, for example `key.json`. |
| FEAST\_BATCH\_SERVING\_GCP\_SERVICE\_ACCOUNT\_KEY | This should be your service account file name, for example `key.json`. |
| FEAST\_JUPYTER\_GCP\_SERVICE\_ACCOUNT\_KEY | This should be your service account file name, for example `key.json`. |
| FEAST\_JOB\_STAGING\_LOCATION | Google Cloud Storage bucket that Feast will use to stage data exports and batch retrieval requests, for example `gs://your-gcs-bucket/staging`. |
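
Before starting the containers, it can help to confirm that the parameters listed above are actually set. The following is a minimal sketch, not part of Feast: `missing_env_keys` is a hypothetical helper, and it assumes the `.env` file uses plain `KEY=value` lines.

```bash
# Hypothetical helper (not part of Feast): print every required key that is
# missing from the given env file or has an empty value.
# Assumes plain KEY=value lines, one per line.
missing_env_keys() {
  local env_file="$1" key
  for key in FEAST_CORE_GCP_SERVICE_ACCOUNT_KEY \
             FEAST_BATCH_SERVING_GCP_SERVICE_ACCOUNT_KEY \
             FEAST_JUPYTER_GCP_SERVICE_ACCOUNT_KEY \
             FEAST_JOB_STAGING_LOCATION; do
    # A key counts as set only if it appears as KEY=<non-empty value>.
    grep -Eq "^${key}=.+" "$env_file" || echo "$key"
  done
}
```

Running `missing_env_keys .env` from `infra/docker-compose` prints nothing when all four parameters have a value.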

### 2.2 Configure .bq-store.yml
### 3.3 Configure .bq-store.yml

We also need to edit the `bq-store.yml` file inside `infra/docker-compose/serving/` to set the BigQuery storage configuration and the feature sets that the store subscribes to. At a minimum you will need to set:

* **project\_id:** This is your [GCP project Id](https://cloud.google.com/resource-manager/docs/creating-managing-projects).
* **dataset\_id:** This is the name of the BigQuery dataset that tables will be created in. Each feature set will have one table in BigQuery.
| Parameter | Description |
| :--- | :--- |
| bigquery\_config.project\_id | This is your [GCP project Id](https://cloud.google.com/resource-manager/docs/creating-managing-projects). |
| bigquery\_config.dataset\_id | This is the name of the BigQuery dataset that tables will be created in. Each feature set will have one table in BigQuery. |

### 2.3 Start Feast with batch retrieval support
### 3.4 Start Feast \(with batch retrieval support\)

Start Feast:

```bash
docker-compose -f docker-compose.yml -f docker-compose.batch.yml up -d
docker-compose up -d
```

A Jupyter Notebook environment is now available to use Feast:
67 changes: 30 additions & 37 deletions docs/installing-feast/gke.md
@@ -13,7 +13,6 @@ This guide will install Feast into a Kubernetes cluster on GCP. It assumes that
This guide requires [Google Cloud Platform](https://cloud.google.com/) for installation.

* [BigQuery](https://cloud.google.com/bigquery/) is used for storing historical features.
* [Cloud Dataflow](https://cloud.google.com/dataflow/) is used for running data ingestion jobs.
* [Google Cloud Storage](https://cloud.google.com/storage/) is used for intermediate data storage.
{% endhint %}

@@ -34,38 +33,26 @@ export FEAST_GCP_ZONE=us-central1-a
export FEAST_BIGQUERY_DATASET_ID=feast
export FEAST_GCS_BUCKET=${FEAST_GCP_PROJECT_ID}_feast_bucket
export FEAST_GKE_CLUSTER_NAME=feast
export FEAST_S_ACCOUNT_NAME=feast-sa
export FEAST_SERVICE_ACCOUNT_NAME=feast-sa
```

Create a Google Cloud Storage bucket for Feast to stage data during exports:
Create a Google Cloud Storage bucket for Feast to stage batch data exports:

```bash
gsutil mb gs://${FEAST_GCS_BUCKET}
```

Create a BigQuery dataset for storing historical features:

```bash
bq mk ${FEAST_BIGQUERY_DATASET_ID}
```

Create the service account that Feast will run as:

```bash
gcloud iam service-accounts create ${FEAST_S_ACCOUNT_NAME}
gcloud iam service-accounts create ${FEAST_SERVICE_ACCOUNT_NAME}

gcloud projects add-iam-policy-binding ${FEAST_GCP_PROJECT_ID} \
--member serviceAccount:${FEAST_S_ACCOUNT_NAME}@${FEAST_GCP_PROJECT_ID}.iam.gserviceaccount.com \
--member serviceAccount:${FEAST_SERVICE_ACCOUNT_NAME}@${FEAST_GCP_PROJECT_ID}.iam.gserviceaccount.com \
--role roles/editor

gcloud iam service-accounts keys create key.json --iam-account \
${FEAST_S_ACCOUNT_NAME}@${FEAST_GCP_PROJECT_ID}.iam.gserviceaccount.com
```

Ensure that [Dataflow API is enabled](https://console.cloud.google.com/apis/api/dataflow.googleapis.com/overview):

```bash
gcloud services enable dataflow.googleapis.com
${FEAST_SERVICE_ACCOUNT_NAME}@${FEAST_GCP_PROJECT_ID}.iam.gserviceaccount.com
```

## 2. Set up a Kubernetes \(GKE\) cluster
@@ -87,7 +74,7 @@ Create a secret in the GKE cluster based on your local key `key.json`:
kubectl create secret generic feast-gcp-service-account --from-file=key.json
```

For this guide we will use `NodePort` for exposing Feast services. In order to do so, we must find an internal IP of at least one GKE node.
For this guide we will use `NodePort` for exposing Feast services. To do so, we must find the external IP of at least one GKE node. This should be a public IP.

```bash
export FEAST_IP=$(kubectl describe nodes | grep ExternalIP | awk '{print $2}' | head -n 1)
@@ -96,18 +83,6 @@ export FEAST_ONLINE_SERVING_URL=${FEAST_IP}:32091
export FEAST_BATCH_SERVING_URL=${FEAST_IP}:32092
```
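
If no node has a public address, the pipeline above leaves `FEAST_IP` empty without any warning. The sketch below wraps the same extraction in a hypothetical helper, `first_external_ip` (not part of Feast or kubectl), that fails loudly instead. It assumes the `ExternalIP:` line format that `kubectl describe nodes` prints in its `Addresses` section.

```bash
# Hypothetical helper: read `kubectl describe nodes` output on stdin and print
# the first ExternalIP address. Returns non-zero if no such line exists.
first_external_ip() {
  local ip
  # Same effect as: grep ExternalIP | awk '{print $2}' | head -n 1
  ip="$(awk '/ExternalIP/ { print $2; exit }')"
  if [ -z "$ip" ]; then
    echo "no ExternalIP found; do the nodes have public addresses?" >&2
    return 1
  fi
  echo "$ip"
}
```

It could be used as `export FEAST_IP=$(kubectl describe nodes | first_external_ip)`.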

Confirm that you are able to access this node \(please make sure that no firewall rules are preventing access to these ports\):

```bash
ping $FEAST_IP
```

```bash
PING 10.123.114.11 (10.203.164.22) 56(84) bytes of data.
64 bytes from 10.123.114.11: icmp_seq=1 ttl=63 time=54.2 ms
64 bytes from 10.123.114.11: icmp_seq=2 ttl=63 time=51.2 ms
```

Add firewall rules to open up ports on your Google Cloud Platform project:

```bash
@@ -170,25 +145,29 @@ cp values.yaml my-feast-values.yaml
Update `my-feast-values.yaml` based on your GCP and GKE environment.

* Required fields are paired with comments which indicate whether they need to be replaced.
* All occurrences of `EXTERNAL_IP` should be replaced with either your domain name or the IP stored in `$FEAST_IP`.
* All occurrences of `EXTERNAL_IP` should be replaced with either a domain pointing to a load balancer for the cluster or the IP stored in `$FEAST_IP`.
* Replace all occurrences of `YOUR_BUCKET_NAME` with your bucket name stored in `$FEAST_GCS_BUCKET`
* Change `feast-serving-batch.store.yaml.bigquery_config.project_id` to your GCP project Id.
* Change `feast-serving-batch.store.yaml.bigquery_config.dataset_id` to the BigQuery dataset that Feast should use.

Install the Feast Helm chart:

```bash
helm install --name feast -f my-feast-values.yaml .
```

Ensure that the system comes online. This will take a few minutes
Ensure that the system comes online. This will take a few minutes.

```bash
watch kubectl get pods
kubectl get pods
```

There may be pod restarts while waiting for Kafka to come online.

```bash
NAME READY STATUS RESTARTS AGE
pod/feast-feast-core-666fd46db4-l58l6 1/1 Running 0 5m
pod/feast-feast-serving-online-84d99ddcbd 1/1 Running 0 6m
pod/feast-feast-core-666fd46db4-l58l6 1/1 Running 2 5m
pod/feast-feast-serving-online-84d99ddcbd 1/1 Running 3 6m
pod/feast-kafka-0 1/1 Running 0 3m
pod/feast-kafka-1 1/1 Running 0 4m
pod/feast-kafka-2 1/1 Running 0 4m
@@ -204,7 +183,7 @@ pod/feast-zookeeper-2 1/1 Running 0 5m
Install the Python SDK using pip:

```bash
pip install -e ${FEAST_HOME_DIR}/sdk/python
pip install feast
```

Configure the Feast Python SDK:
@@ -214,5 +193,19 @@ feast config set core_url ${FEAST_CORE_URL}
feast config set serving_url ${FEAST_ONLINE_SERVING_URL}
```

Test whether you are able to connect to Feast Core:

```text
feast projects list
```

This should print an empty list:

```text
NAME
```

That's it! You can now start to use Feast!

Please see our [examples](https://github.com/gojek/feast/blob/master/examples/) to get started.

2 changes: 1 addition & 1 deletion docs/installing-feast/overview.md
@@ -8,7 +8,7 @@ This installation guide will demonstrate three ways of installing Feast:
* Does not officially support a production job manager like Dataflow
* [**Google Kubernetes Engine**](gke.md)**:**
* Recommended way to install Feast for production use.
* The guide has dependencies on BigQuery, Dataflow, and Google Cloud Storage.
* The guide has dependencies on BigQuery and Google Cloud Storage.



168 changes: 168 additions & 0 deletions docs/installing-feast/troubleshooting.md
@@ -0,0 +1,168 @@
# Troubleshooting

If at any point in time you cannot resolve a problem, please see the [Getting Help](../getting-help.md) section for reaching out to the Feast community.

## How can I verify that all services are operational?

### Docker Compose

The containers should be in an `up` state:

```text
docker ps
```

```text
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
d7447205bced jupyter/datascience-notebook:latest "tini -g -- start-no…" 2 minutes ago Up 2 minutes 0.0.0.0:8888->8888/tcp feast_jupyter_1
8e49dbe81b92 gcr.io/kf-feast/feast-serving:latest "java -Xms1024m -Xmx…" 2 minutes ago Up 5 seconds 0.0.0.0:6567->6567/tcp feast_batch-serving_1
b859494bd33a gcr.io/kf-feast/feast-serving:latest "java -jar /opt/feas…" 2 minutes ago Up About a minute 0.0.0.0:6566->6566/tcp feast_online-serving_1
5c4962811767 gcr.io/kf-feast/feast-core:latest "java -jar /opt/feas…" 2 minutes ago Up 2 minutes 0.0.0.0:6565->6565/tcp feast_core_1
1ba7239e0ae0 confluentinc/cp-kafka:5.2.1 "/etc/confluent/dock…" 2 minutes ago Up 2 minutes 0.0.0.0:9092->9092/tcp, 0.0.0.0:9094->9094/tcp feast_kafka_1
e2779672735c confluentinc/cp-zookeeper:5.2.1 "/etc/confluent/dock…" 2 minutes ago Up 2 minutes 2181/tcp, 2888/tcp, 3888/tcp feast_zookeeper_1
39ac26f5c709 postgres:12-alpine "docker-entrypoint.s…" 2 minutes ago Up 2 minutes 5432/tcp feast_db_1
3c4ee8616096 redis:5-alpine "docker-entrypoint.s…" 2 minutes ago Up 2 minutes 0.0.0.0:6379->6379/tcp feast_redis_1
```
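
Scanning the table by eye gets tedious when repeated. The following sketch compares the running container names against an expected list. `missing_containers` is a hypothetical helper, and the container names are taken from the sample output above; they may differ if your Compose project name is not `feast`.

```bash
# Hypothetical helper: read running container names on stdin (one per line)
# and print each expected name, given as an argument, that is not running.
missing_containers() {
  local running name
  running="$(cat)"
  for name in "$@"; do
    # -F: literal match, -x: whole line, so feast_core_1 does not match
    # a longer name that merely contains it.
    printf '%s\n' "$running" | grep -Fxq -- "$name" || echo "$name"
  done
}
```

For example, `docker ps --format '{{.Names}}' | missing_containers feast_core_1 feast_online-serving_1 feast_kafka_1 feast_redis_1` prints nothing when all four containers are up.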

### Google Kubernetes Engine

All services should be in either a `Running` or a `Completed` state:

```text
kubectl get pods
```

```text
NAME READY STATUS RESTARTS AGE
feast-feast-core-5ff566f946-4wlbh 1/1 Running 1 32m
feast-feast-serving-batch-848d74587b-96hq6 1/1 Running 2 32m
feast-feast-serving-online-df69755d5-fml8v 1/1 Running 2 32m
feast-kafka-0 1/1 Running 1 32m
feast-kafka-1 1/1 Running 0 30m
feast-kafka-2 1/1 Running 0 29m
feast-kafka-config-3e860262-zkzr8 0/1 Completed 0 32m
feast-postgresql-0 1/1 Running 0 32m
feast-prometheus-statsd-exporter-554db85b8d-r4hb8 1/1 Running 0 32m
feast-redis-master-0 1/1 Running 0 32m
feast-zookeeper-0 1/1 Running 0 32m
feast-zookeeper-1 1/1 Running 0 32m
feast-zookeeper-2 1/1 Running 0 31m
```
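
To list only the pods that need attention, the status column can be filtered. This is a minimal sketch with a hypothetical helper, `unhealthy_pods`, which assumes the default `kubectl get pods` column layout shown above (NAME, READY, STATUS, ...).

```bash
# Hypothetical helper: read `kubectl get pods` output on stdin and print the
# name and status of every pod that is neither Running nor Completed.
unhealthy_pods() {
  # NR > 1 skips the header row; $3 is the STATUS column.
  awk 'NR > 1 && $3 != "Running" && $3 != "Completed" { print $1, $3 }'
}
```

Usage: `kubectl get pods | unhealthy_pods`. Empty output means every pod is in one of the two expected states. Note that a pod can be `Running` while still restarting, so the RESTARTS column remains worth a look.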

## How can I verify that I can connect to all services?

First find the `IP:Port` combination of your services.

### **Docker Compose \(from inside the docker cluster\)**

You will probably need to connect using the hostnames of services and standard Feast ports:

```bash
export FEAST_CORE_URL=core:6565
export FEAST_ONLINE_SERVING_URL=online-serving:6566
export FEAST_BATCH_SERVING_URL=batch-serving:6567
```

### **Docker Compose \(from outside the docker cluster\)**

You will probably need to connect using `localhost` and standard ports:

```bash
export FEAST_CORE_URL=localhost:6565
export FEAST_ONLINE_SERVING_URL=localhost:6566
export FEAST_BATCH_SERVING_URL=localhost:6567
```

### **Google Kubernetes Engine \(GKE\)**

You will need to find the external IP of one of the nodes as well as the NodePorts. Please make sure that your firewall is open for these ports:

```bash
export FEAST_IP=$(kubectl describe nodes | grep ExternalIP | awk '{print $2}' | head -n 1)
export FEAST_CORE_URL=${FEAST_IP}:32090
export FEAST_ONLINE_SERVING_URL=${FEAST_IP}:32091
export FEAST_BATCH_SERVING_URL=${FEAST_IP}:32092
```

`netcat`, `telnet`, or even `curl` can be used to test whether all services are available and ports are open, but `grpc_cli` is the most powerful. It can be installed from [here](https://github.com/grpc/grpc/blob/master/doc/command_line_tool.md).
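
If none of those tools are installed, a basic port check can be done with bash alone. The sketch below is an illustration under stated assumptions: `port_open` and `report_ports` are hypothetical helpers, and they rely on bash's `/dev/tcp` redirection and the `timeout` command from GNU coreutils. An open port only shows that a TCP connection succeeds, not that the gRPC service behind it is healthy.

```bash
# Hypothetical helper: succeed if a TCP connection to HOST:PORT can be opened
# within 3 seconds.
port_open() {
  local host="${1%:*}" port="${1##*:}"
  # /dev/tcp/<host>/<port> is handled by bash itself, so bash is invoked
  # explicitly in case the calling shell is not bash.
  timeout 3 bash -c 'exec 3<>"/dev/tcp/$1/$2"' _ "$host" "$port" 2>/dev/null
}

# Hypothetical helper: print one status line per HOST:PORT argument.
report_ports() {
  local url
  for url in "$@"; do
    if port_open "$url"; then
      echo "$url open"
    else
      echo "$url unreachable"
    fi
  done
}
```

With the variables exported above, this could be run as `report_ports "$FEAST_CORE_URL" "$FEAST_ONLINE_SERVING_URL" "$FEAST_BATCH_SERVING_URL"`.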

### Testing Feast Core

```bash
grpc_cli ls ${FEAST_CORE_URL} feast.core.CoreService
```

```text
GetFeastCoreVersion
GetFeatureSet
ListFeatureSets
ListStores
ApplyFeatureSet
UpdateStore
CreateProject
ArchiveProject
ListProjects
```

### Testing Feast Batch Serving and Online Serving

```bash
grpc_cli ls ${FEAST_BATCH_SERVING_URL} feast.serving.ServingService
```

```text
GetFeastServingInfo
GetOnlineFeatures
GetBatchFeatures
GetJob
```

```bash
grpc_cli ls ${FEAST_ONLINE_SERVING_URL} feast.serving.ServingService
```

```text
GetFeastServingInfo
GetOnlineFeatures
GetBatchFeatures
GetJob
```

## How can I print logs from the Feast Services?

Feast will typically have three services that you need to monitor if something goes wrong.

* Feast Core
* Feast Serving \(Online\)
* Feast Serving \(Batch\)

In order to print the logs from these services, please run the commands below.

### Docker Compose

```text
docker logs -f feast_core_1
```

```text
docker logs -f feast_batch-serving_1
```

```text
docker logs -f feast_online-serving_1
```

### Google Kubernetes Engine

```text
kubectl logs $(kubectl get pods | grep feast-core | awk '{print $1}')
```

```text
kubectl logs $(kubectl get pods | grep feast-serving-batch | awk '{print $1}')
```

```text
kubectl logs $(kubectl get pods | grep feast-serving-online | awk '{print $1}')
```
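
The commands above assume that exactly one pod matches each pattern. As a sketch for the case where a deployment has several replicas, the pod names can be selected first and then passed to `kubectl logs` one at a time. `pods_matching` is a hypothetical helper, not part of kubectl.

```bash
# Hypothetical helper: read `kubectl get pods` output on stdin and print the
# names of all pods whose name contains the given substring.
pods_matching() {
  # NR > 1 skips the header row; index() does a literal substring match.
  awk -v pat="$1" 'NR > 1 && index($1, pat) { print $1 }'
}
```

For example, `kubectl get pods | pods_matching feast-serving-batch | xargs -n 1 kubectl logs` prints the logs of each matching pod in turn.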
