Commit
Add admin-guide for hubble
Showing 29 changed files with 570 additions and 2 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,10 @@ | ||
--- | ||
title: Admin Guide | ||
sidebar_position: 15 | ||
--- | ||
|
||
import DocCardList from "@theme/DocCardList"; | ||
|
||
All you need to know about running a Hubble analytics platform. | ||
|
||
<DocCardList /> |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,10 @@ | ||
--- | ||
title: Data Curation | ||
sidebar_position: 20 | ||
--- | ||
|
||
import DocCardList from "@theme/DocCardList"; | ||
|
||
Running stellar-dbt-public to transform raw Stellar network data into analytics-friendly tables. | ||
|
||
<DocCardList /> |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,22 @@ | ||
--- | ||
title: Architecture | ||
sidebar_position: 10 | ||
--- | ||
|
||
import stellar_dbt_arch from '/img/hubble/stellar_dbt_architecture.png'; | ||
|
||
## Architecture Overview | ||
|
||
<img src={stellar_dbt_arch} width="300"/> | ||
|
||
In general, stellar-dbt-public runs as follows: | ||
|
||
* A dbt model is selected to run | ||
* Within the model run: | ||
  * Sources are referenced and used to create staging tables | ||
  * Staging tables undergo various transformations and are stored in intermediate tables | ||
  * Finishing touches and joins are applied to the intermediate tables to produce the final analytics-friendly mart tables | ||
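
The layering described in this list can be pictured with a small Python sketch. It is only an illustration of the flow from sources to staging, intermediate, and mart tables, not how dbt or stellar-dbt-public is implemented: real models are SQL files, and the record fields and function names below are hypothetical.

```python
# Illustrative only: real dbt models are SQL; names and fields here are hypothetical.

# Raw "source" rows, as they might arrive from ingestion.
raw_operations = [
    {"id": "1", "type": "payment", "amount": "10.5", "account": "GA"},
    {"id": "2", "type": "payment", "amount": "4.5", "account": "GA"},
    {"id": "3", "type": "create_account", "amount": "1.0", "account": "GB"},
]


def staging(rows):
    """Staging: light cleanup such as casting types, no business logic."""
    return [{**row, "id": int(row["id"]), "amount": float(row["amount"])} for row in rows]


def intermediate(rows):
    """Intermediate: transformations, here keeping only payment operations."""
    return [row for row in rows if row["type"] == "payment"]


def mart(rows):
    """Mart: final aggregate that analysts query, total payments per account."""
    totals = {}
    for row in rows:
        totals[row["account"]] = totals.get(row["account"], 0.0) + row["amount"]
    return totals


payments_by_account = mart(intermediate(staging(raw_operations)))
print(payments_by_account)
```

Each function consumes only the output of the previous layer, which mirrors why staging, intermediate, and mart models are kept in separate directories.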
|
||
We try to adhere to the best practices set by the [dbt docs](https://docs.getdbt.com/docs/build/projects). | ||
|
||
More detailed information about stellar-dbt-public and examples can be found in the [stellar-dbt-public](https://github.com/stellar/stellar-dbt-public/tree/master) repo. |
140 changes: 140 additions & 0 deletions
network/hubble/admin-guide/data-curation/getting-started.mdx
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,140 @@ | ||
--- | ||
title: Getting Started | ||
sidebar_position: 20 | ||
--- | ||
|
||
[stellar-dbt-public GitHub repository](https://github.com/stellar/stellar-dbt-public/tree/master) | ||
|
||
[stellar/stellar-dbt-public docker images](https://hub.docker.com/r/stellar/stellar-dbt-public) | ||
|
||
## Recommended Usage | ||
|
||
### Docker Image | ||
|
||
If you do not need to modify any of the stellar-dbt-public code, we recommend using the [stellar/stellar-dbt-public docker images](https://hub.docker.com/r/stellar/stellar-dbt-public). | ||
|
||
Example of running locally with Docker: | ||
|
||
``` | ||
docker run --platform linux/amd64 -ti stellar/stellar-dbt-public:latest <parameters> | ||
``` | ||
|
||
### Import stellar-dbt-public as a dbt Package | ||
|
||
Alternatively, if you need to build your own models on top of stellar-dbt-public, you can import stellar-dbt-public as a dbt package into a separate dbt project. | ||
|
||
Example instructions: | ||
|
||
* Create a new file `packages.yml` in your dbt project (not the stellar-dbt-public project) with the yml below | ||
|
||
``` | ||
packages: | ||
  - git: "https://github.com/stellar/stellar-dbt-public.git" | ||
    revision: v0.0.28 | ||
``` | ||
|
||
* (Optional) Update your profiles.yml to include profile configurations for stellar-dbt-public | ||
|
||
``` | ||
new_project: | ||
  target: test | ||
  outputs: | ||
    test: | ||
      project: <project> | ||
      dataset: <dataset> | ||
      <other configurations> | ||
stellar_dbt_public: | ||
  target: test | ||
  outputs: | ||
    test: | ||
      project: <project> | ||
      dataset: <dataset> | ||
      <other configurations> | ||
``` | ||
|
||
* (Optional) Update your dbt_project.yml to include project configurations for stellar-dbt-public | ||
|
||
``` | ||
name: 'stellar_dbt' | ||
version: '1.0.0' | ||
config-version: 2 | ||
profile: 'new_project' | ||
model-paths: ["models"] | ||
analysis-paths: ["analyses"] | ||
test-paths: ["tests"] | ||
seed-paths: ["seeds"] | ||
macro-paths: ["macros"] | ||
snapshot-paths: ["snapshots"] | ||
target-path: "target" | ||
clean-targets: | ||
  - "target" | ||
  - "dbt_packages" | ||
models: | ||
  new_project: | ||
    staging: | ||
      +materialized: view | ||
    intermediate: | ||
      +materialized: ephemeral | ||
    marts: | ||
      +materialized: table | ||
  stellar_dbt_public: | ||
    staging: | ||
      +materialized: ephemeral | ||
    intermediate: | ||
      +materialized: ephemeral | ||
    marts: | ||
      +materialized: table | ||
``` | ||
|
||
* Models from the stellar-dbt-public package/repo will now be available in your new dbt project | ||
|
||
## Building and Running Locally | ||
|
||
### Clone the repo | ||
|
||
``` | ||
git clone https://github.com/stellar/stellar-dbt-public | ||
``` | ||
|
||
### Install required python packages | ||
|
||
``` | ||
pip install --upgrade pip && pip install -r requirements.txt | ||
``` | ||
|
||
### Install required dbt packages | ||
|
||
``` | ||
dbt deps | ||
``` | ||
|
||
### Running dbt | ||
|
||
* There are many useful commands that come with dbt which can be found in the [dbt documentation](https://docs.getdbt.com/reference/dbt-commands#available-commands) | ||
* stellar-dbt-public is designed to use the `dbt build` command which will `run` the model and `test` the model table output | ||
* (Optional) run with the `--full-refresh` option | ||
|
||
``` | ||
dbt build --full-refresh | ||
``` | ||
|
||
* Subsequent runs can use incremental mode, which inserts only the newest data instead of rebuilding all of history every time | ||
|
||
``` | ||
dbt build | ||
``` | ||
|
||
* You can also specify just a single model if you don't want to run all stellar-dbt-public models | ||
|
||
``` | ||
dbt build --select <model name or tag> | ||
``` | ||
|
||
Please see the [stellar-dbt-public/models/marts](https://github.com/stellar/stellar-dbt-public/tree/master/models/marts) directory for a full list of the available models that dbt can run. |
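
If you call dbt from a script rather than typing the commands by hand, the three invocations above can be assembled from one small helper. The following is a minimal sketch, assuming dbt is installed and on your `PATH`; the helper name `dbt_build_command` and the model name `my_model` are hypothetical, and only the `--full-refresh` and `--select` options shown above are used.

```python
import shlex


def dbt_build_command(select=None, full_refresh=False):
    """Return the argument list for a `dbt build` invocation."""
    args = ["dbt", "build"]
    if full_refresh:
        args.append("--full-refresh")
    if select:
        args.extend(["--select", select])
    return args


# First run: rebuild everything from scratch.
print(shlex.join(dbt_build_command(full_refresh=True)))
# Later runs: incremental build of a single (hypothetical) model name.
print(shlex.join(dbt_build_command(select="my_model")))
```

To actually execute a command, the returned list can be passed to `subprocess.run(args, check=True)` from the dbt project directory.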
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,15 @@ | ||
--- | ||
title: "Overview" | ||
sidebar_position: 0 | ||
--- | ||
|
||
Data curation in Hubble is done through [stellar-dbt-public](https://github.com/stellar/stellar-dbt-public). stellar-dbt-public transforms raw Stellar network data from BigQuery datasets and tables into aggregates for more user-friendly analytics. | ||
|
||
It is worth noting that most users will not need to stand up and run their own stellar-dbt-public instance. The Stellar Development Foundation provides public access to fully transformed Stellar network data through the public datasets and tables in GCP BigQuery. Instructions on how to access this data can be found in the [Connecting](https://developers.stellar.org/network/hubble/analyst-guide/connecting) section. | ||
|
||
## Why Run stellar-dbt-public? | ||
|
||
Running stellar-dbt-public within your own infrastructure provides a number of benefits. You can: | ||
|
||
- Have full operational control without dependency on the Stellar Development Foundation for network data | ||
- Run modified ETL/ELT pipelines that fit your individual business needs |
10 changes: 10 additions & 0 deletions
network/hubble/admin-guide/scheduling-and-orchestration/README.mdx
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,10 @@ | ||
--- | ||
title: Scheduling and Orchestration | ||
sidebar_position: 100 | ||
--- | ||
|
||
import DocCardList from "@theme/DocCardList"; | ||
|
||
Stitching all the components together. | ||
|
||
<DocCardList /> |
18 changes: 18 additions & 0 deletions
network/hubble/admin-guide/scheduling-and-orchestration/architecture.mdx
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,18 @@ | ||
--- | ||
title: Architecture | ||
sidebar_position: 10 | ||
--- | ||
|
||
import stellar_etl_airflow_arch from '/img/hubble/stellar_etl_airflow_architecture.png'; | ||
|
||
## Architecture Overview | ||
|
||
<img src={stellar_etl_airflow_arch} width="300"/> | ||
|
||
In general, stellar-etl-airflow runs as follows: | ||
|
||
* Scheduling DAGs to run `stellar-etl` and upload the output data to BigQuery | ||
* Scheduling DAGs to run `stellar-dbt-public` using the data in BigQuery | ||
* We try to adhere to the best practices set by the [dbt docs](https://docs.getdbt.com/docs/build/projects) | ||
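
The ordering implied by these steps can be sketched with Python's standard-library `graphlib`. This only models the logical dependency between the steps (export, load, transform); it is not the actual DAG code in stellar-etl-airflow, and the task names are hypothetical.

```python
from graphlib import TopologicalSorter

# Each key depends on the tasks in its value set. Task names are hypothetical.
dependencies = {
    "run_stellar_etl_export": set(),
    "load_export_into_bigquery": {"run_stellar_etl_export"},
    "run_stellar_dbt_public": {"load_export_into_bigquery"},
}

# static_order() yields tasks so that every task comes after its dependencies.
execution_order = list(TopologicalSorter(dependencies).static_order())
print(execution_order)
```

Airflow resolves task dependencies in a similar way, which is why the dbt step only runs against data that has already landed in BigQuery.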
|
||
More detailed information about stellar-etl-airflow can be found in the [stellar-etl-airflow](https://github.com/stellar/stellar-etl-airflow/tree/master) repo. |
87 changes: 87 additions & 0 deletions
network/hubble/admin-guide/scheduling-and-orchestration/getting-started.mdx
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,87 @@ | ||
--- | ||
title: Getting Started | ||
sidebar_position: 20 | ||
--- | ||
|
||
import history_table_export from '/img/hubble/history_table_export.png'; | ||
import state_table_export from '/img/hubble/state_table_export.png'; | ||
import dbt_enriched_base_tables from '/img/hubble/dbt_enriched_base_tables.png'; | ||
|
||
[stellar-etl-airflow GitHub repository](https://github.com/stellar/stellar-etl-airflow/tree/master) | ||
|
||
## GCP Account Setup | ||
|
||
The Stellar Development Foundation runs Hubble in GCP using Composer and BigQuery. To follow the same deployment, you will need access to a GCP project. Instructions can be found in the [Get Started](https://cloud.google.com/docs/get-started) documentation from Google. | ||
|
||
Note: BigQuery and Composer should be available by default. If they are not, you can find instructions for enabling them in the [BigQuery](https://cloud.google.com/bigquery?hl=en) or [Composer](https://cloud.google.com/composer?hl=en) Google documentation. | ||
|
||
## Create GCP Composer Instance to Run Airflow | ||
|
||
Instructions on bringing up a GCP Composer instance to run Hubble can be found in the [Installation and Setup](https://github.com/stellar/stellar-etl-airflow?tab=readme-ov-file#installation-and-setup) section in the [stellar-etl-airflow](https://github.com/stellar/stellar-etl-airflow) repository. | ||
|
||
:::note | ||
|
||
Hardware requirements vary widely depending on the Stellar network data you require. The default GCP settings may be higher or lower than what you actually need. | ||
|
||
::: | ||
|
||
## Configuring GCP Composer Airflow | ||
|
||
There are two things required for the configuration and setup of GCP Composer Airflow: | ||
|
||
* Upload DAGs to the Composer Airflow Bucket | ||
* Configure the Airflow variables for your GCP setup | ||
|
||
For more detailed instructions please see the [stellar-etl-airflow Installation and Setup](https://github.com/stellar/stellar-etl-airflow?tab=readme-ov-file#installation-and-setup) documentation. | ||
|
||
### Uploading DAGs | ||
|
||
Within the [stellar-etl-airflow](https://github.com/stellar/stellar-etl-airflow) repo there is an [upload_static_to_gcs.sh](https://github.com/stellar/stellar-etl-airflow/blob/master/upload_static_to_gcs.sh) shell script that will upload all the DAGs and schemas into your Composer Airflow bucket. | ||
|
||
This can also be done using the [gcloud CLI or console](https://cloud.google.com/storage/docs/uploading-objects) and manually selecting the DAGs and schemas you wish to upload. | ||
|
||
### Configuring Airflow Variables | ||
|
||
Please see the [Airflow Variables Explanation](https://github.com/stellar/stellar-etl-airflow?tab=readme-ov-file#airflow-variables-explanation) documentation for more information about which variables need to be configured. | ||
|
||
## Running the DAGs | ||
|
||
To run a DAG, toggle it on in the Airflow UI, as seen below: | ||
|
||
![Toggle DAGs](/img/hubble/airflow_dag_toggle.png) | ||
|
||
More information about each DAG can be found in the [DAG Diagrams](https://github.com/stellar/stellar-etl-airflow?tab=readme-ov-file#dag-diagrams) documentation. | ||
|
||
## Available DAGs | ||
|
||
More information can be found in the [Public DAGs](https://github.com/stellar/stellar-etl-airflow/blob/master/README.md#public-dags) section of the stellar-etl-airflow README. | ||
|
||
### History Table Export DAG | ||
|
||
[This DAG](https://github.com/stellar/stellar-etl-airflow/blob/master/dags/history_tables_dag.py): | ||
|
||
- Exports a subset of the source data (ledgers, operations, transactions, trades, effects, and assets) from Stellar using the data lake of LedgerCloseMeta files | ||
  - Optionally, this can ingest data using captive-core, but that is neither ideal nor recommended for use with Airflow | ||
- Inserts into BigQuery | ||
|
||
<img src={history_table_export} width="300"/> | ||
|
||
### State Table Export DAG | ||
|
||
[This DAG](https://github.com/stellar/stellar-etl-airflow/blob/master/dags/state_table_dag.py): | ||
|
||
- Exports accounts, account_signers, offers, claimable_balances, liquidity pools, trustlines, contract_data, contract_code, config_settings and ttl. | ||
- Inserts into BigQuery | ||
|
||
<img src={state_table_export} width="300"/> | ||
|
||
### DBT Enriched Base Tables DAG | ||
|
||
[This DAG](https://github.com/stellar/stellar-etl-airflow/blob/master/dags/dbt_enriched_base_tables_dag.py): | ||
|
||
- Creates the DBT staging views for models | ||
- Updates the enriched_history_operations table | ||
- Updates the current state tables | ||
- (Optional) Sends warnings and errors to Slack | ||
|
||
<img src={dbt_enriched_base_tables} width="300"/> |
15 changes: 15 additions & 0 deletions
network/hubble/admin-guide/scheduling-and-orchestration/overview.mdx
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,15 @@ | ||
--- | ||
title: "Overview" | ||
sidebar_position: 0 | ||
--- | ||
|
||
Hubble uses [stellar-etl-airflow](https://github.com/stellar/stellar-etl-airflow) to schedule and orchestrate all its workflows. This includes the scheduling and running of stellar-etl and stellar-dbt-public. | ||
|
||
It is worth noting that most users will not need to stand up and run their own Hubble. The Stellar Development Foundation provides public access to the data through the public datasets and tables in GCP BigQuery. Instructions on how to access this data can be found in the [Connecting](https://developers.stellar.org/network/hubble/connecting) section. | ||
|
||
## Why Run stellar-etl-airflow? | ||
|
||
Running stellar-etl-airflow within your own infrastructure provides a number of benefits. You can: | ||
|
||
- Have full operational control without dependency on the Stellar Development Foundation for network data | ||
- Run modified ETL/ELT pipelines that fit your individual business needs |
10 changes: 10 additions & 0 deletions
network/hubble/admin-guide/source-system-ingestion/README.mdx
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,10 @@ | ||
--- | ||
title: Source System Ingestion | ||
sidebar_position: 10 | ||
--- | ||
|
||
import DocCardList from "@theme/DocCardList"; | ||
|
||
Running stellar-etl for Stellar network data ingestion. | ||
|
||
<DocCardList /> |
25 changes: 25 additions & 0 deletions
network/hubble/admin-guide/source-system-ingestion/architecture.mdx
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,25 @@ | ||
--- | ||
title: Architecture | ||
sidebar_position: 10 | ||
--- | ||
|
||
import stellar_arch from '/img/hubble/stellar_overall_architecture.png'; | ||
import stellar_etl_arch from '/img/hubble/stellar_etl_architecture.png'; | ||
|
||
## Architecture Overview | ||
|
||
<img src={stellar_arch} width="600"/> | ||
|
||
<img src={stellar_etl_arch} width="300"/> | ||
|
||
In general, stellar-etl runs as follows: | ||
|
||
* Reads raw data from the Stellar network | ||
  * This can be done by running a stellar-etl export command to export data between a start and end ledger | ||
  * stellar-etl can read from two different sources: | ||
    * Captive-core directly to get LedgerCloseMeta | ||
    * A data lake of compressed LedgerCloseMeta files from Ledger Exporter | ||
* Transforms the LedgerCloseMeta XDR into an easy-to-parse JSON format | ||
* Optionally uploads the JSON files to GCS or any other cloud storage service | ||
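
Because exports are driven by a start and end ledger, a scheduler usually has to split a long ledger range into smaller pieces before calling the export command. The sketch below shows one way to do that in Python. It is an assumption-laden illustration, not part of stellar-etl: the function name `ledger_batches` is hypothetical, and it assumes both ends of each range are inclusive, which you should confirm against the stellar-etl documentation.

```python
def ledger_batches(start_ledger, end_ledger, batch_size):
    """Split the inclusive range [start_ledger, end_ledger] into inclusive batches."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    batches = []
    batch_start = start_ledger
    while batch_start <= end_ledger:
        # Clamp the last batch so it never runs past end_ledger.
        batch_end = min(batch_start + batch_size - 1, end_ledger)
        batches.append((batch_start, batch_end))
        batch_start = batch_end + 1
    return batches


print(ledger_batches(1000, 1250, 100))
```

Each `(start, end)` pair can then be handed to a separate export run, so that a failure only requires re-exporting one batch.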
|
||
More detailed information about stellar-etl and examples can be found in the [stellar-etl](https://github.com/stellar/stellar-etl/tree/master) repo. |