Commit

Merge branch 'master' into release-horizon-v22.0.1
tamirms authored Nov 14, 2024
2 parents f09f3e4 + 7137253 commit 7940f88
Showing 2 changed files with 10 additions and 125 deletions.
2 changes: 1 addition & 1 deletion services/galexie/DEVELOPER_GUIDE.md
@@ -3,7 +3,7 @@
Galexie is a tool to export Stellar network transaction data to cloud storage in a way that is easy to access.

## Prerequisites
This document assumes that you have installed and can run Galexie, and that you have familiarity with its CLI and configuration. If not, please refer to the [Installation Guide](./README.md).
This document assumes that you have installed and can run Galexie, and that you have familiarity with its CLI and configuration. If not, please refer to the [Admin Guide](https://developers.stellar.org/docs/data/galexie/admin_guide).

## Goal
The goal of Galexie is to build an easy-to-use tool to export Stellar network ledger data to a configurable remote data store, such as cloud blob storage.
133 changes: 9 additions & 124 deletions services/galexie/README.md
@@ -1,129 +1,14 @@
## Galexie: Installation and Usage Guide
# Galexie

This guide provides step-by-step instructions on installing and using the Galexie - Ledger Exporter, a tool that exports Stellar network ledger data to a Google Cloud Storage (GCS) bucket for efficient analysis and storage.
Galexie is a simple, lightweight application that bundles Stellar network data, processes it, and writes it to an external data store.

* [Prerequisites](#prerequisites)
* [Setup](#setup)
    * [Set Up GCP Credentials](#set-up-gcp-credentials)
    * [Create a GCS Bucket for Storage](#create-a-gcs-bucket-for-storage)
* [Running Galexie](#running-galexie)
    * [Pull the Docker Image](#1-pull-the-docker-image)
    * [Configure](#2-configure-configtoml)
    * [Run](#3-run)
* [Command Line Interface (CLI)](#command-line-interface-cli)
    1. [append: Continuously Export New Data](#1-append-continuously-export-new-data)
    2. [scan-and-fill: Fill Data Gaps](#2-scan-and-fill-fill-data-gaps)
## Resources

## Prerequisites
- **Introduction to Galexie**\
Learn about Galexie’s purpose, features, and how it integrates with other systems: [Introduction to Galexie](https://developers.stellar.org/docs/data/galexie)

* **Google Cloud Platform (GCP) Account:** You will need a GCP account to create a GCS bucket for storing the exported data.
* **Docker:** Allows you to run Galexie in a self-contained environment. The official Docker installation guide: [https://docs.docker.com/engine/install/](https://docs.docker.com/engine/install/)
- **Installation and Usage Guide**\
A step-by-step guide to installing, configuring, and running Galexie: [Admin Guide](https://developers.stellar.org/docs/data/galexie/admin_guide)

## Setup

### Set Up GCP Credentials

Create application default credentials for your Google Cloud Platform (GCP) project by following these steps:
1. Download the [SDK](https://cloud.google.com/sdk/docs/install).
2. Install and initialize the [gcloud CLI](https://cloud.google.com/sdk/docs/initializing).
3. Create [application authentication credentials](https://cloud.google.com/docs/authentication/provide-credentials-adc#google-idp) and store them in a secure location on your system, such as `$HOME/.config/gcloud/application_default_credentials.json`.

For detailed instructions, refer to the [Providing Credentials for Application Default Credentials (ADC) guide.](https://cloud.google.com/docs/authentication/provide-credentials-adc)

### Create a GCS Bucket for Storage

1. Go to the GCP Console's Storage section ([https://console.cloud.google.com/storage](https://console.cloud.google.com/storage)) and create a new bucket.
2. Choose a descriptive name for the bucket, such as `stellar-ledger-data`. Refer to [Google Cloud Storage Bucket Naming Guideline](https://cloud.google.com/storage/docs/buckets#naming) for more information.
3. **Note down the bucket name** as you'll need it later in the configuration process.


## Running Galexie

### 1. Pull the Docker Image

Open a terminal window and download the Stellar Galexie Docker image using the following command:

```bash
docker pull stellar/stellar-galexie
```

### 2. Configure (config.toml)
Galexie relies on a configuration file (config.toml) to connect to your specific environment. This file defines details like:
- Your Google Cloud Storage (GCS) bucket where exported ledger data will be stored.
- Stellar network settings, such as the network you're using (testnet or pubnet).
- Datastore schema to control data organization.

A sample configuration file [config.example.toml](config.example.toml) is provided. Copy and rename it to config.toml for customization. Edit the copied file (config.toml) to replace placeholders with your specific details.

### 3. Run

The following command demonstrates how to run Galexie:

```bash
docker run --platform linux/amd64 \
-v "$HOME/.config/gcloud/application_default_credentials.json":/.config/gcp/credentials.json:ro \
-e GOOGLE_APPLICATION_CREDENTIALS=/.config/gcp/credentials.json \
-v ${PWD}/config.toml:/config.toml \
stellar/stellar-galexie <command> [options]
```

**Explanation:**

* `--platform linux/amd64`: Specifies the platform architecture (adjust if needed for your system).
* `-v`: Mounts volumes to map your local GCP credentials and config.toml file to the container:
    * `$HOME/.config/gcloud/application_default_credentials.json`: Your local GCP credentials file.
    * `${PWD}/config.toml`: Your local configuration file.
* `-e GOOGLE_APPLICATION_CREDENTIALS=/.config/gcp/credentials.json`: Sets the environment variable for credentials within the container.
* `stellar/stellar-galexie`: The Docker image name.
* `<command>`: The Stellar Galexie command: [append](#1-append-continuously-export-new-data) or [scan-and-fill](#2-scan-and-fill-fill-data-gaps).

## Command Line Interface (CLI)

Galexie offers two modes of operation for exporting ledger data:

### 1. append: Continuously Export New Data


Exports ledgers, first searching the data store from `--start` for the next absent ledger sequence number. If an absence is detected, the export range is narrowed to `--start <absent_ledger_sequence>`.
This mode requires ledgers to be present on the remote data store for some (possibly empty) prefix of the requested range and absent for the (possibly empty) remainder.

In this mode, `--end` can be provided to stop the process once the export has reached that ledger. If `--end` is absent or 0, Galexie continuously exports new ledgers as the network emits them.

It’s guaranteed that ledgers exported during `append` mode from `start` and up to the last logged ledger file `Uploaded {ledger file name}` were contiguous, meaning all ledgers within that range were exported to the data lake with no gaps or missing ledgers in between.
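
The resume behaviour described above can be sketched in a few lines of bash. This is only an illustration under simplifying assumptions: `first_absent_ledger` is a hypothetical helper, not a Galexie command, and it treats the data store as a plain list of ledger sequence numbers, ignoring how Galexie actually groups ledgers into files.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of Galexie): given a start ledger followed by
# the ledger sequence numbers already present in the data store, print the
# first sequence number at or after the start that is absent.
first_absent_ledger() {
  local start=$1
  shift
  local -a present=()   # sparse array indexed by ledger sequence number
  local seq
  for seq in "$@"; do
    present[$seq]=1
  done
  seq=$start
  # Walk forward over the contiguous prefix that is already stored.
  while [[ -n "${present[$seq]:-}" ]]; do
    seq=$((seq + 1))
  done
  echo "$seq"
}
```

For example, `first_absent_ledger 100 100 101 102` prints `103`, which is the sequence number the export would effectively start from under these assumptions. With an empty data store it prints the start ledger unchanged.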


**Usage:**

```bash
docker run --platform linux/amd64 -d \
-v "$HOME/.config/gcloud/application_default_credentials.json":/.config/gcp/credentials.json:ro \
-e GOOGLE_APPLICATION_CREDENTIALS=/.config/gcp/credentials.json \
-v ${PWD}/config.toml:/config.toml \
stellar/stellar-galexie \
append --start <start_ledger> [--end <end_ledger>] [--config-file <config_file>]
```

Arguments:
- `--start <start_ledger>` (required): The starting ledger sequence number for the export process.
- `--end <end_ledger>` (optional): The ending ledger sequence number. If omitted or set to 0, it will continuously export new ledgers as they appear on the network.
- `--config-file <config_file_path>` (optional): The path to your configuration file, containing details like GCS bucket information. If not provided, it will look for config.toml in the directory where you run the command.

### 2. scan-and-fill: Fill Data Gaps

Scans the datastore (GCS bucket) for the specified ledger range and exports any missing ledgers to the datastore. This mode avoids unnecessary exports if the data is already present. The range is specified using the `--start` and `--end` options.
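
As a rough illustration of the gap scan described above, the following bash sketch lists the ledgers in a range that are missing from a given set. `missing_ledgers` is a hypothetical helper, not a Galexie command, and the flat list of sequence numbers stands in for whatever the real datastore listing returns.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of Galexie): given a start and end ledger
# followed by the sequence numbers already present in the datastore, print
# each sequence number in [start, end] that is missing, one per line.
missing_ledgers() {
  local start=$1 end=$2
  shift 2
  local -a present=()   # sparse array indexed by ledger sequence number
  local seq
  for seq in "$@"; do
    present[$seq]=1
  done
  for ((seq = start; seq <= end; seq++)); do
    # Only ledgers that are not yet stored need to be exported.
    [[ -n "${present[$seq]:-}" ]] || echo "$seq"
  done
}
```

For example, `missing_ledgers 1 6 1 2 4 6` prints `3` and `5`. Unlike the prefix search used by `append`, this scan tolerates gaps anywhere in the range.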

**Usage:**

```bash
docker run --platform linux/amd64 -d \
-v "$HOME/.config/gcloud/application_default_credentials.json":/.config/gcp/credentials.json:ro \
-e GOOGLE_APPLICATION_CREDENTIALS=/.config/gcp/credentials.json \
-v ${PWD}/config.toml:/config.toml \
stellar/stellar-galexie \
scan-and-fill --start <start_ledger> --end <end_ledger> [--config-file <config_file>]
```

Arguments:
- `--start <start_ledger>` (required): The starting ledger sequence number in the range to export.
- `--end <end_ledger>` (required): The ending ledger sequence number in the range.
- `--config-file <config_file_path>` (optional): The path to your configuration file, containing details like GCS bucket information. If not provided, the exporter will look for config.toml in the directory where you run the command.
- **Developer Guide**\
Documentation for developers working with or contributing to Galexie: [Developer Guide](https://github.com/stellar/go/blob/master/services/galexie/DEVELOPER_GUIDE.md)
