From 3b2dad58b65654f684aa1af1904915b33439feec Mon Sep 17 00:00:00 2001 From: Shawn Reuland Date: Tue, 27 Jun 2023 13:12:03 -0700 Subject: [PATCH 01/25] #147: fixed broken mdx links and formatting --- docs/run-platform-server/ingestion.mdx | 291 +++--------------------- docs/run-platform-server/monitoring.mdx | 96 ++------ 2 files changed, 52 insertions(+), 335 deletions(-) diff --git a/docs/run-platform-server/ingestion.mdx b/docs/run-platform-server/ingestion.mdx index 7b75156fa..b05e4133c 100644 --- a/docs/run-platform-server/ingestion.mdx +++ b/docs/run-platform-server/ingestion.mdx @@ -5,26 +5,32 @@ sidebar_position: 45 import { CodeExample } from "@site/src/components/CodeExample"; -Horizon provides access to both current and historical state on the Stellar network through a process called **ingestion**. +Horizon API provides most of its utility through ingested data, and your Horizon server can be configured to listen for and ingest transaction results from the Stellar network. Ingestion enables API access to both current (e.g. someone's balance) and historical state (e.g. someone's transaction history). -Horizon provides most of its utility through ingested data, and your Horizon server can be configured to listen for and ingest transaction results from the Stellar network. Ingestion enables API access to both current (e.g. someone's balance) and historical state (e.g. someone's transaction history). +## Determine storage space + +You should think carefully about the historical timeframe of ingested data you'd like to retain on Horizon's database, the storage requirements for the entire Stellar network are substantial and grow unbounded, increasing storage used by the Horizon database well into many terabytes, and is an unsustainable deployment. Most organizations and operators only need recent fractions of the historical data, to fit their use case. To Achieve minimal storage footprint, we recommend leveraging the following: + +- [ingestion filter rules](./ingestion-filtering.mdx) based on accounts and assets related to your application space. +- temporal limits, we recommend limiting historical retention of ingested data to a sliding window of 1 month(`HISTORY_RETENTION_COUNT=518400`) and this is the [default set by Horizon](./configuring.mdx#ingestion-role-instance). This is based on metrics and trend analysis of our own API servers, we have seen most requests are for near term data. + - for access to historical data more than 1 month old, we recommend not using Horizon's database for this purpose, instead consider usage of [Stellar Hubble Data Warehouse](https://www.stellar.org/developers-blog/try-our-new-analytics-dataset?locale=en). ## Ingestion Types There are two primary ingestion use-cases for Horizon operations: - ingesting **live** data to stay up to date with the latest, real-time changes to the Stellar network, and -- ingesting **historical** data to peek how the Stellar ledger has changed over time +- ingesting **historical** data to retroactively add network data from a time range in the past to the database. ### Ingesting Live Data -Though this option is disabled by default, in this guide we've [assumed](./configuring.mdx) you turned it on. If you haven't, pass the `--ingest` flag or set `INGEST=true` in your environment. +This option is enabled by default, it is controlled with flag `INGEST`. Refer to [Configuration](./configuring.mdx) for how an instance of Horizon performs the ingestion role. 
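+
+As a rough sketch (not a complete configuration), a live ingesting instance is typically driven by environment parameters like the following; the values are placeholders and should be adjusted to your own database and environment:
+
+<CodeExample>
+
+```bash
+# minimal sketch of a live-ingestion environment; values are illustrative placeholders
+export INGEST=true
+export DATABASE_URL="postgres://postgres:secret@db.local:5432/horizon"
+export NETWORK_PASSPHRASE="Public Global Stellar Network ; September 2015"
+export HISTORY_ARCHIVE_URLS="https://s3-eu-west-1.amazonaws.com/history.stellar.org/prd/core-live/core_live_001"
+export STELLAR_CORE_BINARY_PATH=$(which stellar-core)
+```
+
+</CodeExample>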
-For a serious setup, **we highly recommend having more than one live ingesting instance**, as this makes it easier to avoid downtime during upgrades and adds resilience to your infrastructure, ensuring you always have the latest network data.
+For H/A requirements, **we highly recommend having more than one live ingesting instance**, as this makes it easier to avoid downtime during upgrades and adds resilience to your infrastructure, ensuring you always have the latest network data. Refer to [Ingestion Role Instance](./configuring.mdx#multiple-instance-deployment).

 ### Ingesting Historical Data

-Providing API access to historical data is facilitated by a Horizon subcommand:
+Import network data from a past date range into the database:

 <CodeExample>

@@ -34,29 +40,15 @@ stellar-horizon db reingest range <start> <end>

-_(The command name is a bit of a misnomer: you can use `reingest` both to ingest new ledger data and reingest old data.)_
-
-You can run this process in the background while your Horizon server is up. It will continuously decrement the `history.elder_ledger` in your `/metrics` endpoint until the `<start>` ledger is reached and the backfill is complete. If Horizon receives a request for a ledger it hasn't ingested, it returns a 503 error and clarify that it's `Still Ingesting` (see [below](#some-endpoints-are-not-available-during-state-ingestion)).
-
-#### Deciding on how much history to ingest
-
-You should think carefully about the amount of ingested data you'd like to keep around. Though the storage requirements for the entire Stellar network are substantial, **most organizations and operators only need a small fraction of the history** to fit their use case. For example,
-
-- If you just started developing a new application or service, you can probably get away with just doing live ingestion, since nothing you do requires historical data.
-
-- If you're moving an existing service away from reliance on SDF's Horizon, you likely only need history from the point at which you started using the Stellar network.
+Running any historical range of ingestion requires coordination with the data retention configuration chosen. If you have a temporal limit on history set with `HISTORY_RETENTION_COUNT=<LEDGER_COUNT>`, it makes no sense to ingest any time range older than that window, as it will be purged from the database almost as soon as it is added.

-- If you provide temporal guarantees to your users--a 6-month guarantee of transaction history like some online banks do, or history only for the last thousand ledgers (see [below](#managing-storage)), for example--then you similarly don't have heavy ingestion requirements.
+Typically, the only time you need to run historical ingestion is once when bootstrapping a system after first deployment; from that point forward, **live** ingestion keeps the database populated with the expected sliding window of trailing historical data. One exception is if you suspect a gap in the database caused by **live** ingestion being down, in which case you can run a historical ingestion range to fill that gap.

-Even a massively-popular, well-established custodial service probably doesn't need full history to service its users. It will, however, need full history to be a [Full Validator](../run-core-node/index.mdx#full-validator) with published history archives.
+You can run historical ingestion in parallel, in the background, while your main Horizon server separately performs **live** ingestion.
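+
+As a hypothetical example (assuming the default 30-day retention window of 518400 ledgers), a backfill would only target the range that retention will actually keep; the `LATEST_LEDGER` value below is a placeholder you would look up from your own instance:
+
+<CodeExample>
+
+```bash
+# illustrative only: backfill just the window covered by HISTORY_RETENTION_COUNT=518400
+LATEST_LEDGER=45000000 # placeholder, use your instance's latest ingested ledger
+stellar-horizon db reingest range $((LATEST_LEDGER - 518400)) $LATEST_LEDGER
+```
+
+</CodeExample>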
If the range specified overlaps with data already in the database, it will simply be overwritten, effectively idempotent. -#### Reingestion +#### Parallel ingestion workers -Regardless of whether you are running live ingestion or building up historical data, you may occasionally need to \_re_ingest ledgers anew (for example on certain upgrades of Horizon). For this, you use the same command as above. - -#### Parallel ingestion - -Note that historical (re)ingestion happens independently for any given ledger range, so you can reingest in parallel across multiple Horizon processes: +You can break up historical date range into slices and run each in parallel as a separate process: @@ -69,240 +61,29 @@ horizon3> stellar-horizon db reingest range 20001 30000 -#### Managing storage - -Over time, the recorded network history will grow unbounded, increasing storage used by the database. Horizon needs sufficient disk space to expand the data ingested from Stellar Core. Unless you need to maintain a [history archive](../run-core-node/publishing-history-archives.mdx), you should configure Horizon to only retain a certain number of ledgers in the database. - -This is done using the `--history-retention-count` flag or the `HISTORY_RETENTION_COUNT` environment variable. Set the value to the number of recent ledgers you wish to keep around, and every hour the Horizon subsystem will reap expired data. Alternatively, Horizon provides a command to force a collection: - - - -```bash -stellar-horizon db reap -``` - - - -### Common Issues - -Ingestion is a complicated process, so there are a number of things to look out for. - -#### Some endpoints are not available during state ingestion - -Endpoints that display state information are not available during initial state ingestion and will return a `503 Service Unavailable`/`Still Ingesting` error. An example is the `/paths` endpoint (built using offers). Such endpoints will become available after state ingestion is done (usually within a couple of minutes). - -#### State ingestion is taking a lot of time - -State ingestion shouldn't take more than a couple of minutes on an AWS `c5.xlarge` instance or equivalent. - -It's possible that the progress logs (see [below](#reading-the-logs)) will not show anything new for a longer period of time or print a lot of progress entries every few seconds. This happens because of the way history archives are designed. - -The ingestion is still working but it's processing entries of type `DEADENTRY`. If there is a lot of them in the bucket, there are no _active_ entries to process. We plan to improve the progress logs to display actual percentage progress so it's easier to estimate an ETA. - -If you see that ingestion is not proceeding for a very long period of time: +### Notes -1. Check the RAM usage on the machine. It's possible that system ran out of RAM and is using swap memory that is extremely slow. -1. If above is not the case, file a [new issue](https://github.com/stellar/go/issues/new/choose) in the [Horizon repository](https://github.com/stellar/go/tree/master/services/horizon). - -#### CPU usage goes high every few minutes - -**This is by design**. Horizon runs a state verifier routine that compares state in local storage to history archives every 64 ledgers to ensure data changes are applied correctly. If data corruption is detected, Horizon will block access to endpoints serving invalid data. 
- -We recommend keeping this security feature turned on; however, if it's causing problems (due to CPU usage) this can be disabled via the `--ingest-disable-state-verification`/`INGEST_DISABLE_STATE_VERIFICATION` parameter. - -## Ingesting Full Public Network History - -In some (albeit rare) cases, it can be convenient to (re)ingest the full Stellar Public Network history into Horizon (e.g. when running Horizon for the first time). Using multiple Captive Core workers on a high performance environment (powerful machines on which to run Horizon + a powerful database) makes this possible in ~1.5 days. - -The following instructions assume the reingestion is done on AWS. However, they should be applicable to any other environment with equivalent capacity. In the same way, the instructions can be adapted to reingest only specific parts of the history. - -### Prerequisites - -Before we begin, we make some assumptions around the environment required. Please refer to the [Prerequisites](./prerequisites.mdx) section for the current HW requirements to run Horizon reingestion for either historical catch up or real-time ingestion (for staying in sync with the ledger). A few things to keep in mind: - -1. For reingestion, the more parallel workers are provisioned to speed up the process, the larger the machine size is required in terms of RAM, CPU, IOPS and disk size. The size of the RAM per worker also increases over time (14GB RAM / worker as of mid 2022) due to the growth of the ledger. HW specs can be downsized once reingestion is completed. - -1. [Horizon](./installing.mdx) latest version installed on the machine from (1). - -1. [Core](https://github.com/stellar/stellar-core) latest version installed on the machine from (1). - -1. A Horizon database where to reingest the history. Preferably, the database should be empty to minimize storage (Postgres accumulates data during usage, which is only deleted when `VACUUM`ed) and have the minimum spec's for reingestion as outlined in [Prerequisites](./prerequisites.mdx). - -As the DB storage grows, the IO capacity will grow along with it. The number of workers (and the size of the instance created in (1), should be increased accordingly if we want to take advantage of it. To make sure we are minimizing reingestion time, we should watch write IOPS. It should ideally always be close to the theoretical limit of the DB. - -### Parallel Reingestion - -Once the prerequisites are satisfied, we can spawn two Horizon reingestion processes in parallel: - -1. One for the first 17 million ledgers (which are almost empty). -1. Another one for the rest of the history. - -This is due to first 17 million ledgers being almost empty whilst the rest are much more packed. Having a single Horizon instance with enough workers to saturate the IO capacity of the machine for the first 17 million would kill the machine when reingesting the rest (during which there is a higher CPU and memory consumption per worker). - -64 workers for (1) and 20 workers for (2) saturates instance with RAM and 15K IOPS. Again, as the DB storage grows, a larger number of workers and faster storage should be considered. 
- -In order to run the reingestion, first set the following environment variables in the [configuration](./configuring.mdx) (updating values to match your database environment, of course): - - - -```bash -export DATABASE_URL=postgres://postgres:secret@db.local:5432/horizon -export APPLY_MIGRATIONS=true -export HISTORY_ARCHIVE_URLS=https://s3-eu-west-1.amazonaws.com/history.stellar.org/prd/core-live/core_live_001 -export NETWORK_PASSPHRASE="Public Global Stellar Network ; September 2015" -export STELLAR_CORE_BINARY_PATH=$(which stellar-core) -export ENABLE_CAPTIVE_CORE_INGESTION=true -# Number of ledgers per job sent to the workers. -# The larger the job, the better performance from Captive Core's perspective, -# but, you want to choose a job size which maximizes the time all workers are -# busy. -export PARALLEL_JOB_SIZE=100000 -# Retries per job -export RETRIES=10 -export RETRY_BACKOFF_SECONDS=20 - -# Enable optional config when running captive core ingestion - -# For stellar-horizon to download buckets locally at specific location. -# If not enabled, stellar-horizon would download data in the current working directory. -# export CAPTIVE_CORE_STORAGE_PATH="/var/lib/stellar" - -# For stellar-horizon to use local disk file for ledger states rather than in memory(RAM), approximately -# 8GB of space and increasing as size of ledger entries grows over time. -# export CAPTIVE_CORE_USE_DB=true -``` - - +#### Some endpoints may report not available during **live** ingestion -(Naturally, you can also edit the configuration file at `/etc/default/stellar-horizon` directly if you installed [from a package manager](./installing.mdx#package-manager).) +Endpoints that display current state information from **live** ingestion may return `503 Service Unavailable`/`Still Ingesting` error. An example is the `/paths` endpoint (built using offers). Such endpoints will become available after **live** ingestion has finished network synchronization and catch up(usually within a couple of minutes). -If Horizon was previously running, first ensure it is stopped. Then, run the following commands in parallel: - -1. `stellar-horizon db reingest range --parallel-workers=64 1 16999999` -1. `stellar-horizon db reingest range --parallel-workers=20 17000000 ` - -(Where you can find `` under [SDF Horizon's](https://horizon.stellar.org/) `core_latest_ledger` field.) - -When saturating a database instance with 15K IOPS capacity: - -(1) should take a few hours to complete. - -(2) should take about 3 days to complete. - -Although there is a retry mechanism, reingestion may fail half-way. Horizon will print the recommended range to use in order to restart it. - -When reingestion is complete it's worth running `ANALYZE VERBOSE [table]` on all tables to recalculate the stats. This should improve the query speed. - -### Monitoring reingestion process - -This script should help monitor the reingestion process by printing the ledger subranges being reingested: - - - -```bash -#!/bin/bash -echo "Current ledger ranges being reingested:" -echo -I=1 -for S in $(ps aux | grep stellar-core | grep catchup | awk '{print $15}' | sort -n); do - printf '%15s' $S - if [ $(( I % 5 )) = 0 ]; then - echo - fi - I=$(( I + 1)) -done -``` - - - -Ideally we would be using Prometheus metrics for this, but they haven't been implemented yet. 
- -Here is an example run: - - - -``` -Current ledger ranges being reingested: - 99968/99968 199936/99968 299904/99968 399872/99968 499840/99968 - 599808/99968 699776/99968 799744/99968 899712/99968 999680/99968 - 1099648/99968 1199616/99968 1299584/99968 1399552/99968 1499520/99968 - 1599488/99968 1699456/99968 1799424/99968 1899392/99968 1999360/99968 - 2099328/99968 2199296/99968 2299264/99968 2399232/99968 2499200/99968 - 2599168/99968 2699136/99968 2799104/99968 2899072/99968 2999040/99968 - 3099008/99968 3198976/99968 3298944/99968 3398912/99968 3498880/99968 - 3598848/99968 3698816/99968 3798784/99968 3898752/99968 3998720/99968 - 4098688/99968 4198656/99968 4298624/99968 4398592/99968 4498560/99968 - 4598528/99968 4698496/99968 4798464/99968 4898432/99968 4998400/99968 - 5098368/99968 5198336/99968 5298304/99968 5398272/99968 5498240/99968 - 5598208/99968 5698176/99968 5798144/99968 5898112/99968 5998080/99968 - 6098048/99968 6198016/99968 6297984/99968 6397952/99968 17099967/99968 - 17199935/99968 17299903/99968 17399871/99968 17499839/99968 17599807/99968 - 17699775/99968 17799743/99968 17899711/99968 17999679/99968 18099647/99968 - 18199615/99968 18299583/99968 18399551/99968 18499519/99968 18599487/99968 - 18699455/99968 18799423/99968 18899391/99968 18999359/99968 19099327/99968 - 19199295/99968 19299263/99968 19399231/99968 -``` - - - -## Reading Logs - -In order to check the progress and status of ingestion you should check your logs regularly; all logs related to ingestion are tagged with `service=ingest`. - -It starts with informing you about state ingestion: - - - -``` -INFO[...] Starting ingestion system from empty state... pid=5965 service=ingest temp_set="*io.MemoryTempSet" -INFO[...] Reading from History Archive Snapshot pid=5965 service=ingest ledger=25565887 -``` - - - -During state ingestion, Horizon will log the number of processed entries every 100,000 entries (there are currently around 10M entries in the public network): - - - -``` -INFO[...] Processing entries from History Archive Snapshot ledger=25565887 numEntries=100000 pid=5965 service=ingest -INFO[...] Processing entries from History Archive Snapshot ledger=25565887 numEntries=200000 pid=5965 service=ingest -INFO[...] Processing entries from History Archive Snapshot ledger=25565887 numEntries=300000 pid=5965 service=ingest -INFO[...] Processing entries from History Archive Snapshot ledger=25565887 numEntries=400000 pid=5965 service=ingest -INFO[...] Processing entries from History Archive Snapshot ledger=25565887 numEntries=500000 pid=5965 service=ingest -``` - - - -When state ingestion is finished, it will proceed to ledger ingestion starting from the next ledger after the checkpoint ledger (25565887+1 in this example) to update the state using transaction metadata: - - - -``` -INFO[...] Processing entries from History Archive Snapshot ledger=25565887 numEntries=5400000 pid=5965 service=ingest -INFO[...] Processing entries from History Archive Snapshot ledger=25565887 numEntries=5500000 pid=5965 service=ingest -INFO[...] Processed ledger ledger=25565887 pid=5965 service=ingest type=state_pipeline -INFO[...] Finished processing History Archive Snapshot duration=2145.337575904 ledger=25565887 numEntries=5529931 pid=5965 service=ingest shutdown=false -INFO[...] Reading new ledger ledger=25565888 pid=5965 service=ingest -INFO[...] Processing ledger ledger=25565888 pid=5965 service=ingest type=ledger_pipeline updating_database=true -INFO[...] 
Processed ledger ledger=25565888 pid=5965 service=ingest type=ledger_pipeline -INFO[...] Finished processing ledger duration=0.086024492 ledger=25565888 pid=5965 service=ingest shutdown=false transactions=14 -INFO[...] Reading new ledger ledger=25565889 pid=5965 service=ingest -INFO[...] Processing ledger ledger=25565889 pid=5965 service=ingest type=ledger_pipeline updating_database=true -INFO[...] Processed ledger ledger=25565889 pid=5965 service=ingest type=ledger_pipeline -INFO[...] Finished processing ledger duration=0.06619956 ledger=25565889 pid=5965 service=ingest shutdown=false transactions=29 -INFO[...] Reading new ledger ledger=25565890 pid=5965 service=ingest -INFO[...] Processing ledger ledger=25565890 pid=5965 service=ingest type=ledger_pipeline updating_database=true -INFO[...] Processed ledger ledger=25565890 pid=5965 service=ingest type=ledger_pipeline -INFO[...] Finished processing ledger duration=0.071039012 ledger=25565890 pid=5965 service=ingest shutdown=false transactions=20 -``` - - +#### If more than five minutes has elapsed with no new ingested data: -## Managing Stale Historical Data +- Verify host machine meets recommended [Prerequisites](./prerequisites.mdx). -Horizon ingests ledger data from a managed, pared-down Captive Stellar Core instance. In the event that Captive Core crashes, lags, or if Horizon stops ingesting data for any other reason, the view provided by Horizon will start to lag behind reality. For simpler applications, this may be fine, but in many cases this lag is unacceptable and the application should not continue operating until the lag is resolved. +- Check horizon log output. + - Are there many `level=error` messages, maybe an environmental issue, access to database is lost, etc. + - **live** ingestion will emit two key log lines about once every 5 seconds based on latest ledger emitted from network. Tail the horizon log output and grep for presence of these lines with a filter: + ``` + tail -f horizon.log | | grep -E 'Processed ledger|Closed ledger' + ``` + If you don't see output from this pipeline every couple of seconds for a new ledger then ingestion is not proceeding, look at full logs and see if any alternative messages are printing reasons to the contrary. May see lines mentioning 'catching up' When connecting to pubnet, as it can take up to 5 minutes for the captive core process started by Horizon to catch up to pubnet network. + - Check RAM usage on the machine. It's possible that system ran low on RAM and is using swap memory which will result in slow performance. + - Verify the read/write throughput speeds on the volume that current working directory for horizon process is using. Volume should have at least 10mb/s, one way to roughly verify this on host command line: + ``` + sudo dd if=/dev/zero of=/tmp/test_speed.img bs=1G count=1 + ``` -To help applications that cannot tolerate lag, Horizon provides a configurable "staleness" threshold. If enough lag accumulates to surpass this threshold (expressed in number of ledgers), Horizon will only respond with an error: [`stale_history`](https://github.com/stellar/go/blob/master/services/horizon/internal/docs/reference/errors/stale-history.md). To configure this option, use the `--history-stale-threshold`/`HISTORY_STALE_THRESHOLD` parameter. +#### Monitoring ingestion process -**Note:** Non-historical requests (such as submitting transactions or checking account balances) will not error out if the staleness threshold is surpassed. 
+For high availability deployments, it is recommended to implement monitoring of ingestion process for visibility on performance/health. Refer to [Monitoring](./monitoring.mdx) for accessing logs and metrics from horizon. Stellar publishes the example [Horizon Grafana Dashboard](https://grafana.com/grafana/dashboards/13793-stellar-horizon/) which demonstrates queries against key horizon ingestion metrics, specifically look at the `Local Ingestion Delay` and `Last Ledger Age` in the `Health Summary` panel. diff --git a/docs/run-platform-server/monitoring.mdx b/docs/run-platform-server/monitoring.mdx index 582961d32..de75b3645 100644 --- a/docs/run-platform-server/monitoring.mdx +++ b/docs/run-platform-server/monitoring.mdx @@ -5,85 +5,15 @@ sidebar_position: 60 import { CodeExample } from "@site/src/components/CodeExample"; -To ensure that your instance of Horizon is performing correctly, we encourage you to monitor it and provide both logs and metrics to do so. +To ensure that your instance of Horizon is performing correctly, we encourage you to monitor it and provide logs and metrics to do so. ## Metrics -Metrics are collected while a Horizon process is running and they are exposed _privately_ via the `/metrics` path, accessible only through the Horizon admin port. You need to configure this via `--admin-port` or `ADMIN_PORT`, since it's disabled by default. If you're running such an instance locally, you can access this endpoint: - - - -``` -$ stellar-horizon --admin-port=4200 & -$ curl localhost:4200/metrics -# HELP go_gc_duration_seconds A summary of the GC invocation durations. -# TYPE go_gc_duration_seconds summary -go_gc_duration_seconds{quantile="0"} 1.665e-05 -go_gc_duration_seconds{quantile="0.25"} 2.1889e-05 -go_gc_duration_seconds{quantile="0.5"} 2.4062e-05 -go_gc_duration_seconds{quantile="0.75"} 3.4226e-05 -go_gc_duration_seconds{quantile="1"} 0.001294239 -go_gc_duration_seconds_sum 0.002469679 -go_gc_duration_seconds_count 25 -# HELP go_goroutines Number of goroutines that currently exist. -# TYPE go_goroutines gauge -go_goroutines 23 -and so on... -``` - - +Metrics are emitted from a running Horizon process and exposed _privately_ via the `/metrics` path, accessible only through the Horizon admin port bound to host machine. You need to add environment configuration parameter `ADMIN_PORT=XXXXX` , since it's disabled by default, then metrics are published on the host machine as `localhost:\metrics`. -## Logs +### Queries -Horizon will output logs to standard out. Information about what requests are coming in will be reported, but more importantly, warnings or errors will also be emitted by default. A correctly running Horizon instance will not output any warning or error log entries. - -Below we present a few standard log entries with associated fields. You can use them to build metrics and alerts. Please note that these represent Horizon app metrics only. You should also monitor your hardware metrics like CPU or RAM Utilization. 
- -### Starting HTTP request - -| Key | Value | -| --- | --- | -| **`msg`** | **`Starting request`** | -| `client_name` | Value of `X-Client-Name` HTTP header representing client name | -| `client_version` | Value of `X-Client-Version` HTTP header representing client version | -| `app_name` | Value of `X-App-Name` HTTP header representing app name | -| `app_version` | Value of `X-App-Version` HTTP header representing app version | -| `forwarded_ip` | First value of `X-Forwarded-For` header | -| `host` | Value of `Host` header | -| `ip` | IP of a client sending HTTP request | -| `ip_port` | IP and port of a client sending HTTP request | -| `method` | HTTP method (`GET`, `POST`, ...) | -| `path` | Full request path, including query string (ex. `/transactions?order=desc`) | -| `streaming` | Boolean, `true` if request is a streaming request | -| `referer` | Value of `Referer` header | -| `req` | Random value that uniquely identifies a request, attached to all logs within this HTTP request | - -### Finished HTTP request - -| Key | Value | -| --- | --- | -| **`msg`** | **`Finished request`** | -| `bytes` | Number of response bytes sent | -| `client_name` | Value of `X-Client-Name` HTTP header representing client name | -| `client_version` | Value of `X-Client-Version` HTTP header representing client version | -| `app_name` | Value of `X-App-Name` HTTP header representing app name | -| `app_version` | Value of `X-App-Version` HTTP header representing app version | -| `duration` | Duration of request in seconds | -| `forwarded_ip` | First value of `X-Forwarded-For` header | -| `host` | Value of `Host` header | -| `ip` | IP of a client sending HTTP request | -| `ip_port` | IP and port of a client sending HTTP request | -| `method` | HTTP method (`GET`, `POST`, ...) | -| `path` | Full request path, including query string (ex. `/transactions?order=desc`) | -| `route` | Route pattern without query string (ex. `/accounts/{id}`) | -| `status` | HTTP status code (ex. `200`) | -| `streaming` | Boolean, `true` if request is a streaming request | -| `referer` | Value of `Referer` header | -| `req` | Random value that uniquely identifies a request, attached to all logs within this HTTP request | - -### Metrics - -Using the entries above you can build metrics that will help understand performance of a given Horizon node. For example: +Build queries that highlight performance of a given Horizon deployment. Refer to Stellar's [Grafana Horizon Dashboard](https://grafana.com/grafana/dashboards/13793-stellar-horizon/) for examples of metrics queries to derive application performance: - Number of requests per minute. - Number of requests per route (the most popular routes). @@ -99,15 +29,21 @@ Using the entries above you can build metrics that will help understand performa ### Alerts -Below are example alerts with potential causes and solutions. Feel free to add more alerts using your metrics: +Once queries are developed on a Grafana dashboard, it enables convenient follow-on step to add [alert rules](https://grafana.com/docs/grafana/latest/alerting/alerting-rules/create-grafana-managed-rule/) based on specific queries to trigger notifications when thresholds are exceeded. + +Here are some example alerts to consider with potential causes and solutions. 
| Alert | Cause | Solution | | --- | --- | --- | -| Spike in number of requests | Potential DoS attack | Lower rate-limiting threshold | -| Large number of rate-limited requests | Rate-limiting threshold too low | Increase rate-limiting threshold | -| Ingestion is slow | Horizon server spec too low | Increase hardware spec | -| Spike in average response time of a single route | Possible bug in a code responsible for rendering a route | Report an issue in Horizon repository. | +| Spike in number of requests | Potential DoS attack | network load balance or content switch configurations | +| Ingestion is slow | host server compute resources are low | increase compute specs | + +## Logs + +Horizon will output logs to standard out. Information about what http requests and ingestion will be emitted. Typically, there are very few `warn` or `error` severity level messages emitted. The default severity level logged in Horizon is configured to `LOG_LEVEL=info`, this environment configuration parameter can be set to one of(increasing order of severity, decreasing order of verbosity) `trace, debug, info, warn, error`. + +For production deployments, we recommend using the default severity at `info` level and redirecting the standard out to a file and apply a log rotation tool on the file such as [logrotate](https://man7.org/linux/man-pages/man8/logrotate.8.html) to manage disk space usage. ## I'm Stuck! Help! -If any of the above steps don't work or you are otherwise prevented from correctly setting up Horizon, please join our community and let us know. Either post a question at [our Stack Exchange](https://stellar.stackexchange.com/) or chat with us on [Keybase in #dev_discussion](https://keybase.io/team/stellar.public) to ask for help. +If any of the above steps don't work or you are otherwise prevented from correctly setting up Horizon, please join our community and let us know. Either post a question at [our Stack Exchange](https://stellar.stackexchange.com/) or chat with us on [Horizon Discord](https://discord.com/channels/897514728459468821/912466080960766012) to ask for help. From 619152ba317088dde81374504a8b53568468a17f Mon Sep 17 00:00:00 2001 From: Shawn Reuland Date: Tue, 27 Jun 2023 15:05:08 -0700 Subject: [PATCH 02/25] #147: first pass at revamp of ingestion filtering docs --- .../ingestion-filtering.mdx | 100 ++++-------------- docs/run-platform-server/ingestion.mdx | 2 +- 2 files changed, 24 insertions(+), 78 deletions(-) diff --git a/docs/run-platform-server/ingestion-filtering.mdx b/docs/run-platform-server/ingestion-filtering.mdx index 6e330246f..ad6c7ee0c 100644 --- a/docs/run-platform-server/ingestion-filtering.mdx +++ b/docs/run-platform-server/ingestion-filtering.mdx @@ -3,85 +3,46 @@ title: Ingestion Filtering order: 46 --- -The Ingestion Filtering feature is now released for public beta testing available from Horizon [version 2.18.0](https://github.com/stellar/go/releases/tag/horizon-v2.18.0) and up. - ## Overview -Ingestion Filtering enables Horizon operators to drastically reduce storage footprint of their Horizon DB by whitelisting Assets and/or Accounts that are relevant to their operations. This feature is ideally suited for private Horizon operators who do not need full history for all assets and accounts on the Stellar network. +Ingestion Filtering enables Horizon operators to drastically reduce storage footprint of the historical data on Horizon DB by whitelisting Assets and/or Accounts that are relevant to their operations. 
### Why is it useful:

-Previously, the only way to limit data storage is by limiting the amount of history Horizon ingests, either by configuring the starting ledger to be later than genesis block or via rolling retention (ie: last 30 days). This feature allows users to store the full history of assets and accounts (and related entities) that they care about.
+Previously, the only way to limit data storage was by limiting the temporal range of history via rolling retention (ie: last 30 days). The filtering feature allows users to store a longer historical timeframe on the Horizon database for only whitelisted assets, accounts and their related historical entities (transactions, operations, trades, etc.).

-For further context, running a full history Horizon instance currently takes ~ 15TB of disk space (as of June 2022) with storage growing at a rate of ~ 1TB / month. As a benchmark, filtering by even 100 of the most active accounts and assets reduces storage by over 90%. For the majority of users who care about an even more limited set of assets and accounts, storage savings should be well over 99%. Other benefits are reducing operating costs for maintaining storage, improved DB health metrics and query performance.
+For further context, running a non-filtered `full` history Horizon instance currently takes ~ 25TB of disk space (as of June 2023), with storage growing at a rate of ~ 1TB / month. As a benchmark, filtering by even 100 of the most active accounts and assets reduces storage by over 90%. For the majority of applications, which are interested in an even more limited set of assets and accounts, storage savings should be well over 99%. Other benefits are reduced operating costs for maintaining storage, improved DB health metrics, and better query performance.

### How does it work:

-This feature provides an ability to select which ledger transactions are accepted at ingestion time to be stored in Horizon’s historical database. Filter whitelists are maintained via an admin REST API (and persisted in the DB). The ingestion process checks the list and persists transactions related to Accounts and Assets that are whitelisted. Note that the feature does not filter the current state of the ledger and related DB tables, only history tables.
+This feature operates by accepting only the ledger transactions that match filter rule criteria when persisting transactions and operations to historical tables in the Horizon database at ingestion time; the rest that don't match are skipped and not stored in the database.
+
+The feature does not filter the storage of current network ledger state in the Horizon database, as that is required for referential integrity of entity identity in the SQL database overall. However, DB storage for current state is far smaller than the historical footprint and does not grow unbounded; it is on the order of a few GBs total.

-Whitelisting can include the following supported entities:
+Filter rules can whitelist ingestion by the following supported entities:

- Account id
- Asset id (canonical)

-Given that all transactions related to the white listed entities are included, all historical time series data related to those transactions are saved in horizon's history db as well. For example, whitelisting an Asset will also persist all Accounts that interact with that Asset and vice versa, if an Account is whitelisted, all assets that are held by that Account will also be included.
+Given that all transactions related to the white listed entities are included, all historical time series data related to those transactions are saved in horizon's history db, including transaction itself, all operations in the transaction and references to any ancillary entities from operations. ## Configuration: -The filters and their configuration are optional features and must be enabled with horizon command line or environmental parameters: - -``` -admin-port=[your_choice] -``` - -and - -``` -exp-enable-ingestion-filtering=true -``` - -As Environment properties: - -``` -ADMIN_PORT= -``` - -and - -``` -EXP-ENABLE-INGESTION-FILTERING=True -``` - -These should be included in addition to the standard ingestion parameters that must be set also to enable the ingestion engine to be running, such as `ingest=true`, etc. Once these flags are included at horizon runtime, filter configurations and their rules are initially empty and the filters are disabled by default. To enable filters, update the configuration settings, refer to the Admin API Docs which are published as Open API 3.0 doc on the Admin Port at `http://localhost:/`. You can paste the contents from that url into any OAPI tool such as [Swagger](https://editor.swagger.io/) which will render a visual explorer of the API endpoints. Follow details and examples for endpoints: - -``` -/ingestion/filters/account -/ingestion/filters/asset -``` +Filtering is enabled by default, however, no filter rules are included by default, meaning effectively no filtering of ingested data happens by default. To start filtering ingestion: -## Operation: +- enable Horizon admin port with environmental configuration parameter `ADMIN_PORT=XXXXX`, this will allow you to access the port. +- define filter whitelists. submit Admin HTTP API requests to view and update the filter rules: -Adding and Removing Entities can be done by submitting PUT requests to the `http://localhost:/` endpoint. + Refer to the [Horizon Admin API Docs](https://github.com/stellar/go/blob/horizon-v2.18.0/services/horizon/internal/httpx/static/admin_oapi.yml) which are also published on Horizon instance as Open API 3.0 doc on the Admin Port at `http://localhost:/`. You can paste the contents from that url into any OAPI tool such as [Swagger](https://editor.swagger.io/) which will render a visual explorer of the API endpoints. Follow details and example request/response payloads for these filter rule endpoints: -To add new filtered entities, submit an `HTTP PUT` request to the admin API endpoints for either Asset or Account filters. The PUT request body will be JSON that expresses the filter rules, currently the rules model is a whitelist format and expressed as JSON string array. To remove entities, submit an `HTTP PUT` request to update the list accordingly. To retrieve what is currently configured, submit an `HTTP GET` request. + ``` + /ingestion/filters/account + /ingestion/filters/asset + ``` -The OAPI doc published by the Admin Server can be pulled directly from the Github repo [here](https://github.com/stellar/go/blob/horizon-v2.18.0/services/horizon/internal/httpx/static/admin_oapi.yml). +### Gap fill on filtered historical data: -### Reverting Options: - -1. Disable both Asset and Account Filter config rules via the [Admin API](https://github.com/stellar/go/blob/master/services/horizon/internal/httpx/static/admin_oapi.yml) by setting `enabled=false` in each filter rule, or set `--exp-enable_ingestion_filtering=false`, this will open up forward ingestion to include all data again. 
It is then your choice whether to run a Re-ingestion to capture older data from past that would have been dropped by filters but could now be re-imported with filters off, e.g. `horizon db reingest ` - -2. If you have a DB backup: - -- restore the DB -- run a Reingestion Gap Fill command to fill in the gaps to current tip of the chain -- resume Ingestion Sync - -3. Start over with a fresh DB (or see Patching Historical Data below) - -### Patching Historical Data: - -If new Assets or Accounts are added to the whitelist and you would like to patch in its missing historical data, Reingestion can be run. The Reingestion process is idempotent and will re-ingest the data from the designated ledger range and overwrite or insert new data if not already on current DB. +If new Assets or Accounts are added to the whitelist rules and you would like to pull in its missing historical data, which would have been dropped earlie, reingestion can be run. The Reingestion process is idempotent and will re-ingest the data from the designated historical ledger range and `upsert` to Horizon historical data, i.e. overwrite or insert new data if not already on current database. ## Sample Use Case: @@ -97,29 +58,14 @@ I would like to store the full history of all transactions related from the gene ### Pre-requisites: -You have an existing Horizon installed, configured and has forward ingestion enabled at a minimum to be able to successfully sync to the current state of the Stellar network. Bonus if you are familiar with running re-ingestion. - -Steps: - -1. Configure 4 whitelisted Assets via the Admin API. Also check the `HISTORY_RETENTION_COUNT` and set it to `0` if you don’t want any history purged anymore now that you are filtering, otherwise it will continue to reap all data older than the retention. - -2. Decide if you want to wipe existing history data on the DB first before the filtering starts running, you can effectively clear the history by running - -``` -HISTORY_RETENTION_COUNT=1 stellar-horizon db reap -``` - -or drop/create the db and run `stellar-horizon db init`. +You have installed Horizon with empty database, it has **live** ingestion enabled. -Alternatively, if you do not need to free up old history tables, you can effectively stop here, anytime changes or enablement of filter rules are done, the history tables will immediately reflect filtered data per those latest rules from the time the filter config is updated and forward. +### Steps: -3. If starting with a fresh DB, decide if you want to re-run ingestion from the earliest ledger # related to the whitelisted entities to populate history for just the allowed data from filters. +1. Configure a filter rule with 4 whitelisted Assets via the Admin API. Also check the `HISTORY_RETENTION_COUNT` and set it to `0` if you don’t want any history purged anymore now that you are filtering, otherwise it will continue to reap all data older than the retention. -- Tip: To find this ledger number, you can check for the earliest transaction of the Account issuing that asset. -- Also consider running parallel workers to speed up the process. +2. If you do not need prior historical data to the present time, you can effectively stop here, anytime changes or enablement of filter rules are done, the history tables will immediately reflect filtered data per those latest rules from the time the filter rule is updated and onward. -4. 
Optional: When re-ingestion is finished, run an ingestion gap fill `stellar-horizon db fill-gaps` to fill any gaps that may have been missed. +3. Perform a separate historical [reingestion](ingestion.mdx#ingesting-historical-data) specifying a range with the earliest ledger # in network history that you want retained for the whitelisted entities. -5. Verify that your data is there -- Do a spot check of Accounts that should be automatically be ingested against a full history Horizon instance such as SDF Horizon diff --git a/docs/run-platform-server/ingestion.mdx b/docs/run-platform-server/ingestion.mdx index b05e4133c..c871b8a85 100644 --- a/docs/run-platform-server/ingestion.mdx +++ b/docs/run-platform-server/ingestion.mdx @@ -12,7 +12,7 @@ Horizon API provides most of its utility through ingested data, and your Horizon You should think carefully about the historical timeframe of ingested data you'd like to retain on Horizon's database, the storage requirements for the entire Stellar network are substantial and grow unbounded, increasing storage used by the Horizon database well into many terabytes, and is an unsustainable deployment. Most organizations and operators only need recent fractions of the historical data, to fit their use case. To Achieve minimal storage footprint, we recommend leveraging the following: - [ingestion filter rules](./ingestion-filtering.mdx) based on accounts and assets related to your application space. -- temporal limits, we recommend limiting historical retention of ingested data to a sliding window of 1 month(`HISTORY_RETENTION_COUNT=518400`) and this is the [default set by Horizon](./configuring.mdx#ingestion-role-instance). This is based on metrics and trend analysis of our own API servers, we have seen most requests are for near term data. +- temporal limits, if no filter rules are applied, we recommend limiting historical retention of ingested data to a sliding window of 1 month(`HISTORY_RETENTION_COUNT=518400`) and this is the [default set by Horizon](./configuring.mdx#ingestion-role-instance). This is based on metrics and trend analysis of our own API servers, we have seen most requests are for near term data. - for access to historical data more than 1 month old, we recommend not using Horizon's database for this purpose, instead consider usage of [Stellar Hubble Data Warehouse](https://www.stellar.org/developers-blog/try-our-new-analytics-dataset?locale=en). ## Ingestion Types From 8465fa1f9d72be3433628a8ccecdd4b2d4282f2c Mon Sep 17 00:00:00 2001 From: Shawn Reuland Date: Tue, 27 Jun 2023 15:22:47 -0700 Subject: [PATCH 03/25] #147: format the ingestion docs --- docs/run-platform-server/ingestion-filtering.mdx | 10 ++++------ 1 file changed, 4 insertions(+), 6 deletions(-) diff --git a/docs/run-platform-server/ingestion-filtering.mdx b/docs/run-platform-server/ingestion-filtering.mdx index ad6c7ee0c..69310bf86 100644 --- a/docs/run-platform-server/ingestion-filtering.mdx +++ b/docs/run-platform-server/ingestion-filtering.mdx @@ -5,7 +5,7 @@ order: 46 ## Overview -Ingestion Filtering enables Horizon operators to drastically reduce storage footprint of the historical data on Horizon DB by whitelisting Assets and/or Accounts that are relevant to their operations. +Ingestion Filtering enables Horizon operators to drastically reduce storage footprint of the historical data on Horizon DB by whitelisting Assets and/or Accounts that are relevant to their operations. 
### Why is it useful: @@ -15,7 +15,7 @@ For further context, running a non-filtered `full` history Horizon instance curr ### How does it work: -This feature operates by accepting only ledger transactions that match to a filter rule criteria when persisting the transactions and operations to historical tables on Horizon database at ingestion time, the rest that done't match are skipped and not stored on database. +This feature operates by accepting only ledger transactions that match to a filter rule criteria when persisting the transactions and operations to historical tables on Horizon database at ingestion time, the rest that done't match are skipped and not stored on database. The feature does not filter the storage of current network ledger state in Horizon database as that is required for referential integrity of entity identity in sql database overall, however db storage for current state is relatively extremely lower than historical concerns and does not increase exponentially, in the order of a few GB's total. @@ -24,7 +24,7 @@ Filter rules can whitelist ingestion by the following supported entities: - Account id - Asset id (canonical) -Given that all transactions related to the white listed entities are included, all historical time series data related to those transactions are saved in horizon's history db, including transaction itself, all operations in the transaction and references to any ancillary entities from operations. +Given that all transactions related to the white listed entities are included, all historical time series data related to those transactions are saved in horizon's history db, including transaction itself, all operations in the transaction and references to any ancillary entities from operations. ## Configuration: @@ -58,7 +58,7 @@ I would like to store the full history of all transactions related from the gene ### Pre-requisites: -You have installed Horizon with empty database, it has **live** ingestion enabled. +You have installed Horizon with empty database, it has **live** ingestion enabled. ### Steps: @@ -67,5 +67,3 @@ You have installed Horizon with empty database, it has **live** ingestion enable 2. If you do not need prior historical data to the present time, you can effectively stop here, anytime changes or enablement of filter rules are done, the history tables will immediately reflect filtered data per those latest rules from the time the filter rule is updated and onward. 3. Perform a separate historical [reingestion](ingestion.mdx#ingesting-historical-data) specifying a range with the earliest ledger # in network history that you want retained for the whitelisted entities. - - From 862c3bbc7255bf9aaea222334c55a3c4380cb3b9 Mon Sep 17 00:00:00 2001 From: Shawn Reuland Date: Wed, 28 Jun 2023 10:53:20 -0700 Subject: [PATCH 04/25] #147: updated reasoning of live mode ingestion --- docs/run-platform-server/ingestion.mdx | 35 +++++++++++++------------ docs/run-platform-server/monitoring.mdx | 4 +-- 2 files changed, 19 insertions(+), 20 deletions(-) diff --git a/docs/run-platform-server/ingestion.mdx b/docs/run-platform-server/ingestion.mdx index c871b8a85..f12c352ef 100644 --- a/docs/run-platform-server/ingestion.mdx +++ b/docs/run-platform-server/ingestion.mdx @@ -7,24 +7,25 @@ import { CodeExample } from "@site/src/components/CodeExample"; Horizon API provides most of its utility through ingested data, and your Horizon server can be configured to listen for and ingest transaction results from the Stellar network. 
Ingestion enables API access to both current (e.g. someone's balance) and historical state (e.g. someone's transaction history). -## Determine storage space - -You should think carefully about the historical timeframe of ingested data you'd like to retain on Horizon's database, the storage requirements for the entire Stellar network are substantial and grow unbounded, increasing storage used by the Horizon database well into many terabytes, and is an unsustainable deployment. Most organizations and operators only need recent fractions of the historical data, to fit their use case. To Achieve minimal storage footprint, we recommend leveraging the following: - -- [ingestion filter rules](./ingestion-filtering.mdx) based on accounts and assets related to your application space. -- temporal limits, if no filter rules are applied, we recommend limiting historical retention of ingested data to a sliding window of 1 month(`HISTORY_RETENTION_COUNT=518400`) and this is the [default set by Horizon](./configuring.mdx#ingestion-role-instance). This is based on metrics and trend analysis of our own API servers, we have seen most requests are for near term data. - - for access to historical data more than 1 month old, we recommend not using Horizon's database for this purpose, instead consider usage of [Stellar Hubble Data Warehouse](https://www.stellar.org/developers-blog/try-our-new-analytics-dataset?locale=en). - ## Ingestion Types There are two primary ingestion use-cases for Horizon operations: -- ingesting **live** data to stay up to date with the latest, real-time changes to the Stellar network, and +- ingesting **live** data to stay up to date with the latest ledgers from the network, accumulating a sliding window of aged ledgers, and - ingesting **historical** data to retroactively add network data from a time range in the past to the database. +## Determine storage space + +You should think carefully about the historical timeframe of ingested data you'd like to retain on Horizon's database. The storage requirements for every transaction from Stellar network are substantial and grow unbounded, increasing storage used by the Horizon database well into many terabytes and is an unsustainable deployment. Most organizations and operators only need a recent fraction of the historical data, to fit their use case. To Achieve minimal storage footprint, we recommend the following best practices: + +- use **live** ingestion. only use and depend on **historical** ingestion in limited exceptional cases. +- [ingestion filter rules](./ingestion-filtering.mdx) based on accounts and assets related to your application space. +- temporal limits, if no filter rules are applied, we recommend limiting historical retention of ingested data to a sliding window of 1 month(`HISTORY_RETENTION_COUNT=518400`) and this is the [default set by Horizon](./configuring.mdx#ingestion-role-instance). This is based on metrics and trend analysis of our own API servers, we have seen most requests are for near term data. + - if not using filter rules and want to access historical data more than 1 month old, we recommend not using Horizon ingestion and its database for this purpose, instead consider usage of [Stellar Hubble Data Warehouse](https://www.stellar.org/developers-blog/try-our-new-analytics-dataset?locale=en). + ### Ingesting Live Data -This option is enabled by default, it is controlled with flag `INGEST`. Refer to [Configuration](./configuring.mdx) for how an instance of Horizon performs the ingestion role. 
+This option is enabled by default and is the recommended mode of ingestion to run. It is controlled with environent configuration flag `INGEST`. Refer to [Configuration](./configuring.mdx) for how an instance of Horizon performs the ingestion role. For a H/A requirements, **we highly recommend having more than one live ingesting instance**, as this makes it easier to avoid downtime during upgrades and adds resilience to your infrastructure, ensuring you always have the latest network data, refer to [Ingestion Role Instance](./configuring.mdx#multiple-instance-deployment) @@ -74,15 +75,15 @@ Endpoints that display current state information from **live** ingestion may ret - Check horizon log output. - Are there many `level=error` messages, maybe an environmental issue, access to database is lost, etc. - **live** ingestion will emit two key log lines about once every 5 seconds based on latest ledger emitted from network. Tail the horizon log output and grep for presence of these lines with a filter: - ``` - tail -f horizon.log | | grep -E 'Processed ledger|Closed ledger' - ``` - If you don't see output from this pipeline every couple of seconds for a new ledger then ingestion is not proceeding, look at full logs and see if any alternative messages are printing reasons to the contrary. May see lines mentioning 'catching up' When connecting to pubnet, as it can take up to 5 minutes for the captive core process started by Horizon to catch up to pubnet network. + ``` + tail -f horizon.log | | grep -E 'Processed ledger|Closed ledger' + ``` + If you don't see output from this pipeline every couple of seconds for a new ledger then ingestion is not proceeding, look at full logs and see if any alternative messages are printing reasons to the contrary. May see lines mentioning 'catching up' When connecting to pubnet, as it can take up to 5 minutes for the captive core process started by Horizon to catch up to pubnet network. - Check RAM usage on the machine. It's possible that system ran low on RAM and is using swap memory which will result in slow performance. - Verify the read/write throughput speeds on the volume that current working directory for horizon process is using. Volume should have at least 10mb/s, one way to roughly verify this on host command line: - ``` - sudo dd if=/dev/zero of=/tmp/test_speed.img bs=1G count=1 - ``` + ``` + sudo dd if=/dev/zero of=/tmp/test_speed.img bs=1G count=1 + ``` #### Monitoring ingestion process diff --git a/docs/run-platform-server/monitoring.mdx b/docs/run-platform-server/monitoring.mdx index de75b3645..35d25a77b 100644 --- a/docs/run-platform-server/monitoring.mdx +++ b/docs/run-platform-server/monitoring.mdx @@ -5,8 +5,6 @@ sidebar_position: 60 import { CodeExample } from "@site/src/components/CodeExample"; -To ensure that your instance of Horizon is performing correctly, we encourage you to monitor it and provide logs and metrics to do so. - ## Metrics Metrics are emitted from a running Horizon process and exposed _privately_ via the `/metrics` path, accessible only through the Horizon admin port bound to host machine. You need to add environment configuration parameter `ADMIN_PORT=XXXXX` , since it's disabled by default, then metrics are published on the host machine as `localhost:\metrics`. @@ -40,7 +38,7 @@ Here are some example alerts to consider with potential causes and solutions. ## Logs -Horizon will output logs to standard out. Information about what http requests and ingestion will be emitted. 
Typically, there are very few `warn` or `error` severity level messages emitted. The default severity level logged in Horizon is configured to `LOG_LEVEL=info`, this environment configuration parameter can be set to one of(increasing order of severity, decreasing order of verbosity) `trace, debug, info, warn, error`. +Horizon will output logs to standard out. It will log on all aspects of runtime including http requests and ingestion. Typically, there are very few `warn` or `error` severity level messages emitted. The default severity level logged in Horizon is configured to `LOG_LEVEL=info`, this environment configuration parameter can be set to one of `trace, debug, info, warn, error`. The verbosity of log output is inverse of the severity level chosen. I.e. for most verbose logs use 'trace', for least verbose logs use 'error'. For production deployments, we recommend using the default severity at `info` level and redirecting the standard out to a file and apply a log rotation tool on the file such as [logrotate](https://man7.org/linux/man-pages/man8/logrotate.8.html) to manage disk space usage. From fe43a5ff095e8c827ff0de1c90d347d0d574ba45 Mon Sep 17 00:00:00 2001 From: Shawn Reuland Date: Wed, 28 Jun 2023 11:13:20 -0700 Subject: [PATCH 05/25] #147: fix incomplete sentences --- docs/run-platform-server/ingestion.mdx | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/docs/run-platform-server/ingestion.mdx b/docs/run-platform-server/ingestion.mdx index f12c352ef..6f67ac536 100644 --- a/docs/run-platform-server/ingestion.mdx +++ b/docs/run-platform-server/ingestion.mdx @@ -19,8 +19,8 @@ There are two primary ingestion use-cases for Horizon operations: You should think carefully about the historical timeframe of ingested data you'd like to retain on Horizon's database. The storage requirements for every transaction from Stellar network are substantial and grow unbounded, increasing storage used by the Horizon database well into many terabytes and is an unsustainable deployment. Most organizations and operators only need a recent fraction of the historical data, to fit their use case. To Achieve minimal storage footprint, we recommend the following best practices: - use **live** ingestion. only use and depend on **historical** ingestion in limited exceptional cases. -- [ingestion filter rules](./ingestion-filtering.mdx) based on accounts and assets related to your application space. -- temporal limits, if no filter rules are applied, we recommend limiting historical retention of ingested data to a sliding window of 1 month(`HISTORY_RETENTION_COUNT=518400`) and this is the [default set by Horizon](./configuring.mdx#ingestion-role-instance). This is based on metrics and trend analysis of our own API servers, we have seen most requests are for near term data. +- use [ingestion filter rules](./ingestion-filtering.mdx) based on accounts and assets related to your application space. +- use temporal limits, if no filter rules are applied, we recommend limiting historical retention of ingested data to a sliding window of 1 month(`HISTORY_RETENTION_COUNT=518400`) and this is the [default set by Horizon](./configuring.mdx#ingestion-role-instance). This is based on metrics and trend analysis of our own API servers, we have seen most requests are for near term data. 
- if not using filter rules and want to access historical data more than 1 month old, we recommend not using Horizon ingestion and its database for this purpose, instead consider usage of [Stellar Hubble Data Warehouse](https://www.stellar.org/developers-blog/try-our-new-analytics-dataset?locale=en). ### Ingesting Live Data From dac497ff7b8e58e7080063f363214a68cb90afcd Mon Sep 17 00:00:00 2001 From: Shawn Reuland Date: Wed, 28 Jun 2023 11:26:04 -0700 Subject: [PATCH 06/25] #174: refer to prereqs more --- docs/run-platform-server/ingestion.mdx | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/docs/run-platform-server/ingestion.mdx b/docs/run-platform-server/ingestion.mdx index 6f67ac536..016a4879a 100644 --- a/docs/run-platform-server/ingestion.mdx +++ b/docs/run-platform-server/ingestion.mdx @@ -79,12 +79,12 @@ Endpoints that display current state information from **live** ingestion may ret tail -f horizon.log | | grep -E 'Processed ledger|Closed ledger' ``` If you don't see output from this pipeline every couple of seconds for a new ledger then ingestion is not proceeding, look at full logs and see if any alternative messages are printing reasons to the contrary. May see lines mentioning 'catching up' When connecting to pubnet, as it can take up to 5 minutes for the captive core process started by Horizon to catch up to pubnet network. - - Check RAM usage on the machine. It's possible that system ran low on RAM and is using swap memory which will result in slow performance. - - Verify the read/write throughput speeds on the volume that current working directory for horizon process is using. Volume should have at least 10mb/s, one way to roughly verify this on host command line: + - Check RAM usage on the machine, it's possible that system ran low on RAM and is using swap memory which will result in slow performance. Verify host machine meets minimum RAM [prerequisites](./prerequisites.mdx). + - Verify the read/write throughput speeds on the volume that current working directory for horizon process is using. Based on [prerequisites](./prerequisites.mdx), volume should have at least 10mb/s, one way to roughly verify this on host machine(linux/mac) command line: ``` sudo dd if=/dev/zero of=/tmp/test_speed.img bs=1G count=1 ``` #### Monitoring ingestion process -For high availability deployments, it is recommended to implement monitoring of ingestion process for visibility on performance/health. Refer to [Monitoring](./monitoring.mdx) for accessing logs and metrics from horizon. Stellar publishes the example [Horizon Grafana Dashboard](https://grafana.com/grafana/dashboards/13793-stellar-horizon/) which demonstrates queries against key horizon ingestion metrics, specifically look at the `Local Ingestion Delay` and `Last Ledger Age` in the `Health Summary` panel. +For high availability deployments, it is recommended to implement monitoring of ingestion process for visibility on performance/health. Refer to [Monitoring](./monitoring.mdx) for accessing logs and metrics from horizon. Stellar publishes the example [Horizon Grafana Dashboard](https://grafana.com/grafana/dashboards/13793-stellar-horizon/) which demonstrates queries against key horizon ingestion metrics, specifically look at the `Local Ingestion Delay [Ledgers]` and `Last ledger age` in the `Health Summary` panel. 
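+
+As a quick spot check outside of a dashboard, you can also pull the ingestion-related metrics straight from the admin metrics endpoint. This is only a sketch: it assumes the admin port was configured as `ADMIN_PORT=4200` (use whichever port you chose), and the exact metric names can differ between Horizon versions, so adjust the grep pattern to whatever your `/metrics` output actually contains:
+
+```
+curl -s http://localhost:4200/metrics | grep -i ingest
+```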
From c76ce07cd27b56534fe284200f04a1b93bd290f8 Mon Sep 17 00:00:00 2001 From: Shawn Reuland Date: Wed, 28 Jun 2023 14:15:00 -0700 Subject: [PATCH 07/25] #147: add distinction between dev and prod installs --- docs/run-platform-server/installing.mdx | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/docs/run-platform-server/installing.mdx b/docs/run-platform-server/installing.mdx index c5c96519c..34201489a 100644 --- a/docs/run-platform-server/installing.mdx +++ b/docs/run-platform-server/installing.mdx @@ -5,14 +5,16 @@ sidebar_position: 20 import { CodeExample } from "@site/src/components/CodeExample"; -To install Horizon, you have choices, we recommend the following for target infrastructure: +To install Horizon in production or non-development environments, we recommend the following based on target infrastructure: - bare-metal: - if host is debian linux, install prebuilt binaries [from repositories](#package-manager) using package manager. - for any other hosts, download [prebuilt release binaries](#prebuilt-releases) of Stellar Horizon and Core for host target architecture and operation system. - containerized: - non-Helm Chart, if the target envrironment for container to run does not support Helm chart usage, run the prebuilt docker image of Horizon published on [dockerhub.com/stellar/horizon](https://hub.docker.com/r/stellar/stellar-horizon). - - Helm charts, when the target envrionment uses container orchestration such as Kubernetes and has enabled Helm Charts on cluster. The Horizon Helm chart manages installation life cycle. Use the [Helm Install command](https://helm.sh/docs/helm/helm_install/), it will accept Horizon's configuration parameters. Please review [Configuration](./configuring.mdx) first, to identify any specific configuration params needed. + - Helm charts, when the target envrionment uses container orchestration such as Kubernetes and has enabled Helm Charts on cluster. The [Horizon Helm chart](https://github.com/stellar/helm-charts/tree/main/charts/horizon) manages installation life cycle. Use the [Helm Install command](https://helm.sh/docs/helm/helm_install/), it will accept Horizon's configuration parameters. Please review [Configuration](./configuring.mdx) first, to identify any specific configuration params needed. + +For installation in development environments, please refer to the [Horizon README](https://github.com/stellar/go/blob/master/services/horizon/README.md#try-it-out) from the source code repo for options to use in development context. ### Notes From ab95b7727dcb45dc2f30b0a2f0a4b945c5764409 Mon Sep 17 00:00:00 2001 From: Shawn Reuland Date: Wed, 28 Jun 2023 14:15:51 -0700 Subject: [PATCH 08/25] #147: fix formatting --- docs/run-platform-server/installing.mdx | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/run-platform-server/installing.mdx b/docs/run-platform-server/installing.mdx index 34201489a..5cf40f059 100644 --- a/docs/run-platform-server/installing.mdx +++ b/docs/run-platform-server/installing.mdx @@ -14,7 +14,7 @@ To install Horizon in production or non-development environments, we recommend t - non-Helm Chart, if the target envrironment for container to run does not support Helm chart usage, run the prebuilt docker image of Horizon published on [dockerhub.com/stellar/horizon](https://hub.docker.com/r/stellar/stellar-horizon). - Helm charts, when the target envrionment uses container orchestration such as Kubernetes and has enabled Helm Charts on cluster. 
The [Horizon Helm chart](https://github.com/stellar/helm-charts/tree/main/charts/horizon) manages installation life cycle. Use the [Helm Install command](https://helm.sh/docs/helm/helm_install/), it will accept Horizon's configuration parameters. Please review [Configuration](./configuring.mdx) first, to identify any specific configuration params needed. -For installation in development environments, please refer to the [Horizon README](https://github.com/stellar/go/blob/master/services/horizon/README.md#try-it-out) from the source code repo for options to use in development context. +For installation in development environments, please refer to the [Horizon README](https://github.com/stellar/go/blob/master/services/horizon/README.md#try-it-out) from the source code repo for options to use in development context. ### Notes From a71b61c232c217ab45fcfacc2a7ffe2a62f0c1d0 Mon Sep 17 00:00:00 2001 From: shawn Date: Wed, 28 Jun 2023 15:51:10 -0700 Subject: [PATCH 09/25] filtering rephrase for default behavior Co-authored-by: George --- docs/run-platform-server/ingestion-filtering.mdx | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/run-platform-server/ingestion-filtering.mdx b/docs/run-platform-server/ingestion-filtering.mdx index 69310bf86..f4f123360 100644 --- a/docs/run-platform-server/ingestion-filtering.mdx +++ b/docs/run-platform-server/ingestion-filtering.mdx @@ -28,7 +28,7 @@ Given that all transactions related to the white listed entities are included, a ## Configuration: -Filtering is enabled by default, however, no filter rules are included by default, meaning effectively no filtering of ingested data happens by default. To start filtering ingestion: +Filtering is enabled by default but with no filter rules, which effectively means no filtering of ingested data occurs. To start filtering ingestion: - enable Horizon admin port with environmental configuration parameter `ADMIN_PORT=XXXXX`, this will allow you to access the port. - define filter whitelists. submit Admin HTTP API requests to view and update the filter rules: From 948494fff26bec22a0b4b01f636c26f393722265 Mon Sep 17 00:00:00 2001 From: Shawn Reuland Date: Fri, 30 Jun 2023 13:44:17 -0700 Subject: [PATCH 10/25] #147: review feedback, update data warehouse link --- docs/run-platform-server/ingestion.mdx | 14 ++++++++------ 1 file changed, 8 insertions(+), 6 deletions(-) diff --git a/docs/run-platform-server/ingestion.mdx b/docs/run-platform-server/ingestion.mdx index 016a4879a..3bf0bc8e8 100644 --- a/docs/run-platform-server/ingestion.mdx +++ b/docs/run-platform-server/ingestion.mdx @@ -16,18 +16,20 @@ There are two primary ingestion use-cases for Horizon operations: ## Determine storage space -You should think carefully about the historical timeframe of ingested data you'd like to retain on Horizon's database. The storage requirements for every transaction from Stellar network are substantial and grow unbounded, increasing storage used by the Horizon database well into many terabytes and is an unsustainable deployment. Most organizations and operators only need a recent fraction of the historical data, to fit their use case. To Achieve minimal storage footprint, we recommend the following best practices: +You should think carefully about the historical timeframe of ingested data you'd like to retain in Horizon's database. The storage requirements for transactions on the Stellar network are substantial and are growing unbounded over time. 
This is something that you may need to continually monitor and reevaluate as the network continues to grow. We have found that most organizations need only a small fraction of recent historical data to satisfy their use cases. Through analyzing traffic patterns on SDF's Horizon instance, we see that most requests are for very recent data. -- use **live** ingestion. only use and depend on **historical** ingestion in limited exceptional cases. -- use [ingestion filter rules](./ingestion-filtering.mdx) based on accounts and assets related to your application space. -- use temporal limits, if no filter rules are applied, we recommend limiting historical retention of ingested data to a sliding window of 1 month(`HISTORY_RETENTION_COUNT=518400`) and this is the [default set by Horizon](./configuring.mdx#ingestion-role-instance). This is based on metrics and trend analysis of our own API servers, we have seen most requests are for near term data. - - if not using filter rules and want to access historical data more than 1 month old, we recommend not using Horizon ingestion and its database for this purpose, instead consider usage of [Stellar Hubble Data Warehouse](https://www.stellar.org/developers-blog/try-our-new-analytics-dataset?locale=en). +To keep your storage footprint small, we recommend the following: + +- use **live** ingestion only use and depend on **historical** ingestion in limited exceptional cases +- if your application requires access to all network data, no filtering can be done, we recommend limiting historical retention of ingested data to a sliding window of 1 month (HISTORY_RETENTION_COUNT=518400) which is default set by Horizon +- if your application can work on a filtered network dataset based on specific accounts and assets, then we recommend applying ingestion filter rules. When using filter rules, it provides benefit of choice in longer historical retention timeframe since the filtering is reducing the overall database size to such a degree, historical retention(`HISTORY_RETENTION_COUNT`) can be set in terms of years rather than months or even disabled(`HISTORY_RETENTION_COUNT=0`) +- if you cannot limit your history retention window to 30 days and cannot use filter rules, we recommend considering [Stellar Hubble Data Warehouse](https://developers.stellar.org/docs/accessing-data/overview) for any historical data ### Ingesting Live Data This option is enabled by default and is the recommended mode of ingestion to run. It is controlled with environent configuration flag `INGEST`. Refer to [Configuration](./configuring.mdx) for how an instance of Horizon performs the ingestion role. 
-For a H/A requirements, **we highly recommend having more than one live ingesting instance**, as this makes it easier to avoid downtime during upgrades and adds resilience to your infrastructure, ensuring you always have the latest network data, refer to [Ingestion Role Instance](./configuring.mdx#multiple-instance-deployment) +For a high availability requirements, **we recommend deploying more than one live ingesting instance**, as this makes it easier to avoid downtime during upgrades and adds resilience, ensuring you always have the latest network data, refer to [Ingestion Role Instance](./configuring.mdx#multiple-instance-deployment) ### Ingesting Historical Data From 23d0485b22dec86918bd381484c9ac76ef077c29 Mon Sep 17 00:00:00 2001 From: shawn Date: Fri, 30 Jun 2023 14:34:37 -0700 Subject: [PATCH 11/25] remove trailing 'and' condition Co-authored-by: Molly Karcher --- docs/run-platform-server/ingestion.mdx | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/docs/run-platform-server/ingestion.mdx b/docs/run-platform-server/ingestion.mdx index 3bf0bc8e8..9087a81e6 100644 --- a/docs/run-platform-server/ingestion.mdx +++ b/docs/run-platform-server/ingestion.mdx @@ -11,8 +11,8 @@ Horizon API provides most of its utility through ingested data, and your Horizon There are two primary ingestion use-cases for Horizon operations: -- ingesting **live** data to stay up to date with the latest ledgers from the network, accumulating a sliding window of aged ledgers, and -- ingesting **historical** data to retroactively add network data from a time range in the past to the database. +- ingesting **live** data to stay up to date with the latest ledgers from the network, accumulating a sliding window of aged ledgers +- ingesting **historical** data to retroactively add network data from a time range in the past to the database ## Determine storage space From 7e96c11eba9f9a28cd1ab18286eafd8216067ffb Mon Sep 17 00:00:00 2001 From: shawn Date: Fri, 30 Jun 2023 14:35:14 -0700 Subject: [PATCH 12/25] qualify current and previous as states Co-authored-by: Molly Karcher --- docs/run-platform-server/ingestion.mdx | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/run-platform-server/ingestion.mdx b/docs/run-platform-server/ingestion.mdx index 9087a81e6..c2c92a4f6 100644 --- a/docs/run-platform-server/ingestion.mdx +++ b/docs/run-platform-server/ingestion.mdx @@ -5,7 +5,7 @@ sidebar_position: 45 import { CodeExample } from "@site/src/components/CodeExample"; -Horizon API provides most of its utility through ingested data, and your Horizon server can be configured to listen for and ingest transaction results from the Stellar network. Ingestion enables API access to both current (e.g. someone's balance) and historical state (e.g. someone's transaction history). +Horizon API provides most of its utility through ingested data, and your Horizon server can be configured to listen for and ingest transaction results from the Stellar network. Ingestion enables API access to both current state (e.g. someone's balance) and historical state (e.g. someone's transaction history). 
## Ingestion Types From d4c6a9aafa2ae7714518842c1599f743288cbd70 Mon Sep 17 00:00:00 2001 From: Shawn Reuland Date: Fri, 30 Jun 2023 14:43:32 -0700 Subject: [PATCH 13/25] don't mention full history in terms of filtering usage --- docs/run-platform-server/ingestion-filtering.mdx | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/run-platform-server/ingestion-filtering.mdx b/docs/run-platform-server/ingestion-filtering.mdx index f4f123360..206232b2a 100644 --- a/docs/run-platform-server/ingestion-filtering.mdx +++ b/docs/run-platform-server/ingestion-filtering.mdx @@ -62,7 +62,7 @@ You have installed Horizon with empty database, it has **live** ingestion enable ### Steps: -1. Configure a filter rule with 4 whitelisted Assets via the Admin API. Also check the `HISTORY_RETENTION_COUNT` and set it to `0` if you don’t want any history purged anymore now that you are filtering, otherwise it will continue to reap all data older than the retention. +1. Configure a filter rule with 4 whitelisted Assets via the Admin API. 2. If you do not need prior historical data to the present time, you can effectively stop here, anytime changes or enablement of filter rules are done, the history tables will immediately reflect filtered data per those latest rules from the time the filter rule is updated and onward. From b8ac739b3b891599944a5493692d6df50e11ae2e Mon Sep 17 00:00:00 2001 From: shawn Date: Fri, 30 Jun 2023 14:46:11 -0700 Subject: [PATCH 14/25] review feedback, less verbose on tech description of filtering Co-authored-by: Molly Karcher --- docs/run-platform-server/ingestion-filtering.mdx | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/run-platform-server/ingestion-filtering.mdx b/docs/run-platform-server/ingestion-filtering.mdx index 206232b2a..1bd9a84fd 100644 --- a/docs/run-platform-server/ingestion-filtering.mdx +++ b/docs/run-platform-server/ingestion-filtering.mdx @@ -17,7 +17,7 @@ For further context, running a non-filtered `full` history Horizon instance curr This feature operates by accepting only ledger transactions that match to a filter rule criteria when persisting the transactions and operations to historical tables on Horizon database at ingestion time, the rest that done't match are skipped and not stored on database. -The feature does not filter the storage of current network ledger state in Horizon database as that is required for referential integrity of entity identity in sql database overall, however db storage for current state is relatively extremely lower than historical concerns and does not increase exponentially, in the order of a few GB's total. +Note that this filtering applies only to historical data, and does not affect current state data stored in Horizon. However, current state data consumes a relatively small amount of the overall storage capacity. 
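+
+As a concrete illustration, a single filter rule is just a whitelist plus an enabled flag. The sketch below mirrors the asset payload shape accepted by the admin filter endpoints described under Configuration; the asset shown is reused from the Admin API example in this guide and should be treated as a placeholder for your own `CODE:ISSUER` values:
+
+```
+{
+  "whitelist": ["USDC:GAFRNZHK4DGH6CSF4HB5EBKK6KARUOVWEI2Y2OIC5NSQ4UBSN4DR456U"],
+  "enabled": true
+}
+```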
Filter rules can whitelist ingestion by the following supported entities: From 7f3553bb9c0f5667d8d9aeda841b3f9c454eebc6 Mon Sep 17 00:00:00 2001 From: Shawn Reuland Date: Fri, 30 Jun 2023 14:51:54 -0700 Subject: [PATCH 15/25] #147: minor change of filtering database retention --- docs/run-platform-server/ingestion-filtering.mdx | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/docs/run-platform-server/ingestion-filtering.mdx b/docs/run-platform-server/ingestion-filtering.mdx index 1bd9a84fd..75bcabd9b 100644 --- a/docs/run-platform-server/ingestion-filtering.mdx +++ b/docs/run-platform-server/ingestion-filtering.mdx @@ -5,7 +5,7 @@ order: 46 ## Overview -Ingestion Filtering enables Horizon operators to drastically reduce storage footprint of the historical data on Horizon DB by whitelisting Assets and/or Accounts that are relevant to their operations. +Ingestion Filtering enables Horizon operators to drastically reduce the storage footprint of the historical data in the Horizon database by whitelisting Assets and/or Accounts that are relevant to their operations. ### Why is it useful: @@ -62,7 +62,7 @@ You have installed Horizon with empty database, it has **live** ingestion enable ### Steps: -1. Configure a filter rule with 4 whitelisted Assets via the Admin API. +1. Configure a filter rule with 4 whitelisted Assets via the Admin API. 2. If you do not need prior historical data to the present time, you can effectively stop here, anytime changes or enablement of filter rules are done, the history tables will immediately reflect filtered data per those latest rules from the time the filter rule is updated and onward. From 6bc60a7267cf8b881d73bc133a0e794407cca7bb Mon Sep 17 00:00:00 2001 From: shawn Date: Fri, 30 Jun 2023 14:53:40 -0700 Subject: [PATCH 16/25] review feedback, clean up description of filtering benefits Co-authored-by: George --- docs/run-platform-server/ingestion-filtering.mdx | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/run-platform-server/ingestion-filtering.mdx b/docs/run-platform-server/ingestion-filtering.mdx index 75bcabd9b..697ce0a65 100644 --- a/docs/run-platform-server/ingestion-filtering.mdx +++ b/docs/run-platform-server/ingestion-filtering.mdx @@ -9,7 +9,7 @@ Ingestion Filtering enables Horizon operators to drastically reduce the storage ### Why is it useful: -Previously, the only way to limit data storage is by limiting the temporaral range of history via rolling retention (ie: last 30 days). Filtering feature allows users to store a longer historical timeframe on the Horizon database for only whitelisted assets, accounts and their related historical entities(transactions, operations, trades, etc) +Previously, the only way to limit data storage was by limiting the temporal range of history via rolling retention (e.g. the last 30 days). The filtering feature allows users to store a longer historical timeframe in the Horizon database for only whitelisted assets, accounts, and their related historical entities (transactions, operations, trades, etc.). For further context, running a non-filtered `full` history Horizon instance currently takes ~ 25TB of disk space (as of June 2023) with storage growing at a rate of ~ 1TB / month. As a benchmark, filtering by even 100 of the most active accounts and assets reduces storage by over 90%. For the majority of applications which are interested in an even more limited set of assets and accounts, storage savings should be well over 99%. 
Other benefits are reducing operating costs for maintaining storage, improved DB health metrics and query performance. From 60cf5cb85b4609610b6e5bab229c8f26453ec609 Mon Sep 17 00:00:00 2001 From: shawn Date: Fri, 30 Jun 2023 14:55:27 -0700 Subject: [PATCH 17/25] review feedback, commas before `and` conditions Co-authored-by: George --- docs/run-platform-server/ingestion-filtering.mdx | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/run-platform-server/ingestion-filtering.mdx b/docs/run-platform-server/ingestion-filtering.mdx index 697ce0a65..65ad1bc02 100644 --- a/docs/run-platform-server/ingestion-filtering.mdx +++ b/docs/run-platform-server/ingestion-filtering.mdx @@ -24,7 +24,7 @@ Filter rules can whitelist ingestion by the following supported entities: - Account id - Asset id (canonical) -Given that all transactions related to the white listed entities are included, all historical time series data related to those transactions are saved in horizon's history db, including transaction itself, all operations in the transaction and references to any ancillary entities from operations. +Given that all transactions related to the white listed entities are included, all historical time series data related to those transactions are saved in horizon's history db, including transaction itself, all operations in the transaction, and references to any ancillary entities from operations. ## Configuration: From ad0a041a489be49b9238674b2e67827d86a02b95 Mon Sep 17 00:00:00 2001 From: shawn Date: Fri, 30 Jun 2023 14:56:19 -0700 Subject: [PATCH 18/25] review feedback, fix typos Co-authored-by: George --- docs/run-platform-server/ingestion-filtering.mdx | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/run-platform-server/ingestion-filtering.mdx b/docs/run-platform-server/ingestion-filtering.mdx index 65ad1bc02..7d3907c30 100644 --- a/docs/run-platform-server/ingestion-filtering.mdx +++ b/docs/run-platform-server/ingestion-filtering.mdx @@ -15,7 +15,7 @@ For further context, running a non-filtered `full` history Horizon instance curr ### How does it work: -This feature operates by accepting only ledger transactions that match to a filter rule criteria when persisting the transactions and operations to historical tables on Horizon database at ingestion time, the rest that done't match are skipped and not stored on database. +This feature operates by accepting only ledger transactions that match a filter rule when persisting the transactions and operations to historical tables in the Horizon database at ingestion time, any entries that don't match are skipped and not stored on database. Note that this filtering applies only to historical data, and does not affect current state data stored in Horizon. However, current state data consumes a relatively small amount of the overall storage capacity. 
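+
+To confirm which rules are currently in effect on a running instance, the filter endpoints described under Configuration can also be read back. This is a sketch only: it assumes the admin port was enabled as `ADMIN_PORT=4200` and that the endpoints support read requests as documented in the Horizon Admin API docs:
+
+```
+curl -s http://localhost:4200/ingestion/filters/account
+curl -s http://localhost:4200/ingestion/filters/asset
+```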
From 98809a719f97d2787562a7acf84b93080d2715ff Mon Sep 17 00:00:00 2001 From: Shawn Reuland Date: Fri, 30 Jun 2023 14:59:24 -0700 Subject: [PATCH 19/25] #147: review feedback, update the oapi link for admin port to 'master' --- docs/run-platform-server/ingestion-filtering.mdx | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/run-platform-server/ingestion-filtering.mdx b/docs/run-platform-server/ingestion-filtering.mdx index 7d3907c30..2d90972ba 100644 --- a/docs/run-platform-server/ingestion-filtering.mdx +++ b/docs/run-platform-server/ingestion-filtering.mdx @@ -33,7 +33,7 @@ Filtering is enabled by default but with no filter rules, which effectively mean - enable Horizon admin port with environmental configuration parameter `ADMIN_PORT=XXXXX`, this will allow you to access the port. - define filter whitelists. submit Admin HTTP API requests to view and update the filter rules: - Refer to the [Horizon Admin API Docs](https://github.com/stellar/go/blob/horizon-v2.18.0/services/horizon/internal/httpx/static/admin_oapi.yml) which are also published on Horizon instance as Open API 3.0 doc on the Admin Port at `http://localhost:/`. You can paste the contents from that url into any OAPI tool such as [Swagger](https://editor.swagger.io/) which will render a visual explorer of the API endpoints. Follow details and example request/response payloads for these filter rule endpoints: + Refer to the [Horizon Admin API Docs](https://github.com/stellar/go/blob/master/services/horizon/internal/httpx/static/admin_oapi.yml) which are also published on Horizon instance as Open API 3.0 doc on the Admin Port at `http://localhost:/`. You can paste the contents from that url into any OAPI tool such as [Swagger](https://editor.swagger.io/) which will render a visual explorer of the API endpoints. Follow details and example request/response payloads for these filter rule endpoints: ``` /ingestion/filters/account From ec433c13eb3039ce639e76e86be99e68b28de763 Mon Sep 17 00:00:00 2001 From: shawn Date: Fri, 30 Jun 2023 15:01:10 -0700 Subject: [PATCH 20/25] review feedback, correct pronoun usage Co-authored-by: George --- docs/run-platform-server/ingestion-filtering.mdx | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/run-platform-server/ingestion-filtering.mdx b/docs/run-platform-server/ingestion-filtering.mdx index 2d90972ba..67c4e177f 100644 --- a/docs/run-platform-server/ingestion-filtering.mdx +++ b/docs/run-platform-server/ingestion-filtering.mdx @@ -42,7 +42,7 @@ Filtering is enabled by default but with no filter rules, which effectively mean ### Gap fill on filtered historical data: -If new Assets or Accounts are added to the whitelist rules and you would like to pull in its missing historical data, which would have been dropped earlie, reingestion can be run. The Reingestion process is idempotent and will re-ingest the data from the designated historical ledger range and `upsert` to Horizon historical data, i.e. overwrite or insert new data if not already on current database. +If new Assets or Accounts are added to the whitelist rules and you would like to pull in their missing historical data which would have been dropped earlier, you need to run reingestion. The Reingestion process is idempotent and will re-ingest the data from the designated historical ledger range and `upsert` to Horizon historical data, i.e. overwrite or insert new data not already in the current database. 
## Sample Use Case: From 606aa87bd0e7998e90bf5ce7c5beb7eeb389edd9 Mon Sep 17 00:00:00 2001 From: shawn Date: Fri, 30 Jun 2023 15:02:55 -0700 Subject: [PATCH 21/25] review feedback, minor punctuation change Co-authored-by: George --- docs/run-platform-server/ingestion-filtering.mdx | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/run-platform-server/ingestion-filtering.mdx b/docs/run-platform-server/ingestion-filtering.mdx index 67c4e177f..b436416ee 100644 --- a/docs/run-platform-server/ingestion-filtering.mdx +++ b/docs/run-platform-server/ingestion-filtering.mdx @@ -58,7 +58,7 @@ I would like to store the full history of all transactions related from the gene ### Pre-requisites: -You have installed Horizon with empty database, it has **live** ingestion enabled. +You have installed Horizon with empty database and it has **live** ingestion enabled. ### Steps: From c645b5673ff5a412955b55d1e3433d01db376b98 Mon Sep 17 00:00:00 2001 From: Shawn Reuland Date: Mon, 10 Jul 2023 17:02:34 -0700 Subject: [PATCH 22/25] #147: use allow listing term --- docs/run-platform-server/ingestion-filtering.mdx | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/run-platform-server/ingestion-filtering.mdx b/docs/run-platform-server/ingestion-filtering.mdx index 2d90972ba..60a80efcd 100644 --- a/docs/run-platform-server/ingestion-filtering.mdx +++ b/docs/run-platform-server/ingestion-filtering.mdx @@ -5,7 +5,7 @@ order: 46 ## Overview -Ingestion Filtering enables Horizon operators to drastically reduce the storage footprint of the historical data in the Horizon database by whitelisting Assets and/or Accounts that are relevant to their operations. +Ingestion Filtering enables Horizon operators to drastically reduce the storage footprint of the historical data in the Horizon database by allow-listing Assets and/or Accounts that are relevant to their operations. ### Why is it useful: From d4ae7d84711f4d5cd049461249136651f73cb019 Mon Sep 17 00:00:00 2001 From: Shawn Reuland Date: Mon, 10 Jul 2023 17:51:53 -0700 Subject: [PATCH 23/25] #147: add more content on filter behavior --- docs/run-platform-server/ingestion-filtering.mdx | 15 +++++++++++---- 1 file changed, 11 insertions(+), 4 deletions(-) diff --git a/docs/run-platform-server/ingestion-filtering.mdx b/docs/run-platform-server/ingestion-filtering.mdx index fcd6cdebc..566b9e938 100644 --- a/docs/run-platform-server/ingestion-filtering.mdx +++ b/docs/run-platform-server/ingestion-filtering.mdx @@ -11,15 +11,22 @@ Ingestion Filtering enables Horizon operators to drastically reduce the storage Previously, the only way to limit data storage was by limiting the temporal range of history via rolling retention (e.g. the last 30 days). The filtering feature allows users to store a longer historical timeframe in the Horizon database for only whitelisted assets, accounts, and their related historical entities (transactions, operations, trades, etc.). -For further context, running a non-filtered `full` history Horizon instance currently takes ~ 25TB of disk space (as of June 2023) with storage growing at a rate of ~ 1TB / month. As a benchmark, filtering by even 100 of the most active accounts and assets reduces storage by over 90%. For the majority of applications which are interested in an even more limited set of assets and accounts, storage savings should be well over 99%. Other benefits are reducing operating costs for maintaining storage, improved DB health metrics and query performance. 
+For further context, running an unfiltered `full` history Horizon instance currently requires over 30TB of disk space (as of June 2023) with storage growing at a rate of about 1TB/month. As a benchmark, filtering by even 100 of the most active accounts and assets reduces storage by over 90%. For the majority of applications which are interested in an even more limited set of assets and accounts, storage savings should be well over 99%. Other benefits include reducing operating costs for maintaining storage, improved DB health metrics and query performance. ### How does it work: -This feature operates by accepting only ledger transactions that match a filter rule when persisting the transactions and operations to historical tables in the Horizon database at ingestion time, any entries that don't match are skipped and not stored on database. +Filtering feature operates during the ingestion process, **live** or **prior historical ranges**. It tells ingestion process to only accept incoming ledger transactions which match on a filter rule, any transactions which don't match on filter rules are skipped by ingestion and therefore not stored on database. -Note that this filtering applies only to historical data, and does not affect current state data stored in Horizon. However, current state data consumes a relatively small amount of the overall storage capacity. +Some key aspects to note about filtering behavior: -Filter rules can whitelist ingestion by the following supported entities: +- Filtering applies only to ingestion of historical data in the database, it does not affect how ingestion process maintains current state data stored in database, which is the last known ledger entry for each unique entity within accounts, trustlines, liquidity pools, offers. However, current state data consumes a relatively small amount of the overall storage capacity. +- When filter rules are changed, they only apply to active ingestion processes(**live** or **historical ranges**). They don't trigger any retro-active filtering or back-filling of existing historical data on the database. + - If you update the filter rules to increase allow-listing of accounts or assets, related transactions will only start to show up in historical database data from **live** ingestion beginning after time the filter rule is updated using the Horizon Admin API. Same applies to **historical range** ingestion, it will only be affected by new filter rules starting at current ledger it was processing within it's configured range at time the filter rules were updated. + - When updating filter rules with increased allow list coverage, no historical back-filling is done automatically. You can manually backfill the history on database by running a new **historical range** ingestion process for a past ledger range after you have updated the filter rules to achieve that result. + - If you update filter rules and reduce the allow list coverage by removing some entities, no retro-active purging or filtering of historical data per the reduced scope of filter rules on database is performed. Whatever data is stored on history tables resides for lifetime of database or until `HISTORY_RETENTION_COUNT` is exceeded, and Horizon will purge all historical data for all entites related to older ledgers regardless of any filtering rules. +- Filtering will not affect the performance or throughput rate of an ingestion process, it will remain consistent whether filter rules are present or not. 
+
+Filter rules define allow-lists for the following supported entities:

 - Account id
 - Asset id (canonical)

From a12d1394b8f670192aae67b14513c7ab4b51b349 Mon Sep 17 00:00:00 2001
From: Shawn Reuland
Date: Tue, 11 Jul 2023 11:08:00 -0700
Subject: [PATCH 24/25] #147: added examples of filter rules api updates

---
 .../ingestion-filtering.mdx                   | 54 +++++++++++++------
 1 file changed, 37 insertions(+), 17 deletions(-)

diff --git a/docs/run-platform-server/ingestion-filtering.mdx b/docs/run-platform-server/ingestion-filtering.mdx
index 566b9e938..89206d13a 100644
--- a/docs/run-platform-server/ingestion-filtering.mdx
+++ b/docs/run-platform-server/ingestion-filtering.mdx
@@ -5,7 +5,7 @@ order: 46

 ## Overview

-Ingestion Filtering enables Horizon operators to drastically reduce the storage footprint of the historical data in the Horizon database by allow-listing Assets and/or Accounts that are relevant to their operations.
+Ingestion Filtering enables Horizon operators to drastically reduce the storage footprint of the historical data in the Horizon database by white-listing Assets and/or Accounts that are relevant to their operations.

 ### Why is it useful:

@@ -15,18 +15,18 @@ For further context, running an unfiltered `full` history Horizon instance curre

 ### How does it work:

-Filtering feature operates during the ingestion process, **live** or **prior historical ranges**. It tells ingestion process to only accept incoming ledger transactions which match on a filter rule, any transactions which don't match on filter rules are skipped by ingestion and therefore not stored on database.
+The filtering feature operates during ingestion in both **live** and **historical range** processes. It tells the ingestion process to accept only incoming ledger transactions that match a filter rule; any transactions that don't match are skipped by ingestion and therefore not stored in the database.

 Some key aspects to note about filtering behavior:

 - Filtering applies only to ingestion of historical data in the database, it does not affect how ingestion process maintains current state data stored in database, which is the last known ledger entry for each unique entity within accounts, trustlines, liquidity pools, offers. However, current state data consumes a relatively small amount of the overall storage capacity.
-- When filter rules are changed, they only apply to active ingestion processes(**live** or **historical ranges**). They don't trigger any retro-active filtering or back-filling of existing historical data on the database.
-  - If you update the filter rules to increase allow-listing of accounts or assets, related transactions will only start to show up in historical database data from **live** ingestion beginning after time the filter rule is updated using the Horizon Admin API. Same applies to **historical range** ingestion, it will only be affected by new filter rules starting at current ledger it was processing within it's configured range at time the filter rules were updated.
-  - When updating filter rules with increased allow list coverage, no historical back-filling is done automatically. You can manually backfill the history on database by running a new **historical range** ingestion process for a past ledger range after you have updated the filter rules to achieve that result.
-  - If you update filter rules and reduce the allow list coverage by removing some entities, no retro-active purging or filtering of historical data per the reduced scope of filter rules on database is performed. 
Whatever data is stored on history tables resides for lifetime of database or until `HISTORY_RETENTION_COUNT` is exceeded, and Horizon will purge all historical data for all entites related to older ledgers regardless of any filtering rules.
+- When filter rules are changed, they only apply to existing, running ingestion processes (**live** and **historical range**). They don't trigger any retro-active filtering or back-filling of existing historical data on the database.
+  - When the filter rules are updated to include additional accounts or assets in the white-list, the related transactions from **live** ingestion will only appear in the historical database data once the filter rules have been updated using the Admin API. The same applies to **historical range** ingestion, where the new filter rules will only affect the data from the current ledger within its configured range at the time of the update.
+  - Updating the filter rules to include additional accounts or assets does not trigger automatic back-filling related to new entities in the historical database. To include prior history of newly white-listed entities in the database, you can manually run a new [Historical Ingestion Range](ingestion.mdx#ingesting-historical-data) after updating the filter rules.
+  - When the filter rules are updated to remove accounts or assets previously defined on the white-list, the historical data in the database will not be retroactively purged or filtered based on the updated rules. The data is stored in the history tables for the lifetime of the database or until the `HISTORY_RETENTION_COUNT` is exceeded. Once the retention limit is reached, Horizon will purge all historical data related to older ledgers, regardless of any filtering rules.
 - Filtering will not affect the performance or throughput rate of an ingestion process, it will remain consistent whether filter rules are present or not.

-Filter rules define allow-lists for the following supported entities:
+Filter rules define white-lists of the following supported entities:

 - Account id
 - Asset id (canonical)

@@ -40,20 +40,24 @@ Filtering is enabled by default but with no filter rules, which effectively mean

 - enable Horizon admin port with environmental configuration parameter `ADMIN_PORT=XXXXX`, this will allow you to access the port.
 - define filter whitelists. submit Admin HTTP API requests to view and update the filter rules:

-  Refer to the [Horizon Admin API Docs](https://github.com/stellar/go/blob/master/services/horizon/internal/httpx/static/admin_oapi.yml) which are also published on Horizon instance as Open API 3.0 doc on the Admin Port at `http://localhost:/`. You can paste the contents from that url into any OAPI tool such as [Swagger](https://editor.swagger.io/) which will render a visual explorer of the API endpoints. Follow details and example request/response payloads for these filter rule endpoints:
+  Refer to the [Horizon Admin API Docs](https://github.com/stellar/go/blob/master/services/horizon/internal/httpx/static/admin_oapi.yml), which are also published by running Horizon instances as an Open API 3.0 doc on the Admin Port (when enabled) at `http://localhost:/`. You can paste the contents from that url into any OAPI tool such as [Swagger](https://editor.swagger.io/), which will render a visual explorer of the API endpoints. 
On the Swagger editor you can also load the published Horizon admin_oapi.yml directly as a url, choose `File->Import URL`:
+
+  ```
+  https://raw.githubusercontent.com/stellar/go/master/services/horizon/internal/httpx/static/admin_oapi.yml
+  ```
+
+  Follow the details and example request/response payloads to read and update the filter rules for these endpoints:

 ```
 /ingestion/filters/account
 /ingestion/filters/asset
 ```

-### Gap fill on filtered historical data:
-
-If new Assets or Accounts are added to the whitelist rules and you would like to pull in their missing historical data which would have been dropped earlier, you need to run reingestion. The Reingestion process is idempotent and will re-ingest the data from the designated historical ledger range and `upsert` to Horizon historical data, i.e. overwrite or insert new data not already in the current database.
+  Choosing the `Try it out` button on either endpoint will display a `curl` example of the entire HTTP request.

 ## Sample Use Case:

-As an Asset Issuer, I have issued 4 assets and am interested in all transaction data related to those assets including customer Accounts that interact with those assets and the following:
+As an Asset Issuer, I have issued 4 assets and am interested in all transaction data related to those assets including customer Accounts that interact with those assets through the following operations:

 - Operations
 - Effects
@@ -69,8 +73,24 @@ You have installed Horizon with empty database and it has **live** ingestion ena

 ### Steps:

-1. Configure a filter rule with 4 white-listed Assets via the Admin API.
-
-2. If you do not need prior historical data to the present time, you can effectively stop here, anytime changes or enablement of filter rules are done, the history tables will immediately reflect filtered data per those latest rules from the time the filter rule is updated and onward.
-
-3. Perform a separate historical [reingestion](ingestion.mdx#ingesting-historical-data) specifying a range with the earliest ledger # in network history that you want retained for the whitelisted entities.
+1. Configure a filter rule with 4 white-listed Assets by submitting a PUT request to the Horizon ADMIN API `:/ingestion/filters/asset` endpoint.
+
+```
+curl -X 'PUT' \
+  'http://localhost:4200/ingestion/filters/asset' \
+  -H 'accept: application/json' \
+  -H 'Content-Type: application/json' \
+  -d '{
+  "whitelist": [
+    "USDC:GAFRNZHK4DGH6CSF4HB5EBKK6KARUOVWEI2Y2OIC5NSQ4UBSN4DR456U",
+    "DOTT:GAFRNZHK4DGH6CSF4HB5EBKK6KARUOVWEI2Y2OIC5NSQ4UBSN4DR456U",
+    "ABCD:GAFRNZHK4DGH6CSF4HB5EBKK6KARUOVWEI2Y2OIC5NSQ4UBSN4DR456U",
+    "EFGH:GAFRNZHK4DGH6CSF4HB5EBKK6KARUOVWEI2Y2OIC5NSQ4UBSN4DR456U"
+  ],
+  "enabled": true
+}'
+```
+
+2. Since this is a new Horizon database with its first filter rules, there is nothing more to do, and you can effectively stop here.
+
+3. However, for the sake of exercise, suppose you had already been running Horizon for a while with a database populated under some existing filter rules, and these new rules are additional white-listings you just added. In this case, you choose whether you want to retroactively back fill historical data in the Horizon database for the newly white-listed entities from a prior point in time up to the present, because their history was originally dropped at the earlier ingestion time and is not included in the database. 
If you decide you want to back fill, then you run a separate Horizon **historical range** ingestion process, refer to [Historical Ingestion Range](ingestion.mdx#ingesting-historical-data) for steps: From a80507cc8ca7979c1c6b66356067b060883b6276 Mon Sep 17 00:00:00 2001 From: Shawn Reuland Date: Wed, 19 Jul 2023 17:42:23 -0700 Subject: [PATCH 25/25] #147: review feedback on phrasing --- docs/run-platform-server/ingestion-filtering.mdx | 2 +- docs/run-platform-server/ingestion.mdx | 12 ++++++------ 2 files changed, 7 insertions(+), 7 deletions(-) diff --git a/docs/run-platform-server/ingestion-filtering.mdx b/docs/run-platform-server/ingestion-filtering.mdx index 89206d13a..536633cb2 100644 --- a/docs/run-platform-server/ingestion-filtering.mdx +++ b/docs/run-platform-server/ingestion-filtering.mdx @@ -35,7 +35,7 @@ Given that all transactions related to the white listed entities are included, a ## Configuration: -Filtering is enabled by default but with no filter rules, which effectively means no filtering of ingested data occurs. To start filtering ingestion: +Filtering is enabled by default with no filter rules defined. When no filter rules are defined, it effectively means no filtering of ingested data occurs. To start filtering ingestion, need to define at least one filter rule: - enable Horizon admin port with environmental configuration parameter `ADMIN_PORT=XXXXX`, this will allow you to access the port. - define filter whitelists. submit Admin HTTP API requests to view and update the filter rules: diff --git a/docs/run-platform-server/ingestion.mdx b/docs/run-platform-server/ingestion.mdx index c2c92a4f6..f380d00bd 100644 --- a/docs/run-platform-server/ingestion.mdx +++ b/docs/run-platform-server/ingestion.mdx @@ -20,14 +20,14 @@ You should think carefully about the historical timeframe of ingested data you'd To keep your storage footprint small, we recommend the following: -- use **live** ingestion only use and depend on **historical** ingestion in limited exceptional cases +- use **live** ingestion, use **historical** ingestion only in limited exceptional cases - if your application requires access to all network data, no filtering can be done, we recommend limiting historical retention of ingested data to a sliding window of 1 month (HISTORY_RETENTION_COUNT=518400) which is default set by Horizon -- if your application can work on a filtered network dataset based on specific accounts and assets, then we recommend applying ingestion filter rules. When using filter rules, it provides benefit of choice in longer historical retention timeframe since the filtering is reducing the overall database size to such a degree, historical retention(`HISTORY_RETENTION_COUNT`) can be set in terms of years rather than months or even disabled(`HISTORY_RETENTION_COUNT=0`) +- if your application can work on a [filtered network dataset](./ingestion-filtering.mdx) based on specific accounts and assets, then we recommend applying ingestion filter rules. 
When using filter rules, it provides benefit of choice in longer historical retention timeframe since the filtering is reducing the overall database size to such a degree, historical retention(`HISTORY_RETENTION_COUNT`) can be set in terms of years rather than months or even disabled(`HISTORY_RETENTION_COUNT=0`) - if you cannot limit your history retention window to 30 days and cannot use filter rules, we recommend considering [Stellar Hubble Data Warehouse](https://developers.stellar.org/docs/accessing-data/overview) for any historical data ### Ingesting Live Data -This option is enabled by default and is the recommended mode of ingestion to run. It is controlled with environent configuration flag `INGEST`. Refer to [Configuration](./configuring.mdx) for how an instance of Horizon performs the ingestion role. +This option is enabled by default and is the recommended mode of ingestion to run. It is controlled with environment configuration flag `INGEST`. Refer to [Configuration](./configuring.mdx) for how an instance of Horizon performs the ingestion role. For a high availability requirements, **we recommend deploying more than one live ingesting instance**, as this makes it easier to avoid downtime during upgrades and adds resilience, ensuring you always have the latest network data, refer to [Ingestion Role Instance](./configuring.mdx#multiple-instance-deployment) @@ -43,11 +43,11 @@ stellar-horizon db reingest range -Running any historical range of ingestion requires coordination with the data retention configuration chosen. If you have a temporal limit on history set with `HISTORY_RETENTION_COUNT=` then it makes no sense to ingest any time range that is older as it will get purged from the database almose as soon as it's added. +Running any historical range of ingestion requires coordination with the data retention configuration chosen. When setting a temporal limit on history with `HISTORY_RETENTION_COUNT=`, the temporal limit takes precedence, and any data ingested beyond that limit will be automatically purged. Typically the only time you need to run historical ingestion is once when boot-strapping a system after first deployment, from that point forward **live** ingestion will keep the database populated with the expected sliding window of trailing historical data. Maybe one exception is if you think you have a gap in the database caused by the **live** ingestion being down, in which case you can run historical ingestion range to essentially gap fill. -You can run historical ingestion in parallel in background while your main Horizon server separately performs **live** ingestion. If the range specified overlaps with data already in the database, it will simply be overwritten, effectively idempotent. +You can run historical ingestion in parallel in background while your main Horizon server separately performs **live** ingestion. If the range specified overlaps with data already in the database, it is ok and will simply be overwritten, effectively idempotent. #### Parallel ingestion workers @@ -75,7 +75,7 @@ Endpoints that display current state information from **live** ingestion may ret - Verify host machine meets recommended [Prerequisites](./prerequisites.mdx). - Check horizon log output. - - Are there many `level=error` messages, maybe an environmental issue, access to database is lost, etc. + - If there are many `level=error` messages, it may point to an environmental issue, inability to access the database. 
  - **live** ingestion will emit two key log lines about once every 5 seconds, based on the latest ledger emitted from the network. Tail the horizon log output and grep for the presence of these lines with a filter:
    ```
    tail -f horizon.log | grep -E 'Processed ledger|Closed ledger'