#147: added examples of filter rules api updates
sreuland committed Jul 11, 2023
1 parent d4ae7d8 commit a12d139
Showing 1 changed file with 37 additions and 17 deletions.
54 changes: 37 additions & 17 deletions docs/run-platform-server/ingestion-filtering.mdx
@@ -5,7 +5,7 @@ order: 46

## Overview

Ingestion Filtering enables Horizon operators to drastically reduce the storage footprint of the historical data in the Horizon database by allow-listing Assets and/or Accounts that are relevant to their operations.
Ingestion Filtering enables Horizon operators to drastically reduce the storage footprint of the historical data in the Horizon database by white-listing Assets and/or Accounts that are relevant to their operations.

### Why is it useful:

@@ -15,18 +15,18 @@ For further context, running an unfiltered `full` history Horizon instance curre

### How does it work:

The filtering feature operates during the ingestion process, **live** or **prior historical ranges**. It tells the ingestion process to only accept incoming ledger transactions which match a filter rule. Any transactions which don't match a filter rule are skipped by ingestion and therefore not stored in the database.
The filtering feature operates during ingestion in **live** and **historical range** processes. It tells the ingestion process to only accept incoming ledger transactions which match a filter rule. Any transactions which don't match a filter rule are skipped by ingestion and therefore not stored in the database.

Some key aspects to note about filtering behavior:

- Filtering applies only to ingestion of historical data in the database. It does not affect how the ingestion process maintains current state data stored in the database, which is the last known ledger entry for each unique entity within accounts, trustlines, liquidity pools, and offers. However, current state data consumes a relatively small amount of the overall storage capacity.
- When filter rules are changed, they only apply to active ingestion processes (**live** or **historical ranges**). They don't trigger any retroactive filtering or back-filling of existing historical data in the database.
- If you update the filter rules to increase allow-listing of accounts or assets, related transactions will only start to show up in historical database data from **live** ingestion beginning after the time the filter rule is updated using the Horizon Admin API. The same applies to **historical range** ingestion: it will only be affected by new filter rules starting at the ledger it was processing within its configured range at the time the filter rules were updated.
- When updating filter rules with increased allow list coverage, no historical back-filling is done automatically. You can manually backfill the history in the database by running a new **historical range** ingestion process for a past ledger range after you have updated the filter rules.
- If you update filter rules and reduce the allow list coverage by removing some entities, no retroactive purging or filtering of historical data is performed on the database. Whatever data is stored in the history tables remains for the lifetime of the database or until `HISTORY_RETENTION_COUNT` is exceeded, at which point Horizon will purge all historical data for all entities related to older ledgers, regardless of any filtering rules.
- When filter rules are changed, they only apply to existing, running ingestion processes (**live** and **historical range**). They don't trigger any retroactive filtering or back-filling of existing historical data in the database.
- When the filter rules are updated to include additional accounts or assets in the white-list, the related transactions from **live** ingestion will only appear in the historical database data once the filter rules have been updated using the Admin API. The same applies to **historical range** ingestion, where the new filter rules will only affect the data from the current ledger within its configured range at the time of the update.
- Updating the filter rules to include additional accounts or assets does not trigger automatic back-filling related to new entities in the historical database. To include prior history of newly white-listed entities in the database, you can manually run a new [Historical Ingestion Range](ingestion.mdx#ingesting-historical-data) after updating the filter rules.
- When the filter rules are updated to remove accounts or assets previously defined on the white-list, the historical data in the database will not be retroactively purged or filtered based on the updated rules. The data is stored in the history tables for the lifetime of the database or until the `HISTORY_RETENTION_COUNT` is exceeded. Once the retention limit is reached, Horizon will purge all historical data related to older ledgers, regardless of any filtering rules.
- Filtering will not affect the performance or throughput rate of an ingestion process; it will remain consistent whether filter rules are present or not.

Filter rules define allow-lists for the following supported entities:
Filter rules define white-lists of the following supported entities:

- Account id
- Asset id (canonical)
@@ -40,20 +40,24 @@ Filtering is enabled by default but with no filter rules, which effectively mean
- enable the Horizon admin port with the environment configuration parameter `ADMIN_PORT=XXXXX`; this will allow you to access the port.
- define filter whitelists by submitting Admin HTTP API requests to view and update the filter rules:

Refer to the [Horizon Admin API Docs](https://github.com/stellar/go/blob/master/services/horizon/internal/httpx/static/admin_oapi.yml), which are also published on the Horizon instance as an Open API 3.0 doc on the Admin Port at `http://localhost:<admin_port>/`. You can paste the contents from that URL into any OAPI tool such as [Swagger](https://editor.swagger.io/), which will render a visual explorer of the API endpoints. Follow details and example request/response payloads for these filter rule endpoints:
Refer to the [Horizon Admin API Docs](https://github.com/stellar/go/blob/master/services/horizon/internal/httpx/static/admin_oapi.yml), which are also published on running Horizon instances as an Open API 3.0 doc on the Admin Port, when enabled, at `http://localhost:<admin_port>/`. You can paste the contents from that URL into any OAPI tool such as [Swagger](https://editor.swagger.io/), which will render a visual explorer of the API endpoints. In the Swagger editor you can also load the published Horizon `admin_oapi.yml` directly as a URL by choosing `File->Import URL`:

```
https://raw.githubusercontent.com/stellar/go/master/services/horizon/internal/httpx/static/admin_oapi.yml
```

Follow details and examples of request/response payloads to read and update the filter rules for these endpoints:

```
/ingestion/filters/account
/ingestion/filters/asset
```

### Gap fill on filtered historical data:

If new Assets or Accounts are added to the whitelist rules and you would like to pull in their missing historical data which would have been dropped earlier, you need to run reingestion. The Reingestion process is idempotent and will re-ingest the data from the designated historical ledger range and `upsert` to Horizon historical data, i.e. overwrite or insert new data not already in the current database.
Choosing the `Try it out` button on either endpoint will display `curl` examples of the entire HTTP request.
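
As a rough sketch of what such a request might look like when reading the current rules, the snippet below builds the endpoint URL and wraps a `curl` call. The host, the port value, the helper names `filter_url` and `show_filter`, and the use of `GET` as the read method are assumptions for illustration; check the Admin API doc on your own instance for the exact request.

```shell
# Assumed values for illustration; use the host and ADMIN_PORT of your deployment.
HORIZON_ADMIN_HOST="localhost"
HORIZON_ADMIN_PORT="4200"

# Hypothetical helper: build the URL of a filter endpoint ("asset" or "account").
filter_url() {
  printf 'http://%s:%s/ingestion/filters/%s' \
    "$HORIZON_ADMIN_HOST" "$HORIZON_ADMIN_PORT" "$1"
}

# Hypothetical helper: read the current rules for one filter type.
# Assumes the endpoint answers GET requests with a JSON body.
show_filter() {
  curl -s -X GET "$(filter_url "$1")" -H 'accept: application/json'
}

# Usage (requires a running Horizon with the admin port enabled):
#   show_filter asset
#   show_filter account
```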

## Sample Use Case:

As an Asset Issuer, I have issued 4 assets and am interested in all transaction data related to those assets including customer Accounts that interact with those assets and the following:
As an Asset Issuer, I have issued 4 assets and am interested in all transaction data related to those assets including customer Accounts that interact with those assets through the following operations:

- Operations
- Effects
@@ -69,8 +73,24 @@ You have installed Horizon with empty database and it has **live** ingestion ena

### Steps:

1. Configure a filter rule with 4 whitelisted Assets via the Admin API.

2. If you do not need historical data prior to the present time, you can effectively stop here. Any time filter rules are changed or enabled, the history tables will reflect data filtered per the latest rules from the time the filter rule is updated onward.

3. Perform a separate historical [reingestion](ingestion.mdx#ingesting-historical-data) specifying a range with the earliest ledger # in network history that you want retained for the whitelisted entities.
1. Configure a filter rule with 4 white-listed Assets by sending a PUT request to the Horizon Admin API at `<horizon_host>:<ADMIN_PORT>/ingestion/filters/asset`.

```
curl -X 'PUT' \
  'http://localhost:4200/ingestion/filters/asset' \
  -H 'accept: application/json' \
  -H 'Content-Type: application/json' \
  -d '{
  "whitelist": [
    "USDC:GAFRNZHK4DGH6CSF4HB5EBKK6KARUOVWEI2Y2OIC5NSQ4UBSN4DR456U",
    "DOTT:GAFRNZHK4DGH6CSF4HB5EBKK6KARUOVWEI2Y2OIC5NSQ4UBSN4DR456U",
    "ABCD:GAFRNZHK4DGH6CSF4HB5EBKK6KARUOVWEI2Y2OIC5NSQ4UBSN4DR456U",
    "EFGH:GAFRNZHK4DGH6CSF4HB5EBKK6KARUOVWEI2Y2OIC5NSQ4UBSN4DR456U"
  ],
  "enabled": true
}'
```

Comment from @urvisavla (Contributor), Jul 11, 2023: Looks great, makes it much easier to understand how to use the API

Comment from @gre3n3yes34 (via email), Jul 12, 2023: this comment has been minimized.
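
For comparison, a similar request could be sketched for the account filter endpoint. This is only an illustration: it assumes the account filter accepts the same `whitelist`/`enabled` payload shape as the asset filter above, and the helper names `account_filter_payload` and `update_account_filter` are hypothetical.

```shell
# Hypothetical helper: build an account filter payload from account ids.
# Assumes the same JSON shape as the asset filter request.
# The body runs in a subshell so its variables stay local.
account_filter_payload() (
  list=""
  for id in "$@"; do
    list="${list:+$list,}\"$id\""
  done
  printf '{"whitelist":[%s],"enabled":true}' "$list"
)

# Hypothetical helper: send the payload to the account filter endpoint.
# Host and port mirror the asset example; adjust to your deployment.
update_account_filter() {
  curl -s -X PUT 'http://localhost:4200/ingestion/filters/account' \
    -H 'accept: application/json' \
    -H 'Content-Type: application/json' \
    -d "$(account_filter_payload "$@")"
}

# Usage (requires a running Horizon with the admin port enabled):
#   update_account_filter GAFRNZHK4DGH6CSF4HB5EBKK6KARUOVWEI2Y2OIC5NSQ4UBSN4DR456U
```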

2. Since this is a new Horizon database and these are the first filter rules, there is nothing more to do, and you can effectively stop here.

3. However, for the sake of exercise, suppose you already had Horizon running for a while with the database populated based on some filter rules, and these new rules were additional white-listings you just added. In this case, you choose whether you want to retroactively back-fill historical data in the Horizon database for these newly white-listed entities from a prior time up to the present, because they were dropped at the earlier ingestion time and not included in the database. If you decide to back-fill, run a separate Horizon **historical range** ingestion process; refer to [Historical Ingestion Range](ingestion.mdx#ingesting-historical-data) for steps:
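
A minimal sketch of starting such a back-fill is shown below. It assumes the binary is installed as `stellar-horizon` and that the `db reingest range` subcommand described in the linked ingestion guide is used; the ledger numbers and the helper name `reingest_cmd` are placeholders. The command is only printed here so it can be reviewed before it is run.

```shell
# Placeholder ledger range; choose the range you want back-filled.
START_LEDGER=1000000
END_LEDGER=1001000

# Hypothetical helper: compose the historical range ingestion command.
# Assumes the Horizon binary is named stellar-horizon on this host.
reingest_cmd() {
  printf 'stellar-horizon db reingest range %s %s' "$1" "$2"
}

# Print the command for review; run it yourself once the range looks right.
echo "$(reingest_cmd "$START_LEDGER" "$END_LEDGER")"
```

Because filter rules apply to historical range ingestion as well, the rules should be updated before the range is reingested.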
