Ingestion nodes cannot be separated from request serving horizon nodes #2250
Thanks for creating the issue!
I think we actually found that serving it from the graph is slightly slower on average than from a DB (#1963). It's strange because the graph keeps the offers sorted.
What we could do is keep the code as it is but, if a special config value is set, have the node skip ingesting into the DB itself. Then, when another (backend) node that actually ingests into the DB finishes, the frontend node updates only its in-memory graph.
Closed in #2299.
Reopening for a discussion connected to FSC. If Horizon-Core communication via pipe stays after the prototyping stage, ingesting into memory in frontend instances can be a problem: it would require N Core processes for N Horizon instances (N-N) instead of N-M (one Core process for each of the M backend ingesting instances).

Even outside FSC, in-memory ingestion can be a problem:
Another benefit of fixing this issue is that only ingesting nodes will require write access to the Horizon DB. Request-serving Horizon instances will be able to operate with read-only access to the DB. Currently, any Horizon node which ingests into the in-memory orderbook graph still requires write access to the Horizon DB, because it is not possible to do a SELECT FOR UPDATE query using a read-only connection.
We would like to support the following deployment topology for horizon:
There are two pools of horizon instances. The first pool consists of horizon nodes which solely focus on ingestion. The nodes in this pool do not serve API requests. The other pool consists of horizon nodes which do not participate in ingestion and only serve API requests.
We discussed this issue previously in #1529 and #1519 (comment). We came to the conclusion that the simplest solution was to make all horizon instances participate in ingestion AND serve API requests.
Now that we have several months of experience running horizon in this scheme (where all nodes serve API requests and ingest) we've observed that ingestion can be quite demanding in terms of CPU and memory. We would like to separate the ingestion operations from request serving operations to reduce resource contention between the two.
The difficulty in separating ingestion nodes from API nodes is that the `/order_book` and `/paths` endpoints rely on an in-memory orderbook which is populated by ingestion. As a workaround we have chosen to still maintain the two pools of horizon instances but, the ingestion nodes also serve `/order_book` and `/paths` requests, and the API nodes serve all other requests. Ideally, we would like a solution that does not require special routing to handle the in-memory orderbook case.
Here are some ideas:
For the `/order_book` endpoint we actually don't need to use the in-memory graph. We have all the offers recorded in the horizon database: https://github.com/stellar/go/blob/release-horizon-v0.25.0/services/horizon/internal/db2/schema/migrations/19_offers.sql
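As a rough sketch of what serving an order book from those recorded offers looks like, the snippet below filters and price-sorts offer rows for one asset pair, the way a `WHERE ... ORDER BY price` query over the offers table could. The `Offer` struct and its field names are illustrative, not Horizon's actual schema:

```go
package main

import (
	"fmt"
	"sort"
)

// Offer mirrors the relevant columns of the offers table; the field
// names here are illustrative, not the exact Horizon schema.
type Offer struct {
	SellingAsset string
	BuyingAsset  string
	Amount       int64   // units of SellingAsset for sale
	Price        float64 // BuyingAsset per unit of SellingAsset
}

// orderBookAsks collects the ask side of an order book for an asset
// pair, the way a SQL query could (WHERE selling/buying match,
// ORDER BY price ASC).
func orderBookAsks(offers []Offer, selling, buying string) []Offer {
	var asks []Offer
	for _, o := range offers {
		if o.SellingAsset == selling && o.BuyingAsset == buying {
			asks = append(asks, o)
		}
	}
	sort.Slice(asks, func(i, j int) bool { return asks[i].Price < asks[j].Price })
	return asks
}

func main() {
	offers := []Offer{
		{"XLM", "USD", 100, 0.12},
		{"XLM", "USD", 50, 0.10},
		{"BTC", "USD", 1, 40000},
	}
	for _, a := range orderBookAsks(offers, "XLM", "USD") {
		fmt.Printf("%d @ %.2f\n", a.Amount, a.Price)
	}
}
```

In the real endpoint this filtering and sorting would happen in SQL, which is why index choice on the offers table matters for performance.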
It should be possible to obtain all the data we need to fulfill `/order_book` requests with SQL queries on the `offers` table. However, I don't know how quickly those queries would run, and we may need to add some additional indexes to improve performance.

For the path finding endpoints, I think we still need to maintain an in-memory graph. But, I think it should be possible to decouple maintaining the in-memory graph from ingestion. Here's the idea which was originally proposed in #1519 (comment):
When the API horizon instances start up, they build an in-memory order book graph by reading all rows from the offers table. The API horizon instances will periodically poll the offers table for new updates and apply them to the in-memory order book graph. We should be able to find updates from the offers table using the `last_modified_ledger` column.

However, when an offer is removed from the order book, the corresponding row in the offers table is deleted. When polling for updates we would not be able to observe changes where offers are removed from the orderbook. To fix this issue we could add a `deleted` column on the offers table which acts as a tombstone. Instead of deleting a row, we would set the `deleted` flag to true. To limit the size of the offers table, we can periodically delete tombstone rows from the offers table which are older than 10,000 ledgers (or some other large cutoff).
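A minimal sketch of that polling step, assuming the hypothetical `deleted` tombstone column alongside the existing `last_modified_ledger` column. `Graph`, `OfferRow`, and `ApplyUpdates` are stand-ins for illustration, not Horizon's actual types:

```go
package main

import "fmt"

// OfferRow mirrors a row of the offers table plus a hypothetical
// `deleted` tombstone column (last_modified_ledger already exists).
type OfferRow struct {
	OfferID            int64
	LastModifiedLedger uint32
	Deleted            bool
	Price              float64
}

// Graph is a minimal stand-in for the in-memory order book graph.
type Graph struct {
	offers map[int64]OfferRow
}

func NewGraph() *Graph { return &Graph{offers: map[int64]OfferRow{}} }

// ApplyUpdates applies every row modified after lastSeen: tombstoned
// rows are removed from the graph, live rows are upserted. It returns
// the new polling cursor (the highest last_modified_ledger seen).
func (g *Graph) ApplyUpdates(rows []OfferRow, lastSeen uint32) uint32 {
	cursor := lastSeen
	for _, r := range rows {
		if r.LastModifiedLedger <= lastSeen {
			continue // already applied in a previous poll
		}
		if r.Deleted {
			delete(g.offers, r.OfferID) // tombstone: drop the offer
		} else {
			g.offers[r.OfferID] = r // new or updated offer
		}
		if r.LastModifiedLedger > cursor {
			cursor = r.LastModifiedLedger
		}
	}
	return cursor
}

func main() {
	g := NewGraph()
	// First poll: offer 1 created at ledger 10.
	cur := g.ApplyUpdates([]OfferRow{{OfferID: 1, LastModifiedLedger: 10, Price: 0.1}}, 0)
	fmt.Println(len(g.offers), cur)
	// Second poll: offer 1 tombstoned at ledger 12.
	cur = g.ApplyUpdates([]OfferRow{{OfferID: 1, LastModifiedLedger: 12, Deleted: true}}, cur)
	fmt.Println(len(g.offers), cur)
}
```

Each poll would translate to something like `SELECT ... FROM offers WHERE last_modified_ledger > $cursor`, which only needs read access to the DB.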