
services/horizon: Precompute 1m Trade Aggregation buckets #3641

Merged

Conversation

@paulbellamy (Contributor) commented May 28, 2021:

PR Checklist

PR Structure

  • This PR has reasonably narrow scope (if not, break it down into smaller PRs).
  • This PR avoids mixing refactoring changes with feature changes (split into two PRs
    otherwise).
  • This PR's title starts with name of package that is most changed in the PR, ex.
    services/friendbot, or all or doc if the changes are broad or impact many
    packages.

Thoroughness

  • This PR adds tests for the most critical parts of the new functionality or fixes.
  • I've updated any docs (developer docs, .md
    files, etc... affected by this change). Take a look in the docs folder for a given service,
    like this one.

Release planning

  • I've updated the relevant CHANGELOG (here for Horizon) if
    needed with deprecations, added features, breaking changes, and DB schema changes.
  • I've decided if this PR requires a new major/minor version according to
    semver, or if it's mainly a patch change. The PR is targeted at the next
    release branch if it's not a patch change.

What

Incremental pre-computation of the 1-minute trade-aggregation buckets via a database trigger; higher resolutions are aggregated on the fly at query time.

Why

Fixes #632

Per discussion, pre-computing seems the best option for performance and scalability, as the amount of computation is known and bounded. The concern is disk usage. Given a raw history_trades table of 4GB, a naive 1m bucket pre-computation results in a new table of approximately 1.6GB. Not outrageous, but it could grow quickly if more trading pairs become active. When generating that as a materialized view, I accidentally forgot to include the base_asset_id and counter_asset_id columns, and noticed the table was much smaller. To compress the table, this PR uses wide jsonb columns, so there is one row per year/month/asset-pair, with a values column containing that month's 1m buckets. This adds complexity and a small overhead on querying (~15%), but saves a further ~70% on disk size.
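
A rough sketch of the jsonb-packed layout described above (this schema is an assumption pieced together from the queries and index names later in this thread; the actual migration may differ):

-- Hypothetical sketch of the jsonb-packed 1m bucket table; the real migration differs.
-- One row per (year, month, base asset, counter asset); "values" holds that month's 1m buckets,
-- keyed by bucket timestamp, e.g. {"1622502000000": {"timestamp": ..., "count": ..., ...}}.
CREATE TABLE history_trades_60000 (
    year             integer NOT NULL,
    month            integer NOT NULL,
    base_asset_id    bigint  NOT NULL,
    counter_asset_id bigint  NOT NULL,
    "values"         jsonb   NOT NULL,
    UNIQUE (year, month, base_asset_id, counter_asset_id)
);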

In order to handle the offset param, the maximum resolution we could potentially pre-compute is 15 minutes. For this PR I've only pre-computed the 1-minute buckets and aggregated higher resolutions on the fly, since higher resolutions are quite quick to aggregate once the 1m buckets exist. In my past experience this has been enough, as the 1m aggregation gives higher resolutions a big boost. (TODO: benchmark this for concrete numbers.)
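
For reference, rolling 1m buckets up to a coarser resolution relies on a simple rounding expression; this is the same form used in the on-the-fly 1d query later in this thread, generalized here with an explicit offset (the VALUES row is just example data):

-- Round a bucket timestamp (ms) down to a coarser bucket for a given resolution and offset (ms).
-- With resolution_ms = 86400000 (1d) and offset_ms = 0 this matches the 1d query shown below.
SELECT div(ts - offset_ms, resolution_ms) * resolution_ms + offset_ms AS bucket
FROM (VALUES (1622502037000::bigint, 86400000::bigint, 0::bigint)) AS t(ts, resolution_ms, offset_ms);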

Initially I hoped we could just use a materialized view to do the pre-computation (to avoid changes to the Horizon ingestion code). However, refreshing a materialized view is all-or-nothing and takes ~7 minutes on my laptop. Instead, this PR uses a trigger to incrementally update the relevant buckets on insert. This implementation is still quite fragile (see Known limitations below). A non-incremental recompute of a single 1m bucket takes ~300ms; I didn't want to add that overhead to each insert, but it might be an easy way to handle UPDATE/DELETE queries.
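
A minimal sketch of what such an insert trigger could look like. The only grounded detail is the trigger name htrd_60000_insert, which shows up in the EXPLAIN ANALYZE output further down; everything else is an assumption, shown against flat per-bucket rows for brevity (the jsonb variant would upsert into the month row's values map instead), and only the count/volume maintenance is shown:

-- Hypothetical sketch of the incremental insert trigger; the actual migration differs.
-- Assumes a unique key on (timestamp, base_asset_id, counter_asset_id), as in the
-- non-jsonb EXPLAIN output later in this thread. High/low/open/close handling omitted.
CREATE OR REPLACE FUNCTION htrd_60000_insert() RETURNS trigger AS $$
DECLARE
  bucket_ts bigint := (extract(epoch FROM NEW.ledger_closed_at)::bigint * 1000 / 60000) * 60000;
BEGIN
  INSERT INTO history_trades_60000 AS b
    ("timestamp", base_asset_id, counter_asset_id, "count", base_volume, counter_volume)
  VALUES
    (bucket_ts, NEW.base_asset_id, NEW.counter_asset_id, 1, NEW.base_amount, NEW.counter_amount)
  ON CONFLICT ("timestamp", base_asset_id, counter_asset_id) DO UPDATE SET
    "count"        = b."count" + 1,
    base_volume    = b.base_volume + EXCLUDED.base_volume,
    counter_volume = b.counter_volume + EXCLUDED.counter_volume;
  RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER htrd_60000_insert AFTER INSERT ON history_trades
  FOR EACH ROW EXECUTE PROCEDURE htrd_60000_insert();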

Known limitations

This trigger is still quite fragile: it doesn't handle UPDATE or DELETE queries, and it assumes all trades are inserted in order.

The reduced query load on history_trades means we could probably get rid of some of the now-unused indexes on that table, which would save a few gigabytes of disk.

TODO

  • fix high/low calculation
  • fix open/close calculation
  • benchmark on-the-fly higher-resolution aggregations
  • try benchmarks without jsonb as well to see if disk usage tradeoff is worth it or not.
  • handle offsets correctly in the new sql queries
  • handle reingestion (delete buckets, insert, rebuild start+end buckets)
  • try rebuilding buckets from go (instead of db triggers)
    • test this actually works
    • check this is actually fast enough to not slow down ingestion
  • testing the complicated sql aggregation queries (or pre-compute 5 and 15 minute intervals)
  • Check it works with postgres 9.6
  • Figure out if we need to create the index concurrently
  • benchmark with big data
    • 1m queries
    • 1d queries
    • ingestion
    • reingestion

Later

  • Figure out if we can drop any old indexes on the history_trades.

@paulbellamy (Contributor, Author) commented:

For reading this, I'd recommend starting with the migration, as it'll help make sense of the rest.

@bartekn (Contributor) left a comment:

It's looking good. Looking forward to some response-duration stats. My only worry is reingestion in ranges: in that case ranges can be ingested in non-increasing order, so the triggers here will calculate the values incorrectly.

I think we can fix this but it requires adding an extra step once reingestion is done:

  • We move the trigger to Go code and run it after a ledger is ingested using normal operations (not reingestion).
  • In reingestion code we add an extra step that runs once all ranges are completed which is responsible for calculating 1m resolutions.
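
For context, the bulk 1m recomputation that this extra step would run could look roughly like the sketch below. It assumes the custom first/last/max_price/min_price aggregates that appear in the benchmark queries later in this thread, and a flat (non-jsonb) bucket table; the actual query in the PR differs:

-- Sketch of a bulk 1m bucket rebuild from raw trades; the real query in the PR differs.
INSERT INTO history_trades_60000
  ("timestamp", base_asset_id, counter_asset_id, "count",
   base_volume, counter_volume, high_n, high_d, low_n, low_d,
   open_n, open_d, close_n, close_d)
SELECT
  "timestamp", base_asset_id, counter_asset_id,
  count(*),
  sum(base_amount),
  sum(counter_amount),
  (max_price(price))[1], (max_price(price))[2],
  (min_price(price))[1], (min_price(price))[2],
  (first(price))[1], (first(price))[2],
  (last(price))[1], (last(price))[2]
FROM (
  SELECT
    (extract(epoch FROM ledger_closed_at)::bigint * 1000 / 60000) * 60000 AS "timestamp",
    base_asset_id, counter_asset_id, base_amount, counter_amount,
    ARRAY[price_n, price_d] AS price
  FROM history_trades
  ORDER BY history_operation_id, "order"
) AS htrd
GROUP BY base_asset_id, counter_asset_id, "timestamp"
ORDER BY "timestamp";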

@bartekn (Contributor) commented Jun 2, 2021:

An idea to fix open and close when reingesting out of order: we can add two new columns, open_ledger_seq and close_ledger_seq, that indicate the sequence number of the ledger that set the current open/close price. If a new ledger in a given bucket appears and it's before open_ledger_seq, then we update the open price; otherwise open is not updated. We update the close price in a similar way.

Please note that the open/close price will still be incorrect if the bucket was ingested partially.
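
A sketch of how the open_ledger_seq/close_ledger_seq idea above could be expressed as an upsert. The two proposed columns come from the comment above; the rest of the column names, and the bind parameters ($1 ... $6), are assumptions for illustration:

-- Hypothetical upsert illustrating the open_ledger_seq / close_ledger_seq idea.
-- Bind parameters: $1=bucket timestamp, $2=base asset, $3=counter asset,
-- $4=price_n, $5=price_d, $6=ledger sequence of the new trade.
INSERT INTO history_trades_60000 AS b
  ("timestamp", base_asset_id, counter_asset_id,
   open_n, open_d, open_ledger_seq, close_n, close_d, close_ledger_seq)
VALUES
  ($1, $2, $3, $4, $5, $6, $4, $5, $6)
ON CONFLICT ("timestamp", base_asset_id, counter_asset_id) DO UPDATE SET
  -- Only move "open" to an earlier ledger and "close" to a later ledger.
  open_n  = CASE WHEN EXCLUDED.open_ledger_seq  < b.open_ledger_seq  THEN EXCLUDED.open_n  ELSE b.open_n  END,
  open_d  = CASE WHEN EXCLUDED.open_ledger_seq  < b.open_ledger_seq  THEN EXCLUDED.open_d  ELSE b.open_d  END,
  open_ledger_seq  = LEAST(b.open_ledger_seq, EXCLUDED.open_ledger_seq),
  close_n = CASE WHEN EXCLUDED.close_ledger_seq > b.close_ledger_seq THEN EXCLUDED.close_n ELSE b.close_n END,
  close_d = CASE WHEN EXCLUDED.close_ledger_seq > b.close_ledger_seq THEN EXCLUDED.close_d ELSE b.close_d END,
  close_ledger_seq = GREATEST(b.close_ledger_seq, EXCLUDED.close_ledger_seq);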

@paulbellamy (Contributor, Author) commented:

> In reingestion code we add an extra step that runs once all ranges are completed which is responsible for calculating 1m resolutions.

This seems like a good plan. We actually only need to completely recalculate the first and last buckets, as those are the only ones that could be partially ingested.

@paulbellamy force-pushed the paulb/trade_aggregation_precompute branch from 3e88881 to c1fe1e5 on June 10, 2021 11:26
@paulbellamy force-pushed the paulb/trade_aggregation_precompute branch from 93e3b55 to 1ac4160 on June 10, 2021 17:14
@paulbellamy force-pushed the paulb/trade_aggregation_precompute branch from 1ac4160 to 0c002ac on June 10, 2021 17:15
@paulbellamy (Contributor, Author) commented Jun 10, 2021:

OK, so benchmarking on my local laptop. First, let's find the most-traded asset pairs, to set a worst-case bound...

Top asset-pairs query
postgres=# select base_asset_id, counter_asset_id, count(*) from history_trades group by base_asset_id, counter_asset_id order by count desc limit 10;
 base_asset_id | counter_asset_id |  count
---------------+------------------+---------
             2 |              149 | 2819451
             2 |         17581803 | 1660065
             2 |         27869101 | 1424528
             2 |             2351 |  761635
             2 |             1519 |  616861
             2 |            40569 |  604979
             2 |            28007 |  590901
      20772946 |         20772947 |  562536
             2 |         46998801 |  555386
             2 |          1649457 |  449990

So base_asset_id 2 / counter_asset_id 149 is the biggest pair (with ~70% more trades than the second).

Pre-computed 1m buckets query: ~900ms
postgres=# EXPLAIN ANALYZE SELECT
  (j.value->>'timestamp')::bigint as timestamp,
  (j.value->>'count')::integer as count,
  (j.value->>'base_volume')::bigint as base_volume,
  (j.value->>'counter_volume')::bigint as counter_volume,
  ARRAY[(j.value->>'high_n')::bigint, (j.value->>'high_d')::bigint] as high,
  ARRAY[(j.value->>'low_n')::bigint, (j.value->>'low_d')::bigint] as low,
  ARRAY[(j.value->>'open_n')::bigint, (j.value->>'open_d')::bigint] as open,
  ARRAY[(j.value->>'close_n')::bigint, (j.value->>'close_d')::bigint] as close
FROM
  history_trades_60000 h, jsonb_each(h.values) j
WHERE h.base_asset_id = '2'
  AND h.counter_asset_id = '149'
ORDER BY timestamp DESC LIMIT 200;


                                                                                                  QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=957.86..958.36 rows=200 width=156) (actual time=910.185..910.196 rows=200 loops=1)
   ->  Sort  (cost=957.86..958.61 rows=300 width=156) (actual time=910.183..910.188 rows=200 loops=1)
         Sort Key: (((j.value ->> 'timestamp'::text))::bigint) DESC
         Sort Method: top-N heapsort  Memory: 131kB
         ->  Nested Loop  (cost=0.29..945.51 rows=300 width=156) (actual time=0.356..848.402 rows=371658 loops=1)
               ->  Index Scan using history_trades_60000_year_month_base_asset_id_counter_asset_key on history_trades_60000 h  (cost=0.29..912.51 rows=3 width=618) (actual time=0.099..0.607 rows=39 loops=1)
                     Index Cond: ((base_asset_id = '2'::bigint) AND (counter_asset_id = '149'::bigint))
               ->  Function Scan on jsonb_each j  (cost=0.00..1.00 rows=100 width=32) (actual time=5.995..7.002 rows=9530 loops=39)
 Planning Time: 0.192 ms
 Execution Time: 910.512 ms
(10 rows)

Time: 911.161 ms
On-the-fly 1d buckets query: ~1300ms
postgres=# EXPLAIN ANALYZE SELECT
  bucket as timestamp,
  sum("count") as count,
  sum(base_volume) as base_volume,
  sum(counter_volume) as counter_volume,
  sum(counter_volume)/sum(base_volume) as avg,
  (max_price(high))[1] as high_n,
  (max_price(high))[2] as high_d,
  (min_price(low))[1] as low_n,
  (min_price(low))[2] as low_d,
  (first(open))[1] as open_n,
  (first(open))[2] as open_d,
  (last(close))[1] as close_n,
  (last(close))[2] as close_d
  FROM (
    SELECT
      div(((j.value->>'timestamp')::bigint - 0), 86400000)*86400000 + 0 as bucket,
      (j.value->>'timestamp')::bigint as timestamp,
      (j.value->>'count')::integer as count,
      (j.value->>'base_volume')::bigint as base_volume,
      (j.value->>'counter_volume')::bigint as counter_volume,
      ARRAY[(j.value->>'high_n')::bigint, (j.value->>'high_d')::bigint] as high,
      ARRAY[(j.value->>'low_n')::bigint, (j.value->>'low_d')::bigint] as low,
      ARRAY[(j.value->>'open_n')::bigint, (j.value->>'open_d')::bigint] as open,
      ARRAY[(j.value->>'close_n')::bigint, (j.value->>'close_d')::bigint] as close
    FROM
      history_trades_60000 h, jsonb_each(h.values) j
    WHERE h.base_asset_id = '2'
      AND h.counter_asset_id = '149'
    ORDER BY timestamp ASC
  ) as htrd
  GROUP BY bucket ORDER BY timestamp DESC LIMIT 200;

                                                                                                           QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=979.95..1622.70 rows=200 width=296) (actual time=1218.986..1223.731 rows=200 loops=1)
   ->  GroupAggregate  (cost=979.95..1622.70 rows=200 width=296) (actual time=1218.984..1223.723 rows=200 loops=1)
         Group Key: htrd.bucket
         ->  Sort  (cost=979.95..980.70 rows=300 width=180) (actual time=1218.870..1219.016 rows=1148 loops=1)
               Sort Key: htrd.bucket DESC
               Sort Method: external merge  Disk: 69184kB
               ->  Subquery Scan on htrd  (cost=963.86..967.61 rows=300 width=180) (actual time=1032.856..1084.236 rows=371658 loops=1)
                     ->  Sort  (cost=963.86..964.61 rows=300 width=188) (actual time=1032.854..1057.811 rows=371658 loops=1)
                           Sort Key: (((j.value ->> 'timestamp'::text))::bigint)
                           Sort Method: external merge  Disk: 74928kB
                           ->  Nested Loop  (cost=0.29..951.51 rows=300 width=188) (actual time=0.309..901.815 rows=371658 loops=1)
                                 ->  Index Scan using history_trades_60000_year_month_base_asset_id_counter_asset_key on history_trades_60000 h  (cost=0.29..912.51 rows=3 width=618) (actual time=0.060..0.521 rows=39 loops=1)
                                       Index Cond: ((base_asset_id = '2'::bigint) AND (counter_asset_id = '149'::bigint))
                                 ->  Function Scan on jsonb_each j  (cost=0.00..1.00 rows=100 width=32) (actual time=5.279..6.170 rows=9530 loops=39)
 Planning Time: 0.385 ms
 Execution Time: 1279.927 ms
(16 rows)

Thoughts

Keep in mind these are worst cases (loading all of history for the most-traded asset pair). It seems like most of the time is spent unwinding the jsonb. Intuitively I feel like that could be reduced, as it seems the limit param isn't being propagated down as far as it could be. Maybe I'm missing an index I had before? 🤔

For comparison with the "normal" case, with counter asset 15937061, the 1m and 1d buckets take 200ms and 400ms respectively.

It also might be worth doing a benchmark without the jsonb storage. It ~doubles the data storage, but might make the query times much happier.

Update: Benchmarked without the jsonb storage.

Non-jsonb Pre-computed 1m buckets query: ~30ms
postgres=#
-- raw non-jsonb 1m query
EXPLAIN ANALYZE SELECT
  timestamp,
  count,
  base_volume,
  counter_volume,
  ARRAY[high_n, high_d] as high,
  ARRAY[low_n, low_d] as low,
  ARRAY[open_n, open_d] as open,
  ARRAY[close_n, close_d] as close
FROM
  history_trades_60000 h
WHERE h.base_asset_id = '2'
  AND h.counter_asset_id = '149'
ORDER BY timestamp DESC LIMIT 200;

                                                                                                      QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=0.56..697.19 rows=200 width=156) (actual time=1.156..25.029 rows=200 loops=1)
   ->  Index Scan Backward using history_trades_60000_timestamp_base_asset_id_counter_asset__key on history_trades_60000 h  (cost=0.56..834139.28 rows=239478 width=156) (actual time=1.154..25.005 rows=200 loops=1)
         Index Cond: ((base_asset_id = '2'::bigint) AND (counter_asset_id = '149'::bigint))
 Planning Time: 0.217 ms
 Execution Time: 25.075 ms
(5 rows)
Non-jsonb On-the-fly 1d buckets query: ~1000ms
postgres=# EXPLAIN ANALYZE SELECT
  timestamp,
  sum("count") as count,
  sum(base_volume) as base_volume,
  sum(counter_volume) as counter_volume,
  sum(counter_volume)/sum(base_volume) as avg,
  (max_price(high))[1] as high_n,
  (max_price(high))[2] as high_d,
  (min_price(low))[1] as low_n,
  (min_price(low))[2] as low_d,
  (first(open))[1] as open_n,
  (first(open))[2] as open_d,
  (last(close))[1] as close_n,
  (last(close))[2] as close_d
  FROM (
    SELECT
      timestamp,
      count,
      base_volume,
      counter_volume,
      ARRAY[high_n, high_d] as high,
      ARRAY[low_n, low_d] as low,
      ARRAY[open_n, open_d] as open,
      ARRAY[close_n, close_d] as close
    FROM
      history_trades_60000 h
    WHERE h.base_asset_id = '2'
      AND h.counter_asset_id = '149'
    ORDER BY timestamp ASC
  ) as htrd
  GROUP BY timestamp ORDER BY timestamp DESC LIMIT 200;
                                                                        QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=446157.44..446587.44 rows=200 width=272) (actual time=950.483..950.731 rows=200 loops=1)
   ->  GroupAggregate  (cost=446157.44..961035.14 rows=239478 width=272) (actual time=950.483..950.724 rows=200 loops=1)
         Group Key: h."timestamp"
         ->  Sort  (cost=446157.44..446756.14 rows=239478 width=156) (actual time=950.462..950.508 rows=201 loops=1)
               Sort Key: h."timestamp" DESC
               Sort Method: external merge  Disk: 69184kB
               ->  Sort  (cost=402940.67..403539.36 rows=239478 width=156) (actual time=832.959..857.091 rows=371658 loops=1)
                     Sort Key: h."timestamp"
                     Sort Method: external merge  Disk: 69184kB
                     ->  Seq Scan on history_trades_60000 h  (cost=0.00..362717.36 rows=239478 width=156) (actual time=0.051..672.880 rows=371658 loops=1)
                           Filter: ((base_asset_id = '2'::bigint) AND (counter_asset_id = '149'::bigint))
                           Rows Removed by Filter: 9872212
 Planning Time: 0.228 ms
 Execution Time: 986.947 ms
Non-jsonb inserts: 1-15ms
postgres=# explain analyze INSERT into history_trades
  (history_operation_id, "order",        ledger_closed_at,      offer_id,
   base_account_id,      base_asset_id,  base_amount,           counter_account_id,
   counter_asset_id,     counter_amount, base_is_seller,        price_n,
   price_d,              base_offer_id,  counter_offer_id)
  values
  (152527926612222986,   0,              '2021-05-21 15:16:36', 572368382,
   2329565490,           2,              2490,                  263763996,
   17076589,             45272,          false,                 200,
   11,                   4764213935039610881,                   572368382);

                                                QUERY PLAN
----------------------------------------------------------------------------------------------------------
 Insert on history_trades  (cost=0.00..0.01 rows=1 width=109) (actual time=10.135..10.136 rows=0 loops=1)
   ->  Result  (cost=0.00..0.01 rows=1 width=109) (actual time=0.002..0.003 rows=1 loops=1)
 Planning Time: 0.062 ms
 Trigger htrd_60000_insert: time=5.723 calls=1
 Execution Time: 15.962 ms
(5 rows)

postgres=# explain analyze INSERT into history_trades
  (history_operation_id, "order",        ledger_closed_at,      offer_id,
   base_account_id,      base_asset_id,  base_amount,           counter_account_id,
   counter_asset_id,     counter_amount, base_is_seller,        price_n,
   price_d,              base_offer_id,  counter_offer_id)
  values
  (152527926612222987,   0,              '2021-05-21 15:16:37', 572368382,
   2329565490,           2,              2490,                  263763996,
   17076589,             45272,          false,                 200,
   11,                   4764213935039610881,                   572368382);
                                               QUERY PLAN
--------------------------------------------------------------------------------------------------------
 Insert on history_trades  (cost=0.00..0.01 rows=1 width=109) (actual time=0.167..0.168 rows=0 loops=1)
   ->  Result  (cost=0.00..0.01 rows=1 width=109) (actual time=0.003..0.003 rows=1 loops=1)
 Planning Time: 0.045 ms
 Trigger htrd_60000_insert: time=0.848 calls=1
 Execution Time: 1.120 ms
(5 rows)

So, expectedly, the raw non-jsonb retrieval is blazing fast. Interestingly, the on-the-fly 1d aggregation is still quite slow, since it is dominated by the GroupAggregate work. We could use the same kind of trigger to pre-compute the 5 and 15-minute buckets as well, which would probably help, but that adds ~1300MB. Basically, we need to decide if the extra disk usage (1600MB vs 500MB with jsonb) is worth the increased query speed, or find another way to reduce the disk usage (maybe a side-table for the asset pair, or tuning the field types to reduce row width).
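
For illustration, the asset-pair side-table mentioned in passing above might look like the sketch below; it would let the bucket rows carry a single small pair id instead of two bigint asset ids. This table does not exist in the PR; the name and columns are assumptions:

-- Hypothetical asset-pair lookup table (not part of this PR) to shrink bucket row width.
CREATE TABLE history_trade_pairs (
    id               serial PRIMARY KEY,
    base_asset_id    bigint NOT NULL,
    counter_asset_id bigint NOT NULL,
    UNIQUE (base_asset_id, counter_asset_id)
);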

@paulbellamy (Contributor, Author) commented:

Thinking about this more, and benchmarking a bit more, I think we should not do the jsonb thing for now. The faster query times will take more load off the DB, and hard-drives are cheap(ish). Let's opt for the ~1600MB disk usage, and fastest query times.

Because all the data is available, if/when disk usage later becomes an issue we can reconsider a move to jsonb (via db migration, no reingest needed), with more knowledge about the usage patterns.

@paulbellamy force-pushed the paulb/trade_aggregation_precompute branch from b129046 to 2cb5733 on June 16, 2021 18:37
Paul Bellamy added 2 commits June 16, 2021 19:39
- Just calls the ~same thing once-per-insert-batch from go
- Unknowns
  - Does it actually work?
  - Is the "in" query for buckets efficient, or would a range be better?
  - Is full-rebuild of buckets efficient, or should we do incremental?
    Seems safer if it works.
@paulbellamy force-pushed the paulb/trade_aggregation_precompute branch from 2cb5733 to e5c3966 on June 16, 2021 18:39
@paulbellamy force-pushed the paulb/trade_aggregation_precompute branch from 9a011d2 to 6e35397 on June 25, 2021 15:33
@@ -753,6 +760,11 @@ func (h reingestHistoryRangeState) run(s *system) (transition, error) {
}
}

err = s.historyQ.RebuildTradeAggregationBuckets(s.ctx, h.fromLedger, h.toLedger)
@paulbellamy (Contributor, Author) commented:

Note, running this once at the end of the reingestion means that the aggregations will be out-of-date until the reingestion finishes.

@paulbellamy (Contributor, Author) commented:

We could decide we're fine with that, or do some detection and rebuild it after each minute chunk. Running it once at the end is way more efficient though, as it can just load the whole range of trades at once (instead of loading each minute's).

@paulbellamy (Contributor, Author) commented Jun 29, 2021:

Per benchmarking with @jacekn, it seems the previous slowness was caused by a missing index for the DeleteRangeAll call. If we're concerned about consistency during reingestion, we can retry rebuilding the buckets after each ledger and time that.

@paulbellamy changed the title from "[WIP] services/horizon: Precompute 1m Trade Aggregation buckets" to "services/horizon: Precompute 1m Trade Aggregation buckets" on Jun 28, 2021
@paulbellamy marked this pull request as ready for review on June 28, 2021 16:23
@paulbellamy force-pushed the paulb/trade_aggregation_precompute branch from 5229099 to cdd97b4 on June 29, 2021 11:39
@paulbellamy force-pushed the paulb/trade_aggregation_precompute branch from cdd97b4 to 25ac86c on June 29, 2021 11:44
@bartekn (Contributor) left a comment:

OK, it looks good to me! I added one major comment about the delete part of RebuildTradeAggregationBuckets and a few minor comments. I know you checked the speed of the new approach (which looks really good!) but have you checked the correctness? I think what we could do is: deploy your branch to stg, rebuild the buckets, then grab as many /trade_aggregation requests from the access log as possible and use horizon-cmp to check that the stable release returns the same data.

go.list (review thread resolved)
services/horizon/internal/db2/history/main.go (review thread resolved; outdated)
FROM htrd
GROUP by base_asset_id, counter_asset_id, timestamp
ORDER BY timestamp
);
@bartekn (Contributor) commented:

I wonder if we should remove the recalculation from the DB migration. Not everyone is using trade aggregations, and not everyone can afford 9 minutes of downtime (which I guess can be even slower on lower-spec machines/disks). What about leaving a query that creates the table here, but only recalculating when someone explicitly runs horizon db reingest ...? There are obviously disadvantages to this, so I'd like to discuss it first.

@paulbellamy (Contributor, Author) commented:

Would this actually need to cause downtime? I guess so, as the table locks would conflict with ingestion... 🤔

@paulbellamy (Contributor, Author) commented:

After actually deploying this and using it, I think I agree. Having the migrations take a long time is a pain, as is maintaining the query in 2 places. Let's remove it from the migration.

services/horizon/internal/db2/history/trade_aggregation.go (review thread resolved; outdated)
services/horizon/internal/db2/history/trade_aggregation.go (review thread resolved; outdated)
services/horizon/internal/ingest/fsm.go (review thread resolved)
Comment on lines 249 to 252
_, err := q.Exec(ctx, sq.Delete("history_trades_60000").Where(
    sq.GtOrEq{"open_ledger": fromLedger},
    sq.Lt{"open_ledger": toLedger},
))
@bartekn (Contributor) commented:

I think something is wrong here, or I misunderstand what history_trades_60000 holds. Given that history_trades_60000 holds one-minute trade aggregation buckets, check this chart:

---|-------------------|----------
   O      |            C      |
          |                   |
          X                   Y

The line at the top is one of the already-aggregated buckets in the DB; it starts at ledger O (open) and ends at C (close). Then we run a reingest of ledgers X-Y. What we want is to delete the buckets starting at O and at C, and I think the highlighted code doesn't do that.

@paulbellamy (Contributor, Author) commented Jul 2, 2021:

You are right, yeah. We actually need to find the first and last bucket timestamps covered by the given ledger range, not filter on the raw ledger values; otherwise we would only include partial data when rebuilding. Nice catch, thanks.
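
A sketch of the corrected cleanup: resolve the reingested ledger range to bucket timestamps first, then drop every bucket in that timestamp range so it can be rebuilt in full. How trades are matched to the ledger range is elided; the real query in the PR (built with squirrel in Go) differs:

-- Hypothetical sketch; the actual Go/squirrel implementation in the PR differs.
WITH bounds AS (
  SELECT
    min((extract(epoch FROM ledger_closed_at)::bigint * 1000 / 60000) * 60000) AS first_ts,
    max((extract(epoch FROM ledger_closed_at)::bigint * 1000 / 60000) * 60000) AS last_ts
  FROM history_trades
  WHERE TRUE  -- placeholder: restrict to trades in the reingested ledger range [fromLedger, toLedger]
)
DELETE FROM history_trades_60000 h
USING bounds b
WHERE h."timestamp" BETWEEN b.first_ts AND b.last_ts;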

services/horizon/internal/db2/history/trade_aggregation.go (review thread resolved; outdated)
services/horizon/internal/db2/history/trade_aggregation.go (review thread resolved; outdated)
services/horizon/internal/db2/history/trade_aggregation.go (review thread resolved; outdated)
@paulbellamy force-pushed the paulb/trade_aggregation_precompute branch from 05d750f to d3b4695 on July 2, 2021 15:52
@paulbellamy force-pushed the paulb/trade_aggregation_precompute branch from d3b4695 to 41cf9fe on July 2, 2021 16:29
@paulbellamy (Contributor, Author) commented:

Note: when ingesting, this failed with:

error rebuilding trade aggregations: could not rebuild trade aggregation bucket: exec failed: pq: duplicate key value violates unique constraint "history_trades_60000_pkey"

So, we are not clearing out the correct old buckets.

@paulbellamy merged commit d336609 into stellar:master on Jul 6, 2021
@paulbellamy deleted the paulb/trade_aggregation_precompute branch on July 6, 2021 12:12
Linked issue: Optimize trade aggregations query (#632)