
Implement inserting transactions into a forming block #131

Closed
pgebal opened this issue Sep 17, 2020 · 20 comments


pgebal commented Sep 17, 2020

References to the `forming` state: #121 (comment)

Inserting a transaction has to return the block number and tx index.

Specs to come.

@pgebal pgebal changed the title from Handle inserting transactions to Implement inserting transactions into a forming block Sep 17, 2020

pgebal commented Sep 17, 2020

If we add a transaction counter to the :blocks table and read and update it for each new transaction, or if we count the transactions via COUNT, we're going to have serialization failures. Is there a way in ecto or postgres to have an auto-incrementing counter that keeps track of the number of transactions for each block? @mederic-p

@pgebal pgebal self-assigned this Sep 17, 2020
@mederic-p

Would it work to simply add an `UPDATE BLOCK SET tx_count=tx_count+1 WHERE ...` to our multi?
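
A minimal sketch of what that could look like with Ecto.Multi, assuming a Block schema with a tx_count column; the function and step names are illustrative, not the actual childchain code:

alias Ecto.Multi
import Ecto.Query

def insert_tx_multi(block_id, tx_changeset) do
  Multi.new()
  # Bump the counter in one atomic UPDATE; Postgres row-locks the block row,
  # so concurrent multis queue up rather than clobber each other.
  |> Multi.update_all(
    :increment_tx_count,
    from(b in Block, where: b.id == ^block_id),
    inc: [tx_count: 1]
  )
  |> Multi.insert(:transaction, tx_changeset)
end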


pgebal commented Sep 18, 2020

It would work, but I believe we'll have transaction serialization failures for some requests. I haven't checked how it works in practice, but please take a look at the Serializable Isolation Level section of https://www.postgresql.org/docs/9.1/transaction-iso.html. I believe that if we have a couple of those updates running concurrently, some of them will fail. But maybe I don't understand it well enough.
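
For illustration (not project code), this is the failure mode being described, sketched with a schemaless Ecto query against the :blocks table:

import Ecto.Query

# Two processes running this concurrently: under the default READ COMMITTED
# level the second UPDATE just waits and re-evaluates, but under SERIALIZABLE
# (or REPEATABLE READ) one of them can fail with
# "could not serialize access due to concurrent update" (SQLSTATE 40001)
# and has to be retried.
Repo.transaction(fn ->
  Repo.query!("SET TRANSACTION ISOLATION LEVEL SERIALIZABLE")
  Repo.update_all(from(b in "blocks", where: b.state == "forming"), inc: [tx_count: 1])
end)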

@mederic-p

If we put all the changes in a multi (it should be like this anyway), then it shouldn't be an issue, right?


pgebal commented Sep 18, 2020

@mederic-p The DB will be consistent for sure. I thought we might run into serialization failures because of db transaction ordering issues.
In the db transaction we have to:

  1. Update txcount in :blocks.
  2. Read txcount from :blocks.
  3. Set it as txindex for the incoming child chain transaction.

I think the read in 2 might cause serialization problems for concurrent db transactions, meaning only one will execute and the other will roll back.

But I might be totally wrong. I'll just code it and show you the code.
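
A literal sketch of those three steps in one multi (all names hypothetical, including Transaction.changeset/2), to make the concern concrete; the read-back in step 2 is the part that can conflict between concurrent serializable transactions:

alias Ecto.Multi
import Ecto.Query

def insert_child_chain_tx(block_id, tx_attrs) do
  Multi.new()
  # 1. Update txcount in :blocks.
  |> Multi.update_all(:bump, from(b in Block, where: b.id == ^block_id), inc: [txcount: 1])
  |> Multi.run(:transaction, fn repo, _changes ->
    # 2. Read txcount back...
    %Block{txcount: count} = repo.get!(Block, block_id)
    # 3. ...and set it as the tx_index of the incoming transaction.
    repo.insert(Transaction.changeset(%Transaction{}, Map.put(tx_attrs, :tx_index, count - 1)))
  end)
end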


mederic-p commented Sep 18, 2020


pgebal commented Sep 18, 2020

That would be perfect. I'll try to do it.


pgebal commented Sep 23, 2020

Some thoughts on calculating the transaction index:
To have a transaction index for a submitted transaction, we have to know the order of the transactions included in the currently forming block.
That problem is hard to solve because of concurrent database transactions.
I decided to go for a solution that uses a lock. In each db transaction that inserts a child-chain transaction into the database, we lock (using SELECT ... FOR UPDATE) on the currently forming block.
That introduces a performance bottleneck, but I don't see any other way to do it if we want to stay consistent with the old API.
A raw draft (not for review) is here: https://github.com/omgnetwork/childchain/pull/133/files
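
The lock described here boils down to something like the following (a sketch, assuming a Block schema with a state column; it matches the FOR UPDATE query visible in the logs below):

import Ecto.Query

# Every inserting db transaction takes this row lock first, so concurrent
# inserts into the forming block are serialized by Postgres rather than
# rejected with serialization failures.
forming_block =
  Repo.one(
    from(b in Block,
      where: b.state == "forming",
      lock: "FOR UPDATE"
    )
  )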


pgebal commented Sep 29, 2020

When implementing this I ran into performance issues.
Inserting 15 000 transactions without this change takes 8s.
Inserting 15 000 transactions with this change implemented takes 130s.
All durations are from my PC.
The SQL reported by the Ecto logs is the following:

begin []⋅
[debug] module=Ecto.Adapters.SQL function=log/4 ⋅QUERY OK source="blocks" db=0.3ms
SELECT b0."id", b0."hash", b0."state", b0."nonce", b0."blknum", b0."txcount", b0."tx_hash", b0."formed_at_ethereum_height", b0."submitted_at_ethereum_height", b0."gas", b0."attempts_counter", b0."inserted_at", b0."updated_at", b0."node_inserted_at", b0."node_updated_at" FROM "blocks" AS b0 WHERE (b0."state" = $1) FOR UPDATE ["forming"]⋅
[debug] module=Ecto.Adapters.SQL function=log/4 ⋅QUERY OK source="transactions" db=0.4ms
SELECT max(t0."tx_index") FROM "transactions" AS t0 WHERE (t0."block_id" = $1) [1]⋅
[debug] module=Ecto.Adapters.SQL function=log/4 ⋅QUERY OK db=0.6ms
INSERT INTO "transactions" ("block_id","tx_bytes","tx_hash","tx_index","tx_type","node_inserted_at","node_updated_at") VALUES ($1,$2,$3,$4,$5,$6,$7) RETURNING "id" [1, <<248, 185, 248, 67, 184, 65, 34, 109, 90, 164, 172, 129, 39, 183, 70, 120, 39, 219, 215, 91, 178, 153, 46, 219, 242, 81, 43, 252, 249, 206, 244, 153, 113, 81, 245, 51, 1, 65, 99, 31, 169, 155, 240, 116, 133, 204, 159, 18, ...>>, <<234, 216, 89, 121, 16, 159, 184, 21, 48, 57, 42, 76, 202, 54, 203, 123, 17, 47, 180, 151, 57, 199, 132, 78, 11, 175, 190, 158, 36, 124, 231, 115>>, 0, 1, ~N[2020-09-29 15:32:01], ~N[2020-09-29 15:32:01]]⋅
[debug] module=Ecto.Adapters.SQL function=log/4 ⋅QUERY OK db=0.2ms
INSERT INTO "outputs" ("creating_transaction_id","output_data","output_type","state","node_inserted_at","node_updated_at") VALUES ($1,$2,$3,$4,$5,$6) RETURNING "id" [1, <<237, 1, 235, 148, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 148, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1>>, 1, "pending", ~N[2020-09-29 15:32:01], ~N[2020-09-29 15:32:01]]⋅
[debug] module=Ecto.Adapters.SQL function=log/4 ⋅QUERY OK db=1.0ms
UPDATE "outputs" SET "spending_transaction_id" = $1, "node_updated_at" = $2 WHERE "id" = $3 [1, ~N[2020-09-29 15:32:01], 1]⋅
[debug] module=Ecto.Adapters.SQL function=log/4 ⋅QUERY OK db=0.1ms
commit []⋅

If I do not lock the block with FOR UPDATE I get more or less the same performance.
The same if I push setting tx_index to database triggers (then I get the following SQL):

begin []⋅
[debug] module=Ecto.Adapters.SQL function=log/4 ⋅QUERY OK source="blocks" db=0.2ms
SELECT b0."id", b0."hash", b0."state", b0."nonce", b0."blknum", b0."txcount", b0."tx_hash", b0."formed_at_ethereum_height", b0."submitted_at_ethereum_height", b0."gas", b0."attempts_counter", b0."inserted_at", b0."updated_at", b0."node_inserted_at", b0."node_updated_at" FROM "blocks" AS b0 WHERE (b0."state" = $1) FOR UPDATE ["forming"]⋅
[debug] module=Ecto.Adapters.SQL function=log/4 ⋅QUERY OK db=0.9ms
INSERT INTO "transactions" ("block_id","tx_bytes","tx_hash","tx_type","node_inserted_at","node_updated_at") VALUES ($1,$2,$3,$4,$5,$6) RETURNING "id" [1, <<248, 185, 248, 67, 184, 65, 34, 109, 90, 164, 172, 129, 39, 183, 70, 120, 39, 219, 215, 91, 178, 153, 46, 219, 242, 81, 43, 252, 249, 206, 244, 153, 113, 81, 245, 51, 1, 65, 99, 31, 169, 155, 240, 116, 133, 204, 159, 18, ...>>, <<234, 216, 89, 121, 16, 159, 184, 21, 48, 57, 42, 76, 202, 54, 203, 123, 17, 47, 180, 151, 57, 199, 132, 78, 11, 175, 190, 158, 36, 124, 231, 115>>, 1, ~N[2020-09-29 16:05:49], ~N[2020-09-29 16:05:49]]⋅
[debug] module=Ecto.Adapters.SQL function=log/4 ⋅QUERY OK db=0.2ms
INSERT INTO "outputs" ("creating_transaction_id","output_data","output_type","state","node_inserted_at","node_updated_at") VALUES ($1,$2,$3,$4,$5,$6) RETURNING "id" [1, <<237, 1, 235, 148, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 148, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1>>, 1, "pending", ~N[2020-09-29 16:05:49], ~N[2020-09-29 16:05:49]]⋅
[debug] module=Ecto.Adapters.SQL function=log/4 ⋅QUERY OK db=0.6ms
UPDATE "outputs" SET "spending_transaction_id" = $1, "node_updated_at" = $2 WHERE "id" = $3 [1, ~N[2020-09-29 16:05:49], 1]⋅
[debug] module=Ecto.Adapters.SQL function=log/4 ⋅QUERY OK db=0.1ms
commit []⋅

Basically each of these transactions takes a couple of milliseconds, plus there is probably some overhead from using Ecto.Multi.
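
For reference, the trigger variant mentioned above could look roughly like this migration (a hypothetical sketch; table and column names follow the logged SQL). Note that the MAX read inside the trigger still races unless the forming block row stays locked:

defmodule Childchain.Repo.Migrations.AddTxIndexTrigger do
  use Ecto.Migration

  def up do
    execute("""
    CREATE OR REPLACE FUNCTION set_tx_index() RETURNS trigger AS $$
    BEGIN
      -- Next index = current max within the same block, plus one.
      SELECT COALESCE(MAX(tx_index), -1) + 1 INTO NEW.tx_index
      FROM transactions WHERE block_id = NEW.block_id;
      RETURN NEW;
    END;
    $$ LANGUAGE plpgsql;
    """)

    execute("""
    CREATE TRIGGER transactions_tx_index
    BEFORE INSERT ON transactions
    FOR EACH ROW EXECUTE PROCEDURE set_tx_index();
    """)
  end

  def down do
    execute("DROP TRIGGER transactions_tx_index ON transactions;")
    execute("DROP FUNCTION set_tx_index();")
  end
end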

@InoMurko

Perhaps there's a different angle from which this could be done.
Let me first understand your angle :)

There are two options in your #133, afaik:
On each insert (transaction insert) you check if there's a block :forming, and if there is one, you use its blknum.
On each insert (transaction insert) you check if there's a block :forming, and if there is none, you insert a plasma block and use its blknum.

Correct?


pgebal commented Sep 29, 2020

Almost.
What you said, plus:
if the :forming block exceeds the transaction limit, I have to insert a plasma block and use its blknum.

@InoMurko

Okay, so there's another step happening after the above, which is insert_block_if_transaction_limit_exceeded/2.
It seals a block after the previous step in case we reached the max number of transactions in a block.

Right? It seems to me that the previous step and this one (insert_block_if_transaction_limit_exceeded) could be just one step. For example, get_forming_block_for_update could also check whether there's room for the incoming transaction (if not, it creates a new block, just like when there's no :forming block). And then the full block's state change towards pending_submission could be done async by a GenServer process.
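
A sketch of the merged step (get_forming_block_for_update is the name from #133; insert_new_forming_block and max_txs_per_block are hypothetical): fetch-with-lock and the room check happen in one place, creating a fresh block when the current one is missing or full:

import Ecto.Query

defp get_forming_block_for_update(repo, max_txs_per_block) do
  case repo.one(from(b in Block, where: b.state == "forming", lock: "FOR UPDATE")) do
    # No forming block at all: start a new one.
    nil -> insert_new_forming_block(repo)
    # Forming block is already full: start a new one, just like above.
    %Block{txcount: count} when count >= max_txs_per_block -> insert_new_forming_block(repo)
    # Otherwise reuse the locked forming block.
    block -> {:ok, block}
  end
end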


pgebal commented Sep 29, 2020

I'll check tomorrow if I can merge those two steps; that's a good idea, let's see how it works.
When it comes to changing the block's state to :pending_submission, I don't think we'll gain a lot there; it only happens once a block is full, which is once every ~65k transactions. But you are right that it shouldn't be done while inserting a transaction anyway (at least not calculating the block hash).


InoMurko commented Sep 29, 2020

It's true the step into :pending_submission does not happen on every insert, but the select read does. Let's see what it does perf-wise.

last_tx_index = repo.one(from(t in Transaction, where: t.block_id == ^block.id, select: max(t.tx_index))) || -1


pgebal commented Sep 30, 2020

Thanks Ino.
That select didn't change a lot. I stripped the SQL generated by Ecto down to

begin []⋅
module=Ecto.Adapters.SQL function=log/4 ⋅QUERY OK source="blocks" db=0.1ms
SELECT b0."id", b0."hash", b0."state", b0."nonce", b0."blknum", b0."txcount", b0."tx_hash", b0."formed_at_ethereum_height", b0."submitted_at_ethereum_height", b0."gas", b0."attempts_counter", b0."inserted_at", b0."updated_at", b0."node_inserted_at", b0."node_updated_at" FROM "blocks" AS b0 WHERE (b0."state" = $1) ["forming"]⋅
module=Ecto.Adapters.SQL function=log/4 ⋅QUERY OK db=0.2ms
INSERT INTO "transactions" ("block_id","tx_bytes","tx_hash","tx_index","tx_type","node_inserted_at","node_updated_at") VALUES ($1,$2,$3,$4,$5,$6,$7) RETURNING "id" [1, <<248, 185, 248, 67, 184, 65, 71, 208, 106, 12, 4, 232, 38, 117, 107, 86, 190, 59, 158, 103, 176, 245, 28, 222, 48, 196, 122, 254, 169, 41, 17, 205, 171, 188, 1, 234, 87, 41, 46, 6, 144, 165, 227, 108, 194, 116, 142, 231, ...>>, <<42, 213, 252, 87, 113, 111, 235, 54, 190, 173, 160, 80, 173, 55, 221, 243, 190, 123, 27, 36, 133, 215, 151, 218, 7, 112, 84, 36, 20, 171, 73, 19>>, 1, 1, ~N[2020-09-30 15:45:48], ~N[2020-09-30 15:45:48]]⋅
module=Ecto.Adapters.SQL function=log/4 ⋅QUERY OK db=0.2ms
INSERT INTO "outputs" ("creating_transaction_id","output_data","output_type","state","node_inserted_at","node_updated_at") VALUES ($1,$2,$3,$4,$5,$6) RETURNING "id" [2, <<237, 1, 235, 148, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 148, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1>>, 1, "pending", ~N[2020-09-30 15:45:48], ~N[2020-09-30 15:45:48]]⋅
module=Ecto.Adapters.SQL function=log/4 ⋅QUERY OK db=0.2ms
UPDATE "outputs" SET "spending_transaction_id" = $1, "node_updated_at" = $2 WHERE "id" = $3 [2, ~N[2020-09-30 15:45:48], 3]⋅
module=Ecto.Adapters.SQL function=log/4 ⋅QUERY OK db=0.1ms
commit []

which does not lock and doesn't keep track of tx_indexes (it always inserts 1 as tx_index),
and I'm getting about the same performance (around 10% better) as when using SELECT ... FOR UPDATE and SELECT MAX().
Currently I'm at 500-600 db transactions per second.

@boolafish

Just an idea. Reading the thread/discussion here, it seems the performance is bottlenecked by the tx index incrementation (right?). What if we make tx index flagging async? We first write the tx to the DB without an index, and then wait for the tx index to be batch-flagged by another process. That reduces the need for the lock to however often that process batch-flags the tx indexes.

The API call can still be synchronous, I think. So the flow would be:

  1. Write the tx to the DB.
  2. Wait for the tx to get its index.
  3. Return the data.

The process that batch-flags the indexes probably needs to run every X ms, where X is not too big (so the API call can return the data without a long wait); we could try 50 or 100 ms +- some jitter (to avoid 2 childchain instances always trying to flag at the same time) and see if it helps.
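
A rough sketch of such a batch-flagging process (all module, schema, and field names are hypothetical): every interval it locks the forming block once, numbers the still-unindexed transactions in insertion order, and commits, so the lock is taken once per batch instead of once per submit:

defmodule Childchain.TxIndexFlagger do
  use GenServer
  import Ecto.Query

  @interval_ms 100

  def start_link(opts), do: GenServer.start_link(__MODULE__, opts, name: __MODULE__)

  def init(state) do
    schedule()
    {:ok, state}
  end

  def handle_info(:flag, state) do
    flag_pending()
    schedule()
    {:noreply, state}
  end

  # Jitter so two childchain instances don't always flag at the same time.
  defp schedule, do: Process.send_after(self(), :flag, @interval_ms + :rand.uniform(20))

  defp flag_pending do
    Repo.transaction(fn ->
      with %Block{} = block <-
             Repo.one(from(b in Block, where: b.state == "forming", lock: "FOR UPDATE")) do
        # Continue numbering from the highest index assigned so far.
        last = Repo.one(from(t in Transaction, where: t.block_id == ^block.id, select: max(t.tx_index))) || -1

        from(t in Transaction, where: t.block_id == ^block.id and is_nil(t.tx_index), order_by: t.id, select: t.id)
        |> Repo.all()
        |> Enum.with_index(last + 1)
        |> Enum.each(fn {id, idx} ->
          Repo.update_all(from(t in Transaction, where: t.id == ^id), set: [tx_index: idx])
        end)
      end
    end)
  end
end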

@mederic-p

Thanks @boolafish for the solution. I'm not totally sure that locking + txindex is indeed the bottleneck; to me it feels like postgres is the bottleneck. However, if @pgebal could try a POC of your solution and see how it performs, that would be great!

@boolafish

lol, I didn't read that @pgebal mentioned: "always inserts 1 as tx_index". But in general, what I mean is that we can probably batch-update everything this PR is aiming to do on the tx instead. Then we can hope it is a bit more efficient, by locking fewer things and batch-updating stuff.


pgebal commented Oct 1, 2020

Thanks Andy. Yes, I think we will eventually have to go with doing it async. That introduces a trade-off: requests will take longer, and the job that determines tx_indexes and the block number will possibly still have to lock on some db row. But we won't know how it works until we try it.

To move forward with the development of the childchain, Mederic and I decided to merge the version that handles around 500-600 submits per second and properly handles concurrent transactions.

@InoMurko InoMurko closed this as completed Oct 1, 2020
@InoMurko InoMurko reopened this Oct 1, 2020

pgebal commented Oct 1, 2020

Just to keep everything in context: when we use a single transaction insert, as currently in master, we can handle 5 times more transactions. On my local setup that is about 2500 per second, compared to 500-600.

@InoMurko InoMurko closed this as completed Dec 2, 2020