
Implement inserting transactions into a forming block #131

Closed
pgebal opened this issue Sep 17, 2020 · 20 comments


pgebal commented Sep 17, 2020

References to the `forming` state: #121 (comment)

Inserting a transaction has to return the block number and tx index.

Specs to come.

@pgebal pgebal changed the title from Handle inserting transactions to Implement inserting transactions into a forming block Sep 17, 2020

pgebal commented Sep 17, 2020

If we add a transaction counter to the :blocks table and read and update it for each new transaction, or if we count the transactions via COUNT, we're going to have serialization failures. Is there a way in ecto or postgres to have an auto-incrementing counter that keeps track of the number of transactions for each block? @mederic-p

@pgebal pgebal self-assigned this Sep 17, 2020
@mederic-p

Would it work to simply add an `UPDATE BLOCK SET tx_count=tx_count+1 WHERE ...` to our multi?
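
A minimal sketch of what that could look like with Ecto.Multi, assuming a Block schema with a tx_count column; the function and step names are illustrative, not the actual childchain code:

alias Ecto.Multi
import Ecto.Query

def insert_tx_multi(block_id, tx_changeset) do
  Multi.new()
  # Bump the counter in one atomic UPDATE; Postgres row-locks the block row,
  # so concurrent multis queue up rather than clobber each other.
  |> Multi.update_all(
    :increment_tx_count,
    from(b in Block, where: b.id == ^block_id),
    inc: [tx_count: 1]
  )
  |> Multi.insert(:transaction, tx_changeset)
end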


pgebal commented Sep 18, 2020

It would work, but I believe we'll have transaction serialization failures for some requests. I haven't checked how it works in practice, but please take a look at the Serializable Isolation Level section of https://www.postgresql.org/docs/9.1/transaction-iso.html. I believe that if we have a couple of those updates running concurrently, some of them will fail. But maybe I don't understand it well enough.
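
For illustration (not project code), this is the failure mode being described, sketched with a schemaless Ecto query against the :blocks table:

import Ecto.Query

# Two processes running this concurrently: under the default READ COMMITTED
# level the second UPDATE just waits and re-evaluates, but under SERIALIZABLE
# (or REPEATABLE READ) one of them can fail with
# "could not serialize access due to concurrent update" (SQLSTATE 40001)
# and has to be retried.
Repo.transaction(fn ->
  Repo.query!("SET TRANSACTION ISOLATION LEVEL SERIALIZABLE")
  Repo.update_all(from(b in "blocks", where: b.state == "forming"), inc: [tx_count: 1])
end)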

@mederic-p

If we put all the changes in a multi (it should be like this anyway), then it shouldn't be an issue, right?


pgebal commented Sep 18, 2020

@mederic-p The DB will be consistent for sure. I thought we might run into serialization failures because of db transaction ordering issues.
In the db transaction we have to:

  1. Update txcount in :blocks.
  2. Read txcount from :blocks.
  3. Set it as txindex for the incoming child chain transaction.

I think the read in 2 might cause serialization problems for concurrent db transactions, meaning only one will execute and the other will roll back.

But I might be totally wrong. I'll just code it and show you the code.
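
A literal sketch of those three steps in one multi (all names hypothetical, including Transaction.changeset/2), to make the concern concrete; the read-back in step 2 is the part that can conflict between concurrent serializable transactions:

alias Ecto.Multi
import Ecto.Query

def insert_child_chain_tx(block_id, tx_attrs) do
  Multi.new()
  # 1. Update txcount in :blocks.
  |> Multi.update_all(:bump, from(b in Block, where: b.id == ^block_id), inc: [txcount: 1])
  |> Multi.run(:transaction, fn repo, _changes ->
    # 2. Read txcount back...
    %Block{txcount: count} = repo.get!(Block, block_id)
    # 3. ...and set it as the tx_index of the incoming transaction.
    repo.insert(Transaction.changeset(%Transaction{}, Map.put(tx_attrs, :tx_index, count - 1)))
  end)
end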


mederic-p commented Sep 18, 2020


pgebal commented Sep 18, 2020

That would be perfect. I'll try to do it.


pgebal commented Sep 23, 2020

Some thoughts on calculating the transaction index:
To have a transaction index for a submitted transaction, we have to know the order of the transactions included in the currently forming block.
That problem is hard to solve because of concurrent database transactions.
I decided to go for a solution that uses a lock. In each db transaction that inserts a child-chain transaction into the database, we lock (using SELECT ... FOR UPDATE) on the currently forming block.
That introduces a performance bottleneck, but I don't see any other way to do it if we want to stay consistent with the old API.
A raw draft (not for review) is here: https://github.com/omgnetwork/childchain/pull/133/files
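
The lock described here boils down to something like the following (a sketch, assuming a Block schema with a state column; it matches the FOR UPDATE query visible in the logs below):

import Ecto.Query

# Every inserting db transaction takes this row lock first, so concurrent
# inserts into the forming block are serialized by Postgres rather than
# rejected with serialization failures.
forming_block =
  Repo.one(
    from(b in Block,
      where: b.state == "forming",
      lock: "FOR UPDATE"
    )
  )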


pgebal commented Sep 29, 2020

When implementing this I ran into performance issues.
Inserting 15 000 transactions without this change takes 8s.
Inserting 15 000 transactions with this change implemented takes 130s.
All durations are from my PC.
The SQL reported by the Ecto logs is the following:

begin []⋅
[debug] module=Ecto.Adapters.SQL function=log/4 ⋅QUERY OK source="blocks" db=0.3ms
SELECT b0."id", b0."hash", b0."state", b0."nonce", b0."blknum", b0."txcount", b0."tx_hash", b0."formed_at_ethereum_height", b0."submitted_at_ethereum_height", b0."gas", b0."attempts_counter", b0."inserted_at", b0."updated_at", b0."node_inserted_at", b0."node_updated_at" FROM "blocks" AS b0 WHERE (b0."state" = $1) FOR UPDATE ["forming"]⋅
[debug] module=Ecto.Adapters.SQL function=log/4 ⋅QUERY OK source="transactions" db=0.4ms
SELECT max(t0."tx_index") FROM "transactions" AS t0 WHERE (t0."block_id" = $1) [1]⋅
[debug] module=Ecto.Adapters.SQL function=log/4 ⋅QUERY OK db=0.6ms
INSERT INTO "transactions" ("block_id","tx_bytes","tx_hash","tx_index","tx_type","node_inserted_at","node_updated_at") VALUES ($1,$2,$3,$4,$5,$6,$7) RETURNING "id" [1, <<248, 185, 248, 67, 184, 65, 34, 109, 90, 164, 172, 129, 39, 183, 70, 120, 39, 219, 215, 91, 178, 153, 46, 219, 242, 81, 43, 252, 249, 206, 244, 153, 113, 81, 245, 51, 1, 65, 99, 31, 169, 155, 240, 116, 133, 204, 159, 18, ...>>, <<234, 216, 89, 121, 16, 159, 184, 21, 48, 57, 42, 76, 202, 54, 203, 123, 17, 47, 180, 151, 57, 199, 132, 78, 11, 175, 190, 158, 36, 124, 231, 115>>, 0, 1, ~N[2020-09-29 15:32:01], ~N[2020-09-29 15:32:01]]⋅
[debug] module=Ecto.Adapters.SQL function=log/4 ⋅QUERY OK db=0.2ms
INSERT INTO "outputs" ("creating_transaction_id","output_data","output_type","state","node_inserted_at","node_updated_at") VALUES ($1,$2,$3,$4,$5,$6) RETURNING "id" [1, <<237, 1, 235, 148, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 148, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1>>, 1, "pending", ~N[2020-09-29 15:32:01], ~N[2020-09-29 15:32:01]]⋅
[debug] module=Ecto.Adapters.SQL function=log/4 ⋅QUERY OK db=1.0ms
UPDATE "outputs" SET "spending_transaction_id" = $1, "node_updated_at" = $2 WHERE "id" = $3 [1, ~N[2020-09-29 15:32:01], 1]⋅
[debug] module=Ecto.Adapters.SQL function=log/4 ⋅QUERY OK db=0.1ms
commit []⋅

If I do not lock the block with FOR UPDATE I get more or less the same performance.
The same if I push setting tx_index to database triggers (then I get the following SQL):

begin []⋅
[debug] module=Ecto.Adapters.SQL function=log/4 ⋅QUERY OK source="blocks" db=0.2ms
SELECT b0."id", b0."hash", b0."state", b0."nonce", b0."blknum", b0."txcount", b0."tx_hash", b0."formed_at_ethereum_height", b0."submitted_at_ethereum_height", b0."gas", b0."attempts_counter", b0."inserted_at", b0."updated_at", b0."node_inserted_at", b0."node_updated_at" FROM "blocks" AS b0 WHERE (b0."state" = $1) FOR UPDATE ["forming"]⋅
[debug] module=Ecto.Adapters.SQL function=log/4 ⋅QUERY OK db=0.9ms
INSERT INTO "transactions" ("block_id","tx_bytes","tx_hash","tx_type","node_inserted_at","node_updated_at") VALUES ($1,$2,$3,$4,$5,$6) RETURNING "id" [1, <<248, 185, 248, 67, 184, 65, 34, 109, 90, 164, 172, 129, 39, 183, 70, 120, 39, 219, 215, 91, 178, 153, 46, 219, 242, 81, 43, 252, 249, 206, 244, 153, 113, 81, 245, 51, 1, 65, 99, 31, 169, 155, 240, 116, 133, 204, 159, 18, ...>>, <<234, 216, 89, 121, 16, 159, 184, 21, 48, 57, 42, 76, 202, 54, 203, 123, 17, 47, 180, 151, 57, 199, 132, 78, 11, 175, 190, 158, 36, 124, 231, 115>>, 1, ~N[2020-09-29 16:05:49], ~N[2020-09-29 16:05:49]]⋅
[debug] module=Ecto.Adapters.SQL function=log/4 ⋅QUERY OK db=0.2ms
INSERT INTO "outputs" ("creating_transaction_id","output_data","output_type","state","node_inserted_at","node_updated_at") VALUES ($1,$2,$3,$4,$5,$6) RETURNING "id" [1, <<237, 1, 235, 148, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 148, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1>>, 1, "pending", ~N[2020-09-29 16:05:49], ~N[2020-09-29 16:05:49]]⋅
[debug] module=Ecto.Adapters.SQL function=log/4 ⋅QUERY OK db=0.6ms
UPDATE "outputs" SET "spending_transaction_id" = $1, "node_updated_at" = $2 WHERE "id" = $3 [1, ~N[2020-09-29 16:05:49], 1]⋅
[debug] module=Ecto.Adapters.SQL function=log/4 ⋅QUERY OK db=0.1ms
commit []⋅

Basically each of these transactions takes a couple of milliseconds, plus there is probably some overhead from using Ecto.Multi.
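
For reference, the trigger variant mentioned above could look roughly like this migration (a hypothetical sketch; table and column names follow the logged SQL). Note that the MAX read inside the trigger still races unless the forming block row stays locked:

defmodule Childchain.Repo.Migrations.AddTxIndexTrigger do
  use Ecto.Migration

  def up do
    execute("""
    CREATE OR REPLACE FUNCTION set_tx_index() RETURNS trigger AS $$
    BEGIN
      -- Next index = current max within the same block, plus one.
      SELECT COALESCE(MAX(tx_index), -1) + 1 INTO NEW.tx_index
      FROM transactions WHERE block_id = NEW.block_id;
      RETURN NEW;
    END;
    $$ LANGUAGE plpgsql;
    """)

    execute("""
    CREATE TRIGGER transactions_tx_index
    BEFORE INSERT ON transactions
    FOR EACH ROW EXECUTE PROCEDURE set_tx_index();
    """)
  end

  def down do
    execute("DROP TRIGGER transactions_tx_index ON transactions;")
    execute("DROP FUNCTION set_tx_index();")
  end
end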

@InoMurko

Perhaps there's a different angle from which this could be done.
Let me first understand your angle :)

There are two options in your #133, afaik:
On each insert (transaction insert) you check if there's a block :forming, and if there is one, you use its blknum.
On each insert (transaction insert) you check if there's a block :forming, and if there is none, you insert a plasma block and use its blknum.

Correct?


pgebal commented Sep 29, 2020

Almost.
What you said, plus:
if the :forming block exceeds the transaction limit, I have to insert a plasma block and use its blknum.

@InoMurko

Okay, so there's another step happening after the above, which is insert_block_if_transaction_limit_exceeded/2.
It seals a block after the previous step in case we reached the max number of transactions in a block.

Right? It seems to me that the previous step and this one (insert_block_if_transaction_limit_exceeded) could be just one step. For example, get_forming_block_for_update could also check whether there's room for the incoming transaction (if not, it creates a new block, just like when there's no :forming block). And then the full block's state change towards pending_submission could be done async by a GenServer process.
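
A sketch of the merged step (get_forming_block_for_update is the name from #133; insert_new_forming_block and max_txs_per_block are hypothetical): fetch-with-lock and the room check happen in one place, creating a fresh block when the current one is missing or full:

import Ecto.Query

defp get_forming_block_for_update(repo, max_txs_per_block) do
  case repo.one(from(b in Block, where: b.state == "forming", lock: "FOR UPDATE")) do
    # No forming block at all: start a new one.
    nil -> insert_new_forming_block(repo)
    # Forming block is already full: start a new one, just like above.
    %Block{txcount: count} when count >= max_txs_per_block -> insert_new_forming_block(repo)
    # Otherwise reuse the locked forming block.
    block -> {:ok, block}
  end
end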


pgebal commented Sep 29, 2020

I'll check tomorrow if I can merge those two steps; that's a good idea, let's see how it works.
When it comes to changing the block's state to :pending_submission, I don't think we'll gain a lot there; it only happens once a block is full, which is once every ~65k transactions. But you are right that it shouldn't be done while inserting a transaction anyway (at least not calculating the block hash).


InoMurko commented Sep 29, 2020

It's true the step into :pending_submission does not happen on every insert, but the select read does. Let's see what it does perf-wise.

last_tx_index = repo.one(from(t in Transaction, where: t.block_id == ^block.id, select: max(t.tx_index))) || -1


pgebal commented Sep 30, 2020

Thanks Ino.
That select didn't change a lot. I stripped the SQL generated by Ecto down to

begin []⋅
module=Ecto.Adapters.SQL function=log/4 ⋅QUERY OK source="blocks" db=0.1ms
SELECT b0."id", b0."hash", b0."state", b0."nonce", b0."blknum", b0."txcount", b0."tx_hash", b0."formed_at_ethereum_height", b0."submitted_at_ethereum_height", b0."gas", b0."attempts_counter", b0."inserted_at", b0."updated_at", b0."node_inserted_at", b0."node_updated_at" FROM "blocks" AS b0 WHERE (b0."state" = $1) ["forming"]⋅
module=Ecto.Adapters.SQL function=log/4 ⋅QUERY OK db=0.2ms
INSERT INTO "transactions" ("block_id","tx_bytes","tx_hash","tx_index","tx_type","node_inserted_at","node_updated_at") VALUES ($1,$2,$3,$4,$5,$6,$7) RETURNING "id" [1, <<248, 185, 248, 67, 184, 65, 71, 208, 106, 12, 4, 232, 38, 117, 107, 86, 190, 59, 158, 103, 176, 245, 28, 222, 48, 196, 122, 254, 169, 41, 17, 205, 171, 188, 1, 234, 87, 41, 46, 6, 144, 165, 227, 108, 194, 116, 142, 231, ...>>, <<42, 213, 252, 87, 113, 111, 235, 54, 190, 173, 160, 80, 173, 55, 221, 243, 190, 123, 27, 36, 133, 215, 151, 218, 7, 112, 84, 36, 20, 171, 73, 19>>, 1, 1, ~N[2020-09-30 15:45:48], ~N[2020-09-30 15:45:48]]⋅
module=Ecto.Adapters.SQL function=log/4 ⋅QUERY OK db=0.2ms
INSERT INTO "outputs" ("creating_transaction_id","output_data","output_type","state","node_inserted_at","node_updated_at") VALUES ($1,$2,$3,$4,$5,$6) RETURNING "id" [2, <<237, 1, 235, 148, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 148, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1>>, 1, "pending", ~N[2020-09-30 15:45:48], ~N[2020-09-30 15:45:48]]⋅
module=Ecto.Adapters.SQL function=log/4 ⋅QUERY OK db=0.2ms
UPDATE "outputs" SET "spending_transaction_id" = $1, "node_updated_at" = $2 WHERE "id" = $3 [2, ~N[2020-09-30 15:45:48], 3]⋅
module=Ecto.Adapters.SQL function=log/4 ⋅QUERY OK db=0.1ms
commit []

which does not lock and doesn't keep track of tx_indexes (it always inserts 1 as tx_index),
and I'm getting about the same performance (around 10% better) as when using SELECT ... FOR UPDATE and SELECT MAX().
Currently I'm at 500-600 db transactions per second.

@boolafish

Just an idea. Reading the thread/discussion here, it seems the performance is bottlenecked by the tx index incrementation (right?). What if we make tx index flagging async? We first write the tx to the DB without an index, and then wait for the tx index to be batch-flagged by another process. That reduces the need for the lock to however often that process batch-flags the tx indexes.

The API call can still be synchronous, I think. So the flow would be:

  1. Write the tx to the DB.
  2. Wait for the tx to get its index.
  3. Return the data.

The process that batch-flags the indexes probably needs to run every X ms, where X is not too big (so the API call can return the data without a long wait); we could try 50 or 100 ms +- some jitter (to avoid 2 childchain instances always trying to flag at the same time) and see if it helps.
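
A rough sketch of such a batch-flagging process (all module, schema, and field names are hypothetical): every interval it locks the forming block once, numbers the still-unindexed transactions in insertion order, and commits, so the lock is taken once per batch instead of once per submit:

defmodule Childchain.TxIndexFlagger do
  use GenServer
  import Ecto.Query

  @interval_ms 100

  def start_link(opts), do: GenServer.start_link(__MODULE__, opts, name: __MODULE__)

  def init(state) do
    schedule()
    {:ok, state}
  end

  def handle_info(:flag, state) do
    flag_pending()
    schedule()
    {:noreply, state}
  end

  # Jitter so two childchain instances don't always flag at the same time.
  defp schedule, do: Process.send_after(self(), :flag, @interval_ms + :rand.uniform(20))

  defp flag_pending do
    Repo.transaction(fn ->
      with %Block{} = block <-
             Repo.one(from(b in Block, where: b.state == "forming", lock: "FOR UPDATE")) do
        # Continue numbering from the highest index assigned so far.
        last = Repo.one(from(t in Transaction, where: t.block_id == ^block.id, select: max(t.tx_index))) || -1

        from(t in Transaction, where: t.block_id == ^block.id and is_nil(t.tx_index), order_by: t.id, select: t.id)
        |> Repo.all()
        |> Enum.with_index(last + 1)
        |> Enum.each(fn {id, idx} ->
          Repo.update_all(from(t in Transaction, where: t.id == ^id), set: [tx_index: idx])
        end)
      end
    end)
  end
end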

@mederic-p

Thanks @boolafish for the solution. I'm not totally sure that locking + txindex is indeed the bottleneck; to me it feels like postgres is the bottleneck. However, if @pgebal could try a POC of your solution and see how it performs, that would be great!

@boolafish

lol, I didn't read that @pgebal mentioned: "always inserts 1 as tx_index". But in general, what I mean is that we can probably batch-update everything this PR is aiming to do on the tx instead. Then we can hope it is a bit more efficient, by locking fewer things and batch-updating stuff.


pgebal commented Oct 1, 2020

Thanks Andy. Yes, I think we will eventually have to go with doing it async. That introduces a trade-off: requests will take longer, and the job that determines tx_indexes and the block number will possibly still have to lock on some db row. But we won't know how it works until we try it.

To move forward with the development of the childchain, Mederic and I decided to merge the version that handles around 500-600 submits per second and properly handles concurrent transactions.

@InoMurko InoMurko closed this as completed Oct 1, 2020
@InoMurko InoMurko reopened this Oct 1, 2020

pgebal commented Oct 1, 2020

Just to keep everything in context: when we use a single transaction insert, as currently in master, we can handle 5 times more transactions. On my local setup that is about 2500 per second, compared to 500-600.

@InoMurko InoMurko closed this as completed Dec 2, 2020