Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

services/horizon: chained transactions are delayed #4062

Closed
leighmcculloch opened this issue Nov 10, 2021 · 3 comments
Closed

services/horizon: chained transactions are delayed #4062

leighmcculloch opened this issue Nov 10, 2021 · 3 comments
Labels

Comments

@leighmcculloch
Copy link
Member

leighmcculloch commented Nov 10, 2021

It appears that chained transactions for a single are delayed when those transactions are submitted to Horizon. A set of transactions will consistently have the first transaction in the next ledger, while all subsequence transactions are delayed for the subsequent ledgers. The same transaction set submitted directly to Core are consistently included in the immediate next ledger, and are not delayed.

Both @tyvdh and I have experienced this using different clients which suggests this is not an issue with any particular client but an issue with Horizon's transaction submission system.

The impact of this is that it is much slower to submit a chain of transactions to Horizon. On networks like testnet and pubnet the issue is exasperated by other traffic and transactions will take much longer to confirm than if submitted in isolation. On private networks, like running a standalone network with no other traffic, the problem is still visible although only the first transaction in a set will typically be delayed.

I've written a tool that demonstrates this. The tool is in this PR #4061 for anyone who wishes to replicate the issue. The output included in this issue was generated by this tool.

You can get the tool by running:

$ git clone https://github.com/stellar/go
$ cd go
$ git fetch origin pull/4061/head:pr4061
$ git checkout pr4061
$ cd tools/chained-tx-test

Running against testnet and submitting transactions 100ms apart we see the first transaction is placed in the next ledger while other transactions are spread out over subsequent ledgers taking up to 30 seconds to be submitted:

$ go run . -horizon=https://horizon-testnet.stellar.org
11:18:47.323463 horizon url: https://horizon-testnet.stellar.org
11:18:47.794939 network passphrase: Test SDF Network ; September 2015
11:18:47.796873 account: GAQZATRVG5DN3OY5MV5RVVHHLMJDFTCKKKD4D7KMUWXHNOOMQKXRZAED
11:18:47.796879 funding...
11:18:53.016183 funded and has seq num: 3983526332465152
11:18:53.016216 creating 10 accounts...
11:18:53.016710 tx 0 seq=3983526332465153 h=e65254... pending
11:18:53.121371 tx 1 seq=3983526332465154 h=b40d92... pending
11:18:53.223023 tx 2 seq=3983526332465155 h=fe429c... pending
11:18:53.323301 tx 3 seq=3983526332465156 h=563c77... pending
11:18:53.424641 tx 4 seq=3983526332465157 h=4698a3... pending
11:18:53.527014 tx 5 seq=3983526332465158 h=1b9d8a... pending
11:18:53.627058 tx 6 seq=3983526332465159 h=d3de7f... pending
11:18:53.727756 tx 7 seq=3983526332465160 h=7bbcf0... pending
11:18:53.829136 tx 8 seq=3983526332465161 h=dbd3dc... pending
11:18:53.929438 tx 9 seq=3983526332465162 h=f40cd1... pending
11:18:58.805557 tx 0 seq=3983526332465153 h=e65254... ✅ ledger=927488 dur=5.788701965s
11:19:04.005921 tx 1 seq=3983526332465154 h=b40d92... ✅ ledger=927489 dur=10.884339058s
11:19:08.807089 tx 2 seq=3983526332465155 h=fe429c... ✅ ledger=927490 dur=15.583784475s
11:19:08.807256 tx 3 seq=3983526332465156 h=563c77... ✅ ledger=927490 dur=15.483630698s
11:19:14.009644 tx 4 seq=3983526332465157 h=4698a3... ✅ ledger=927491 dur=20.584651898s
11:19:18.822017 tx 7 seq=3983526332465160 h=7bbcf0... ✅ ledger=927492 dur=25.093852507s
11:19:18.822034 tx 8 seq=3983526332465161 h=dbd3dc... ✅ ledger=927492 dur=24.992476585s
11:19:18.822855 tx 6 seq=3983526332465159 h=d3de7f... ✅ ledger=927492 dur=25.195344876s
11:19:18.822899 tx 5 seq=3983526332465158 h=1b9d8a... ✅ ledger=927492 dur=25.29545326s
11:19:24.009172 tx 9 seq=3983526332465162 h=f40cd1... ✅ ledger=927493 dur=30.079230375s
11:19:24.009220 done

Running against a local standalone private network we see the first transaction is placed in the next ledger while other transactions are placed in the subsequent ledger:

$ docker run --rm -it -p 8000:8000 --name stellar stellar/quickstart:latest --standalone
$ go run . -horizon=http://localhost:8000
11:20:58.679428 horizon url: http://localhost:8000
11:20:58.689386 network passphrase: Standalone Network ; February 2017
11:20:58.690416 account: GAOJSIEWSNCFQX2FFMTQRZWXYTWOHQCVVORFGC53KCLABEXRCQGVJ72W
11:20:58.690423 funding...
11:21:06.614653 funded and has seq num: 2430951489536
11:21:06.614679 creating 10 accounts...
11:21:06.615085 tx 0 seq=2430951489537 h=83edfe... pending
11:21:06.716899 tx 1 seq=2430951489538 h=b3c0b2... pending
11:21:06.820419 tx 2 seq=2430951489539 h=bdaf64... pending
11:21:06.925293 tx 3 seq=2430951489540 h=32e210... pending
11:21:07.028498 tx 4 seq=2430951489541 h=23f081... pending
11:21:07.132353 tx 5 seq=2430951489542 h=d9f940... pending
11:21:07.235280 tx 6 seq=2430951489543 h=39b627... pending
11:21:07.339963 tx 7 seq=2430951489544 h=bcdd67... pending
11:21:07.442684 tx 8 seq=2430951489545 h=b56f0b... pending
11:21:07.544036 tx 9 seq=2430951489546 h=b43214... pending
11:21:10.633275 tx 0 seq=2430951489537 h=83edfe... ✅ ledger=568 dur=4.018096779s
11:21:15.635641 tx 6 seq=2430951489543 h=39b627... ✅ ledger=569 dur=8.400204001s
11:21:15.637085 tx 9 seq=2430951489546 h=b43214... ✅ ledger=569 dur=8.092889392s
11:21:15.638578 tx 1 seq=2430951489538 h=b3c0b2... ✅ ledger=569 dur=8.921521753s
11:21:15.639242 tx 7 seq=2430951489544 h=bcdd67... ✅ ledger=569 dur=8.299091072s
11:21:15.639781 tx 5 seq=2430951489542 h=d9f940... ✅ ledger=569 dur=8.507260078s
11:21:15.640499 tx 4 seq=2430951489541 h=23f081... ✅ ledger=569 dur=8.611844039s
11:21:15.640755 tx 2 seq=2430951489539 h=bdaf64... ✅ ledger=569 dur=8.820176524s
11:21:15.641414 tx 3 seq=2430951489540 h=32e210... ✅ ledger=569 dur=8.715968164s
11:21:15.644040 tx 8 seq=2430951489545 h=b56f0b... ✅ ledger=569 dur=8.201193142s
11:21:15.644056 done

Running against a local standalone private network and submitting directly to Core we see the transactions get accepted into the same ledger:

$ docker run --rm -it -p 8000:8000 -p 11626:11626 --name stellar stellar/quickstart:latest --standalone
$ go run . -horizon=http://localhost:8000 -core=http://localhost:11626
11:22:54.374930 horizon url: http://localhost:8000
11:22:54.383355 network passphrase: Standalone Network ; February 2017
11:22:54.384358 account: GDN3QCFBAMP74SVI6JY43VNIHBSVHCU6YYGJ5IIRQMZRCK22VNB7NQFS
11:22:54.384363 funding...
11:23:00.615094 funded and has seq num: 2529735737344
11:23:00.615113 creating 10 accounts...
11:23:00.615344 tx 0 seq=2529735737345 h=0208663... pending
11:23:00.719301 tx 1 seq=2529735737346 h=8b27560... pending
11:23:00.822780 tx 2 seq=2529735737347 h=8c4a2e8... pending
11:23:00.926400 tx 3 seq=2529735737348 h=47c915d... pending
11:23:01.030042 tx 4 seq=2529735737349 h=1bf4f9d... pending
11:23:01.133891 tx 5 seq=2529735737350 h=362af9f... pending
11:23:01.233806 tx 6 seq=2529735737351 h=01ff8b7... pending
11:23:01.337935 tx 7 seq=2529735737352 h=fd3a692... pending
11:23:01.439168 tx 8 seq=2529735737353 h=797acc9... pending
11:23:01.542718 tx 9 seq=2529735737354 h=021de94... pending
11:23:05.821111 tx 6 seq=2529735737351 h=01ff8b7... ✅ ledger=591 dur=4.587181096s
11:23:05.822817 tx 1 seq=2529735737346 h=8b27560... ✅ ledger=591 dur=5.10341948s
11:23:05.922420 tx 7 seq=2529735737352 h=fd3a692... ✅ ledger=591 dur=4.584385657s
11:23:05.923382 tx 2 seq=2529735737347 h=8c4a2e8... ✅ ledger=591 dur=5.100507882s
11:23:06.021584 tx 3 seq=2529735737348 h=47c915d... ✅ ledger=591 dur=5.095068062s
11:23:06.021863 tx 8 seq=2529735737353 h=797acc9... ✅ ledger=591 dur=4.582604558s
11:23:06.124386 tx 4 seq=2529735737349 h=1bf4f9d... ✅ ledger=591 dur=5.094224208s
11:23:06.124837 tx 9 seq=2529735737354 h=021de94... ✅ ledger=591 dur=4.582023853s
11:23:06.218809 tx 5 seq=2529735737350 h=362af9f... ✅ ledger=591 dur=5.084793918s
11:23:06.219718 tx 0 seq=2529735737345 h=0208663... ✅ ledger=591 dur=5.604272976s
11:23:06.219732 done

Note the output has some fields truncated to fit them on screen.

cc @tyvdh @ire-and-curses @stellar/horizon-committers

Related Slack conversation: https://stellarfoundation.slack.com/archives/C9H5JTW1J/p1636559850127000

@ire-and-curses
Copy link
Member

Potentially related: #2628

@bartekn bartekn self-assigned this Nov 25, 2021
@bartekn
Copy link
Contributor

bartekn commented Nov 25, 2021

I debugged the issue and I understand what's going on.

The submission list per account is maintained in Horizon instance which served the POST request. If all submissions reach a single instance in a cluster it has a full view of the list and submits pending all transactions to Stellar-Core. However, if submissions reach different clusters Horizon rely on ingestion to update sequence numbers in ledgers which triggers submissions. This explains why a standalone Horizon process all transactions fast.

I used @leighmcculloch's tool to submit txs to horizon-testnet.stellar.org and I later checked the logs. Here you can see that transactions reached two separate instances and if an instance received subsequent tx seqnums it submitted them to Stellar-Core right away:

Core submission time Hash Host
21:01:54.389 9b66854ac17e8acb4df1e558a42e52a9b31d84c93fd76b882f1f8785fc5d1736 001a
21:01:59.177 5edf03527968426aa67534e28eeb6ba2d63906983ed4cd88c969c1a918fcb54f 001a
21:02:05.395 a6f574ca1376ee62b37901d8a4ac9cc52b8b5ef5d7b6c6934c44076007d41d0c 002c
21:02:05.396 04b8c7416cb40e374d8016af6b6f5dc3707c92f2c8425869a0fa30b6ba358bd8 002c
21:02:05.396 fdaf09377d75de805218398733d5916098626057efb2b1eb1373fa7e88483cad 002c
21:02:10.178 c33e78df0c52fd85b7fad6d092031ea2303177f6c1b73674f446605698c32c88 001a
21:02:15.397 197a124366b78876771ed04ce38ea3a2b6a0201c853f10a3b8afc05124634eb6 002c
21:02:20.179 1b2b5723cdeebf22d992c2bdfc9132f408c3bd54aa36293c58b808e4d81c727a 001a
21:02:20.180 f46574a139690121de87a7a278ce47e2d7afa0361b82bb89ac2349f9ae2f19fc 001a
21:02:26.394 19c4735ce621a779086eb3c7df2b20bb457960ba1d009273d3cad70906becc43 002c

Tool output:

2021/11/25 21:01:59.303598 tx 0 seq=5044945895292929 h=9b66854ac17e8acb4df1e558a42e52a9b31d84c93fd76b882f1f8785fc5d1736 ✅ ledger=1174619 dur=5.00450396s
2021/11/25 21:02:05.243979 tx 1 seq=5044945895292930 h=5edf03527968426aa67534e28eeb6ba2d63906983ed4cd88c969c1a918fcb54f ✅ ledger=1174620 dur=10.841169498s
2021/11/25 21:02:10.420779 tx 4 seq=5044945895292933 h=fdaf09377d75de805218398733d5916098626057efb2b1eb1373fa7e88483cad ✅ ledger=1174621 dur=15.710910441s
2021/11/25 21:02:10.421340 tx 2 seq=5044945895292931 h=a6f574ca1376ee62b37901d8a4ac9cc52b8b5ef5d7b6c6934c44076007d41d0c ✅ ledger=1174621 dur=15.913745634s
2021/11/25 21:02:10.421391 tx 3 seq=5044945895292932 h=04b8c7416cb40e374d8016af6b6f5dc3707c92f2c8425869a0fa30b6ba358bd8 ✅ ledger=1174621 dur=15.812093191s
2021/11/25 21:02:15.205719 tx 5 seq=5044945895292934 h=c33e78df0c52fd85b7fad6d092031ea2303177f6c1b73674f446605698c32c88 ✅ ledger=1174622 dur=20.393711164s
2021/11/25 21:02:20.419193 tx 6 seq=5044945895292935 h=197a124366b78876771ed04ce38ea3a2b6a0201c853f10a3b8afc05124634eb6 ✅ ledger=1174623 dur=25.504030169s
2021/11/25 21:02:26.205623 tx 7 seq=5044945895292936 h=1b2b5723cdeebf22d992c2bdfc9132f408c3bd54aa36293c58b808e4d81c727a ✅ ledger=1174624 dur=31.188321209s
2021/11/25 21:02:26.205711 tx 8 seq=5044945895292937 h=f46574a139690121de87a7a278ce47e2d7afa0361b82bb89ac2349f9ae2f19fc ✅ ledger=1174624 dur=31.08424199s
2021/11/25 21:02:31.458646 tx 9 seq=5044945895292938 h=19c4735ce621a779086eb3c7df2b20bb457960ba1d009273d3cad70906becc43 ✅ ledger=1174625 dur=36.231836455s

As for solutions the only reliable way is to move the list to a DB so that it's maintainer per cluster not per instance. We could set session pinning in a proxy but it relies on cookies so won't always work.

@bartekn bartekn removed their assignment Dec 13, 2021
@mollykarcher
Copy link
Contributor

Closed by #4996

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

4 participants