Horizon returned bad sequence error although the transaction was added to ledger #2628
Comments
Thanks for the report. I can't reproduce it with the latest version (e.g. I get a success response for your tx). I'm going to check the horizon.stellar.org logs from May 8th (I think it was running 1.2.1 at that time) and will give you more info. |
I think the |
@bartekn - Any updates on this one? |
Sorry for the delay in commenting on this issue. I think this should be fixed by #2601, to be released next week (and deployed to horizon.stellar.org next Wednesday). Before Horizon 1.4.0 I'm closing it now; please confirm whether you still experience it after next week's deploy. |
@bartekn {
"type": "https://stellar.org/horizon-errors/transaction_failed",
"title": "Transaction Failed",
"status": 400,
"detail": "The transaction failed when submitted to the stellar network. The `extras.result_codes` field on this response contains further details. Descriptions of each code can be found at: https://www.stellar.org/developers/guides/concepts/list-of-operations.html",
"extras": {
"envelope_xdr": "AAAAAGRFCbG4m7/GKYJDKVwWrhSAhfgjSjOILbb0cOAN+rH+AAAAZAGfEqQAAEHzAAAAAQAAAABfAgWLAAAAAF8CCQ8AAAABAAAAG1hMTSBlMmUgbW9uaXRvciB0cmFuc2FjdGlvbgAAAAABAAAAAAAAAAEAAAAA+dkdOXuuwsV5OpRuUk9Owqe6ocz1ukhgrYqnKliCQA8AAAAAAAAAAAAAAAEAAAAAAAAAAQ36sf4AAABAOCbRkyGoptNNRB8yDNXC2VqD5GyYBMtevrlKj2RvgzSykZZ4GShhgwAkEnTO0dlsdeu5dHJK1VQD0uX05CUjBw==",
"result_codes": {
"transaction": "tx_bad_seq"
},
"result_xdr": "AAAAAAAAAGT////7AAAAAA=="
}
} |
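The result_xdr value in the response above can be inspected without an SDK. The following is a minimal sketch, assuming the standard Stellar XDR layout of a TransactionResult (a big-endian int64 feeCharged followed by an int32 result code) and assuming txBAD_SEQ is encoded as -5, as in the Stellar XDR definitions. The helper name decodeResultHeader is hypothetical and is not part of Horizon or any Stellar SDK.

```go
package main

import (
	"encoding/base64"
	"encoding/binary"
	"errors"
	"fmt"
)

// txBadSeq is the assumed TransactionResultCode value for txBAD_SEQ
// in the Stellar XDR definitions.
const txBadSeq int32 = -5

// decodeResultHeader (hypothetical helper) reads the two leading fields of a
// base64-encoded TransactionResult: feeCharged (int64) and the result code
// (int32). XDR integers are big-endian.
func decodeResultHeader(resultXDR string) (feeCharged int64, code int32, err error) {
	raw, err := base64.StdEncoding.DecodeString(resultXDR)
	if err != nil {
		return 0, 0, err
	}
	if len(raw) < 12 {
		return 0, 0, errors.New("result_xdr too short")
	}
	feeCharged = int64(binary.BigEndian.Uint64(raw[0:8]))
	code = int32(binary.BigEndian.Uint32(raw[8:12]))
	return feeCharged, code, nil
}

func main() {
	fee, code, err := decodeResultHeader("AAAAAAAAAGT////7AAAAAA==")
	if err != nil {
		panic(err)
	}
	// prints: feeCharged=100 code=-5 badSeq=true
	fmt.Printf("feeCharged=%d code=%d badSeq=%t\n", fee, code, code == txBadSeq)
}
```

A fee of 100 stroops combined with the bad sequence code is the signature of this report: the string identifies a tx_bad_seq result, not the outcome of the transaction that was actually included in the ledger.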
@bantalon I found another problem causing this. Reopening. |
We make two DB queries: first checking if tx result exists in a DB ( go/services/horizon/internal/txsub/system.go Lines 85 to 110 in 0067572
Queries are not sent in a transaction ( Because we are using Horizon DB only I think we should remove |
@bartekn - It happened again on July 23 on transaction 2df5d0369f89de16193b757800c5f565acfb45eb4cbb8c5ff76ea0077f0fdfac.
When was 1.6.0 deployment completed on https://horizon.stellar.org? |
The following may help with the analysis: legitimate |
At 2020-07-23 00:28:50 UTC the transaction was submitted to Horizon for a 3rd attempt after the previous two attempts resulted in timeouts. However, at 00:29:20 Horizon discarded the transaction because enough time had passed to trigger a timeout. It seems that Stellar Core finally accepted the transaction in ledger 30741789 shortly after Horizon timed out at 00:29:20. Horizon again received a request to submit the same transaction at 00:29:21. But the transaction was not present in the Horizon DB so Horizon tried to submit the transaction to Stellar Core and Stellar Core responded with a bad sequence error since the transaction was already present in the previous ledger. According to Horizon ingestion logs, Horizon finally ingested ledger 30741789 at 00:29:22.
Horizon can respond with a false positive bad sequence error when Horizon's ingestion lags behind Stellar Core. We can determine an approximate upper bound for how often this issue occurs by searching for the "AAAAAAAAAGT////7AAAAAA==" (result_xdr in which feeCharged is 100) in the Horizon logs. To solve this issue we could sleep before calling go/services/horizon/internal/txsub/system.go Lines 147 to 160 in 59324f1
This wouldn't completely solve the issue because this solution assumes that if Horizon is behind Stellar Core it will catch up by the time we finish sleeping. We could pick a sufficiently high sleep duration which would cover most cases. However, we can never guarantee that Horizon will be consistent with Stellar Core after the sleep duration. Another solution is that we could check if the latest ingested ledger matches the latest ledger in Stellar Core. If the Stellar Core ledger is ahead of Horizon, we can block until Horizon catches up. The only drawback of this approach is that we are adding another dependency on functionality in Stellar Core. We have been trying to reduce coupling between Horizon and Stellar Core as much as possible. The last solution that is worth mentioning is that we could modify Stellar Core to provide a new error code (e.g. For now we should probably update the documentation for Horizon's transaction submission endpoint with a warning that, if Horizon's ingestion system is lagging behind Stellar Core, Horizon might respond with an |
During the meeting today I suggested a solution (similar to this one) that simply ignores the bad sequence error returned by Stellar Core and relies on Horizon's view only. Later in that meeting I said that I was wrong, because if a user submits a tx with a sequence number larger than
Does it make sense? |
@bartekn yes, I think that solution will work. I'll create a PR on Monday with the implementation. |
This should be fixed in Horizon 1.7.0 (will be deployed to horizon.stellar.org on Wednesday). Please reopen if there's still something wrong with the transaction submission. |
@bantalon unfortunately there's a delay deploying this version to production (we are doing some maintenance tasks connected to this deployment). Please observe |
What version are you using?
I am working on the public network through the public Horizon instance: https://horizon.stellar.org. I posted transactions to the network using this API: https://www.stellar.org/developers/horizon/reference/endpoints/transactions-create.html.
What did you do?
On May 8, I started posting a transaction at 06:47:19 UTC. 30 seconds later, at 06:47:49, a timeout error (https://www.stellar.org/developers/horizon/reference/errors/timeout.html) was returned. As instructed in the documentation, I retried sending the same transaction after a one-second delay. Within less than a second I got a bad sequence failure:
My code treated this as failure.
However, I was surprised to find the transaction on the chain: https://stellar.expert/explorer/public/tx/45a9674370754d50b50df23e03d7fa87bd6ee70bd2db2431a8ab874218c3115a.
What did you expect to see?
I expected to see a successful submit result, either on the first attempt, which ended in a timeout, or on the second attempt, which ended in tx_bad_seq.
What did you see instead?
I got a wrong, misleading tx_bad_seq failure.