Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optimize transaction execution logic #5427

Closed
xxo1shine opened this issue Aug 16, 2023 · 6 comments
Closed

Optimize transaction execution logic #5427

xxo1shine opened this issue Aug 16, 2023 · 6 comments
Assignees

Comments

@xxo1shine
Copy link
Contributor

Rationale

Whether to execute a transaction from a peer should not be judged based on whether the peer is disconnected, but whether the peer is a malicious node or not.

In the current logic, when processing a transaction, if the peer is in the disconnected state, the transaction is discarded, so there is a problem. In the stress test scenario, because the broadcast TPS is too high, the network layer cache will be filled instantly, which may cause data acquisition failure during the peer interaction process, resulting in disconnection, and a large number of previously received transactions will be discarded, therefore, the actual TPS gets very low, and the event flow is as follows:

  1. 12:25:05.802 Receive disconnect message, reason: FETCH_FAIL
12:25:05.802 INFO  [nioEventLoopGroup-6-6] [net](P2pEventHandlerImpl.java:167) Receive message from  peer: /10.40.10.239:24860, type: P2P_DISCONNECT
reason: FETCH_FAIL
12:25:05.803 INFO  [nioEventLoopGroup-6-6] [net](P2pChannelInitializer.java:44) Close channel:/10.40.10.239:24860
12:25:05.804 INFO  [nioEventLoopGroup-6-6] [net](ConnPoolService.java:254) Peer stats: channels 7, activePeers 7, active 0, passive 7
  1. 12:25:05.846 Start printing transaction discarded log.
12:25:05.846 WARN  [pool-52-thread-2] [net](TransactionsMsgHandler.java:120) Drop trx 78298cf15e74714557776377df09eab329a5d225d3505bf4417cc1ead2d5f2d4 from /10.40.10.239:24860, peer is disconnect
12:25:05.846 WARN  [pool-52-thread-2] [net](TransactionsMsgHandler.java:120) Drop trx 7a715db9dcfad7bfad77636f052a34fba2412630d46be04fe6bbdbe8445dbc5e from /10.40.10.239:24860, peer is disconnect
12:25:05.846 WARN  [pool-52-thread-2] [net](TransactionsMsgHandler.java:120) Drop trx 97f00076923c3db4a5f67fe87b876148c529873c447c9ac5176c8a11406e206a from /10.40.10.239:24860, peer is disconnect
12:25:05.846 WARN  [pool-52-thread-2] [net](TransactionsMsgHandler.java:120) Drop trx f8c681ae8e21a4614bc1dadb3647b5323e0b878f0de0ba25b1ed4ba989b159b9 from /10.40.10.239:24860, peer is disconnect
12:25:05.846 WARN  [pool-52-thread-2] [net](TransactionsMsgHandler.java:120) Drop trx 3f5508efdd936e30bd115e91ad5c9d537e4da905e8787009f339fd78719e1197 from /10.40.10.239:24860, peer is disconnect
  1. 12:25:45.441 Finished printing the transaction discard log, a total of 12468 items were printed.
12:25:45.441 WARN  [pool-52-thread-2] [net](TransactionsMsgHandler.java:120) Drop trx 52a3e0963d3118d61f58c2c2b5614f70d8133efa90e33c7ac42a3b0ddb21100f from /10.40.10.239:24860, peer is disconnect
12:25:45.441 WARN  [pool-52-thread-4] [net](TransactionsMsgHandler.java:120) Drop trx b4b51da5ea5482563fffaee6cf198378c96db9577d43c5651aaadda33b59a871 from /10.40.10.239:24860, peer is disconnect
12:25:45.441 WARN  [pool-52-thread-2] [net](TransactionsMsgHandler.java:120) Drop trx 787ef65b4f8bfbec3944efa4cfd8767409da5c3a89bcb72a9371e4e6c3ddb0e9 from /10.40.10.239:24860, peer is disconnect
12:25:45.441 WARN  [pool-52-thread-6] [net](TransactionsMsgHandler.java:120) Drop trx 828ff91394958cd9565bad32148165a5b940c382e45d17fc22260c22639838d9 from /10.40.10.239:24860, peer is disconnect

Implementation

There are several options for identifying malicious nodes:

  • According to the proportion of transaction processing failures.
  • According to the number of transaction processing failures.
  • According to the specific exception.

The current code adopts the third scheme but only disconnects when the exception type is BAD_TRX. Other exceptions will not cause disconnection, so the judgment logic can be consistent with the present

    try {
      tronNetDelegate.pushTransaction(trx.getTransactionCapsule());
      advService.broadcast(trx);
    } catch (P2pException e) {
      logger.warn("Trx {} from peer {} process failed. type: {}, reason: {}",
          trx.getMessageId(), peer.getInetAddress(), e.getType(), e.getMessage());
      if (e.getType().equals(TypeEnum.BAD_TRX)) {
        peer.disconnect(ReasonCode.BAD_TX);
      }
    } catch (Exception e) {
      logger.error("Trx {} from peer {} process failed", trx.getMessageId(), peer.getInetAddress(),
          e);
    }

When processing transactions, remove the following logic.

    if (peer.isDisconnect()) {
      logger.warn("Drop trx {} from {}, peer is disconnect", trx.getMessageId(),
          peer.getInetAddress());
      return;
    }

Add isBadPeer field in the peer data structure.

Set isBadPeer to true in the BAD_TRX exception.

When processing a transaction, if isBadPeer is true, discard the transaction.

    if (peer.isBadPeer()) {
      logger.warn("Drop trx {} from bad peer {}", trx.getMessageId(),
          peer.getInetSocketAddress());
      return;
    }
@halibobo1205
Copy link
Contributor

What are the rules for determining a ReasonCode.BAD_TX?
Why is ValidateSignatureException not a ReasonCode.BAD_TX
image

@xxo1shine
Copy link
Contributor Author

@halibobo1205 This involves multi-signatures. When broadcasting ordinary transactions and permission update transactions at the same time, it may be disconnected due to inconsistent broadcast order of transactions.

@halibobo1205
Copy link
Contributor

multi-signatures

Exclude multi-signatures, why don't the other errors indicate a ReasonCode.BAD_TX?
Can explain each of these?

@jwrct
Copy link
Contributor

jwrct commented Aug 22, 2023

There are many reasons for disconnection from the peer. Apart from the BAD_TRX exception, do transactions execute normally in other exceptional scenarios?

@xxo1shine
Copy link
Contributor Author

@chengtx01 Mainly to maintain consistency with online logic and avoid introducing problems

@xxo1shine
Copy link
Contributor Author

Refer to PR #5440

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants