Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

zmq: publish submitted txs via zmq #8427

Merged
merged 1 commit into from
Aug 23, 2022

Conversation

j-berman
Copy link
Collaborator

Problem

When a daemon is in the "fluff" epoch and a user submits a tx to the daemon, the tx doesn't get published via zmq's txpool_add. Reported by @trasherdk; see logs 1, 2, and 3.

Solution

Publish "just broadcasted" txs via zmq.

Of note, the test I added hangs waiting to receive txs that never come over zmq with the code from before this PR.

Deeper context

  • upon tx submission, handle_incoming_tx is called with relay_method:local with no knowledge of the daemon's fluff/stem epoch status or tx propagation method.
  • zmq doesn't publish the tx since the tx has not broadcasted yet (and therefore is not visible in the pool).
  • the daemon then "fluffs" the tx and updates its relay status in the db via on_transactions_relayed making the tx visible to anyone who queries the daemon.
  • it's possible no other daemons reflect the tx back to the daemon, which means handle_incoming_txs doesn't get called a second time after the initial submission (which is particularly unique to the "fluff" flow).
  • even if daemons do reflect the tx back to the daemon, since the tx has already updated its relay status to a broadcasted state, it won't publish to zmq since this already_have will be true.

With the above context in mind, I think on_transactions_relayed is the best place to pub txs in this circumstance because:

  1. It's the point we know txs are just set to be broadcasted, and become visible/query-able in the pool.
  2. It's not guaranteed handle_incoming_txs will get called again after the node broadcasts the tx.
  3. Even if handle_incoming_txs does get called once the node broadcasts the tx, we don't know if the tx was "just broadcasted" or if we're handling a tx we already knew about that we already pub'd via zmq.
  4. It seems quite challenging to pass something into handle_incoming_tx on the initial submission to make it aware of the relay mechanism that will subsequently be used when the tx is ultimately relayed. Although it seems like it would be safe for "fluff" txs to be immediately visible in the db on initial submission during the "fluff" epoch (and to push them over zmq immediately), the way the relay code is structured, it would take a larger overhaul to safely implement that approach to avoid doing so for "stem" txs.

@@ -1406,21 +1406,69 @@ namespace cryptonote
return true;
}
//-----------------------------------------------------------------------------------------------
bool core::notify_txpool_event(const epee::span<const cryptonote::blobdata> tx_blobs, const epee::span<const crypto::hash> tx_hashes, const epee::span<const cryptonote::transaction> txs, const std::vector<bool> &just_broadcasted) const
{
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It should check that tx_hashes, txs, just_broadcasted have the same size as tx_blobs and return false if they don't match.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also,

if (!m_zmq_pub)
    return true;

because this function is a no-op and can early out when there's no ZMQ pub.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good defense, done

@@ -820,7 +820,7 @@ namespace cryptonote
return true;
}
//---------------------------------------------------------------------------------
void tx_memory_pool::set_relayed(const epee::span<const crypto::hash> hashes, const relay_method method)
void tx_memory_pool::set_relayed(const epee::span<const crypto::hash> hashes, const relay_method method, std::vector<bool> &just_broadcasted)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It doesn't check that just_broadcasted is empty, so it can emplace_back in the wrong places in the loop.

if (!just_broadcasted.empty())
    just_broadcasted.clear();

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@j-berman j-berman force-pushed the zmq-fluff-submits branch from af7059f to 7ec74ab Compare July 10, 2022 20:33
@SChernykh
Copy link
Contributor

I ran it for 24 hours on my server, both p2pool nodes didn't complain about any duplicate transactions. I also tested it by sending a transaction through the same monerod - the tx was broadcasted to p2pool nodes via ZMQ 42.7 seconds later, as expected.

@vtnerd
Copy link
Contributor

vtnerd commented Jul 18, 2022

You commented on this - but to highlight to everyone -

The primary issue is that this leaks the origin (or re-broadcaster) to all connected ZMQ clients, but does not leak "normal" stem phase transactions. It looks like your counter argument was the difficulty of re-structuring the code to detect whether the local client was in the fluff epoch.

If this fix gets merged, I think an issue needs to be opened up about this because it is inconsistent (there's more privacy for non-origin txes).

Copy link
Contributor

@vtnerd vtnerd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Other than the blurb above about starting an issue, this patch looks fine.

bool valid_events = false;
for (std::size_t i = 0; i < results.size(); ++i)
{
results[i].tx = std::move(txs[i]);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You shouldn't be able to actually move these objects, its a const view of the transactions. I think this just results in the copy-constructor being called (which is annoying as the compiler doesn't actively complain).

Same thing on next line.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And indeed, this just results in a copy. I have such a love/hate for C++.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ya this was odd of me to leave as is. Will fix.

Copy link
Contributor

@vtnerd vtnerd Jul 21, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This never changed, but I don't think its a big deal (other than being slightly deceptive). It probably makes sense just to leave it, because if the constness of the function ever changes this function call changes (although I think transasction does not have a proper move function).

{
if (!just_broadcasted.empty())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This if is unnecessary.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What @vtnerd probably meant to say that just_broadcasted.clear(); can just be called without the if with the same end result.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correct, the STL code has to do this check basically anyway internally.

Copy link
Contributor

@vtnerd vtnerd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Terrible brain fart. Ignore the above about creating a github issue. I was doing all of this from memory, on_transactions_relayed is the exact function you want as its only called in the levin notify code.

Sometimes its difficult for me to accept I wrote a bug :)


if (tx_blobs.size() != tx_hashes.size() || tx_blobs.size() != txs.size() || tx_blobs.size() != just_broadcasted.size())
return false;

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you should add:

if (std::count(just_broadcasted.begin(), just_broadcasted.end(), std::size_t(0)) == 0)
  return true;

as another quick out optimization.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps there is still a misunderstanding, but if not, I think you meant a truthy 3rd param :) If the count of "just broadcasted" txs is 0, then we have nothing to publish. But if the count of txs that were not "just broadcasted" is 0, then there may still be some "just broadcasted" that we need to publish

I think the test should make it clear it should be truthy too; it hangs with a falsey 3rd param

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if (std::count(just_broadcasted.begin(), just_broadcasted.end(), true) == 0)
  return true;

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, I think you meant that^ @vtnerd

See the latest

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I was doing this from memory quickly and forgot the API. Mine didn't make any sense.

@j-berman j-berman force-pushed the zmq-fluff-submits branch from 614fd83 to c367a11 Compare July 19, 2022 01:15
@j-berman j-berman force-pushed the zmq-fluff-submits branch from c367a11 to fb94899 Compare July 21, 2022 18:13
@luigi1111 luigi1111 merged commit ce80747 into monero-project:master Aug 23, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants