repair window responses are not retransmitted #336

Closed
aeyakovenko opened this issue Jun 8, 2018 · 11 comments · May be fixed by adamlaska/solana#107, adamlaska/solana#117 or adamlaska/solana#121

@aeyakovenko
Member

  1. leader sends packet to 1 validator (A)
  2. validator A retransmits the packet to all the peers

if the packet is dropped, we do not know whether it was dropped in step 1 or step 2. The validators need some way to decide whether to ask their peers or the leader for the packet, and the leader should respond with a packet that the validator will retransmit if the drop happened in step 1.

the hard part here is avoiding having multiple validators retransmit this packet to the peers, because it would flood the network. so the leader needs to do some flow control.

@aeyakovenko aeyakovenko added this to the v0.7.0 milestone Jun 8, 2018
@pgarg66 pgarg66 self-assigned this Jun 9, 2018
@aeyakovenko
Member Author

we don’t have a retransmit flag. The leader sets the blob's sender id to self, and packets from the leader are retransmitted to peers.

So we can set the id to self on the first repair packet. Or the second one from a different node
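A minimal sketch of the rule described above, assuming a blob carries a sender id and a validator forwards only blobs stamped with the leader's id. `NodeId`, `Blob`, `should_retransmit`, and `repair_response` are hypothetical names for illustration, not the actual types in `crdt.rs` or `streamer.rs`; stamping the requester's id on a non-retransmit response is also an assumption.

```rust
/// Hypothetical node identifier; the real code identifies nodes by public key.
type NodeId = u64;

/// Minimal stand-in for a blob header: only the fields this thread discusses.
struct Blob {
    sender_id: NodeId,
    index: u64,
}

/// A validator forwards a blob to its peers only when the blob is stamped
/// with the leader's id. There is no separate retransmit flag.
fn should_retransmit(blob: &Blob, leader_id: NodeId) -> bool {
    blob.sender_id == leader_id
}

/// Builds a repair response for packet `index`.
/// When `mark_for_retransmit` is true the leader's id is stamped on the blob,
/// so the receiver forwards it; otherwise the requester's id is used
/// (an assumption) and the receiver keeps the blob to itself.
fn repair_response(
    index: u64,
    requester: NodeId,
    leader_id: NodeId,
    mark_for_retransmit: bool,
) -> Blob {
    let sender_id = if mark_for_retransmit { leader_id } else { requester };
    Blob { sender_id, index }
}
```

With this shape, "set the id to self on the first repair packet" amounts to the leader calling `repair_response` with `mark_for_retransmit` set to true for the first request it sees.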

@aeyakovenko
Member Author

aeyakovenko commented Jun 9, 2018

I think if we included the window bits in the repair messages we could do something smarter and proactive.

So if the leader gets multiple requests for repair they can evaluate all windows and see what packets should be retransmitted to all the peers. We would need to weight them by stake size to be spam resistant eventually
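One way this could look, as a rough sketch only: each repair request carries the requester's window as a bitmask, and the leader sums the stake of every requester missing a given slot. The 64-slot window, the `RepairRequest` fields, and the threshold parameter are assumptions made for illustration; none of them come from the codebase.

```rust
/// One repair request carrying the requester's window as a bitmask:
/// bit i set means "I already have packet i of the window".
/// The 64-bit width and both field names are assumptions for this sketch.
struct RepairRequest {
    stake: u64,
    window_bits: u64,
}

/// Returns the window slots whose stake-weighted "missing" total exceeds
/// `threshold`. These are the packets worth retransmitting to all peers.
fn slots_to_rebroadcast(
    requests: &[RepairRequest],
    window_len: u32,
    threshold: u64,
) -> Vec<u32> {
    let mut slots = Vec::new();
    // Clamp to 64 because the bitmask in this sketch is a u64.
    for slot in 0..window_len.min(64) {
        let missing_stake: u64 = requests
            .iter()
            .filter(|r| r.window_bits & (1u64 << slot) == 0)
            .map(|r| r.stake)
            .sum();
        if missing_stake > threshold {
            slots.push(slot);
        }
    }
    slots
}
```

Weighting by stake means many low-stake requesters cannot force a broadcast on their own, which is the spam resistance mentioned above.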

@pgarg66
Contributor

pgarg66 commented Jun 10, 2018

How about this:

  1. The validator sends the retransmit request to the leader.
  2. If the leader gets the retransmit request from multiple validators, it sends the packet to one of the validators (maybe the first one that requested the packet, or one picked by some better scheduler), setting sender id to self. The validator will retransmit the packet to other validators in this case.
  3. If the retransmit request was from one validator only, it'll send it to that validator, setting sender id to the validator (?) The validator will not retransmit this packet to other validators.

As you said, the leader can maintain a window. Also, does/can leader know which validator got which packet (in the window) when it was originally transmitted? If retransmission requests are being received for packets sent to a particular validator, it can indicate some problem (network/host) with that validator.

Thoughts?
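The three steps proposed above could be sketched roughly as follows. `RepairTracker` and its methods are hypothetical names, and picking the first requester as the destination is just the simplest of the scheduling options mentioned in step 2.

```rust
use std::collections::{HashMap, HashSet};

/// Hypothetical node identifier; the real code identifies nodes by public key.
type NodeId = u64;

/// Leader-side bookkeeping of who asked for which packet.
#[derive(Default)]
struct RepairTracker {
    // packet index -> (first requester, all unique requesters)
    requests: HashMap<u64, (NodeId, HashSet<NodeId>)>,
}

impl RepairTracker {
    /// Step 1: a validator asks the leader for packet `index`.
    fn record(&mut self, index: u64, from: NodeId) {
        let entry = self
            .requests
            .entry(index)
            .or_insert_with(|| (from, HashSet::new()));
        entry.1.insert(from);
    }

    /// Steps 2 and 3: returns `(destination, sender id to stamp)`,
    /// or `None` if nobody has asked for this packet.
    fn respond(&self, index: u64, leader_id: NodeId) -> Option<(NodeId, NodeId)> {
        let (first, all) = self.requests.get(&index)?;
        if all.len() > 1 {
            // Several validators are missing it: stamp the leader's id so
            // the recipient retransmits to the other validators.
            Some((*first, leader_id))
        } else {
            // Only one validator is missing it: stamp the requester's id
            // so the packet is not retransmitted.
            Some((*first, *first))
        }
    }
}
```

Duplicate requests from the same validator collapse in the `HashSet`, so a single node retrying does not by itself trigger a retransmit.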

@aeyakovenko
Member Author

aeyakovenko commented Jun 10, 2018

Would you wait before responding, until message 2? Or just keep a counter of how many unique repair requests there are?

Right now the validators randomly send each other repair requests, and the leader is part of the random group.

@pgarg66
Contributor

pgarg66 commented Jun 11, 2018

There can be a small (TBD) fixed wait before responding.

If one of the validators is down/bottlenecked (or if the leader-to-validator packet was dropped), more than one peer validator will request a retransmission within some time interval. If the unicast packet from one validator to another was dropped, then only one of them will request a retransmit.

So, validators don't know who the leader is?

@aeyakovenko
Member Author

@pgarg66 validators know who the leader is. so each validator gets a different packet and retransmits it to all the other peers, that's how we are splitting the leader's bandwidth across N downstream nodes.

I think something simple we can try is asking to retransmit with exponential backoff, so 2, 4, 8th... repair request

@pgarg66
Contributor

pgarg66 commented Jun 12, 2018

"I think something simple we can try is asking to retransmit with exponential backoff, so 2, 4, 8th... repair request"

Sorry, I am slightly confused with this. Is the requester (validator) exponentially backing off before requesting a retransmission? Is the purpose of back off that multiple validators won't ask for a retransmission of the same packet?

@aeyakovenko
Member Author

the leader sets the sender id as self (which indicates retransmit), every time the number of requests to repair that specific packet doubles.
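A small sketch of that doubling rule, assuming the leader keeps a per-packet request counter. `RepairCounter` and `on_request` are hypothetical names; the only behavior taken from the thread is that the 2nd, 4th, 8th, ... request for a packet gets a response marked for retransmit.

```rust
use std::collections::HashMap;

/// Leader-side count of repair requests seen per packet index.
#[derive(Default)]
struct RepairCounter {
    counts: HashMap<u64, u64>,
}

impl RepairCounter {
    /// Records one repair request for `index` and returns true when the
    /// response should carry the leader's id (i.e. be retransmitted).
    /// That happens each time the request count doubles: 2, 4, 8, ...
    fn on_request(&mut self, index: u64) -> bool {
        let count = self.counts.entry(index).or_insert(0);
        *count += 1;
        *count >= 2 && count.is_power_of_two()
    }
}
```

Because the gap between retransmits grows with each doubling, a packet that keeps getting requested costs the network a logarithmic number of broadcasts rather than one per request.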

@pgarg66
Contributor

pgarg66 commented Jun 15, 2018

Isn't this code already retrying to repair the window?

streamer.rs: line 203

    let reqs = find_next_missing(locked_window, crdt, consumed, received)?;
    let sock = UdpSocket::bind("0.0.0.0:0")?;
    for (to, req) in reqs {
        //todo cache socket
        info!("repair_window request {} {} {}", *consumed, *received, to);
        assert!(req.len() < BLOB_SIZE);
        sock.send_to(&req, to)?;
    }

@aeyakovenko
Member Author

the problem is here
https://github.com/solana-labs/solana/blob/master/src/crdt.rs#L609

we mark the response to a repair request so that it is never retransmitted. so if the packet is dropped in the first hop, all the peers are missing the packet and none of them will broadcast it to the rest of the network

@pgarg66
Contributor

pgarg66 commented Jun 16, 2018

I understand it now.
The good thing is that I now also have some understanding of the validator side of repair window processing.

wen-coding pushed a commit to wen-coding/solana that referenced this issue Mar 21, 2024
* runtime: do fewer syscalls in remap_append_vec_file

Use renameat2(src, dest, NOREPLACE) as an atomic version of if
statx(dest).is_err() { rename(src, dest) }.

We have high inode contention during storage rebuild and this saves 1
fs syscall for each appendvec.

* Address review feedback