-
Notifications
You must be signed in to change notification settings - Fork 81
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Implement structure encoding #31
Conversation
Codecov Report
@@ Coverage Diff @@
## master #31 +/- ##
==========================================
+ Coverage 78.71% 79.89% +1.18%
==========================================
Files 8 9 +1
Lines 2720 3263 +543
==========================================
+ Hits 2141 2607 +466
- Misses 579 656 +77
Continue to review full report at Codecov.
|
What is the purpose of moving away from
What extra flexibility is provided by the custom impls? |
Thanks @GeneFerneau for the review. As far as my understanding While the consensus en/decode are provided to produce byte array representation of structures following bitcoin's consensus. These are used mostly in network communication. Which is what we are aiming for here. We are already using serde for everything. We don't need The reason we want a separate implementation is because bitcoin's encoding doesn't provide all the primitives. Probably because they are not much used in bitcoin network messages. But we need those primitive in our messages, and we also need to extend them to derived structures (like regarding the name I don't think there is any need to have serde serialization derived for message structures if we are going for bitcoin's encoding. So we should be able to use "serialize" for everything eventually. The entire purpose of having a custom byte serialization is to have only one So the wrong name is transitory, if its bothering too much we can use explicit declaration everywhere to remove the conflict. |
That makes sense. Now I understand better having local trait(s) to cover everything in the crate, instead of using the stuff already present in
If changing the way everything in the crate is serialized/encoded, doing it in one go makes more sense to me than using an intentionally misspelled function name. "The most permanent things are temporary solutions..." - someone on the internet probably I don't really understand your reasoning on the misspelled function, if using local traits for serialization, there should be no conflicts. Am I missing something?
This seems reasonable for stuff that needs to follow consensus (inclusion in blocks), but for communication rounds something like CBOR probably fits better. To distinguish between the two, you could add a |
There will still be conflicts if the local trait function name matches with some external function name. In this case
Feel free to suggest one and I will update the PR.
The term "consensus" here is not for validation (inclusion of blocks and other stuffs). Validation doesn't need encoding, that's done with deserialized data. The the term is for network communication. It means the serialization format that other bitcoin nodes can understand too, i.e, the format that is in "consensus" with the current bitcoin network. Bad encoding of correct data is also a "consensus failure" in the network. :) (Ya it gets weird with terminologies in Bitcoin) We will be sending around tx and other data through our connections, and if we for some reason need to broadcast a transaction in the bitcoin p2p network, we need it to be serialized as per "bitcoin consensus". That's the job of So the rationale here is, if we are using byte serialization, its better to follow the bitcoin protocol for a bitcoin app, and remove future interoperability issue with regular bitcoin nodes. |
I understand that, and that is what I meant: use
Since we're not importing upstream implementations of pub trait NetSerialize {
fn net_serialize(...) {
// ...
consensus_encode(...);
}
fn net_deserialize(...) {
// ...
consensus_decode(...);
}
} |
We are using teleport-transactions/src/messages.rs Line 18 in b67dab5
Which I didn't remove in this PR, would be done on a subsequent one if required. This causes the name conflict. Ya Updated with new name. |
I would recommend doing your custom |
Overall, changes look great! I've done another quick review on b05b838. A couple nits, but the code looks really solid overall. |
Ya if we just remove the derive macros, The name conflicts will be gone. But then we would also need to replace all serde calls inside the protocol. I plan on doing that after this PR is merged, subjected review by @chris-belcher. Could have done it here also, but that would create a change set I feel too big for a single PR. |
I am also wondering if its better to have Is it better to put encoding impls in a single module or in each structures own module? @GeneFerneau @chris-belcher let me know if you have any thought/prefs. I have seen both of them used in different projects. |
IMHO, having all serialization trait definitions in |
Not sure about the concept of the PR now that I just read about serde_cbor. Could we use that instead? What are the upsides/downsides |
We can. Then we have to consider the following situations
In principle they are both similar and solves our purpose. My only worry is interoperability in future. |
Rebased and Updated with nit comments. |
Both points you raise make sense to me. The main reason I brought up CBOR was for use-cases where JSON was being used for serialization, not consensus encoding. I agree for consensus encoding uses, it is more reasonable to use the de-facto standard ( If there are other cases where JSON serialization is used, I would advocate for CBOR, especially if the bytes are going over the network. |
JoinMarket has the same taker/maker setup and in its several years of existence there's never been a need for other apps to implement the taker/maker protocol. I don't think we'll ever end up in a situation where other apps are talking the teleport protocol between makers and takers. Interoperability is not an issue at all since it's overwhelmingly likely that this app will only ever talk to other instances of itself.
Is it possible to write something like |
Looks like |
In that case we can use whatever encoding we want, and its better to not worry about interoperability and optimize on other aspects.
I am not sure, my understanding is serde::json is already implemented in rust-bitcoin, so that's why we could use its as derive macros here. But serde::cbor is not. I am not too familiar with how serde works, but my guess is we would need to have some serde::cbor impls for primitive bitcoin types. I will try to see if direct derivation works for cbor. |
AFAICT, the |
Thanks @GeneFerneau , that seems very convenient. I will try that and see if it works. I agree that it's better to have simple encoding than define our own if we don't need to follow any specific protocol. Let me know if you have already tried. Also happy to review a PR if you did. |
I have been playing around with cbor (both serde and other alternatives). So far they can't seem to handle our object serializations, and I am not sure why. Here's a demonstration
I am using two different impls of cbor (
|
Just reproduced the errors locally, not sure the best thing to do about it. Wasn't aware about IMHO, think the best thing to do is keep the I can dig into the errors as time permits, since it seems like an internal lib error (more likely), or we are using the lib wrong (less likely). Thanks for trying to get this working. |
Yeah makes sense.. It seems CBOR deps aren't mature enough to handle our data, especially large byte numbers. I will finalize the hand encoding way.. For now that seems like the only option. |
Rebased and fixed failing tests.. I am not sure, but one thing I am expecting to face while integrating byte encoding into the networking modules is how to separate messages without deliminator. For json we are using Any suggestion or refs on how to handle such situation would be helpful. Thanks.. UpdateSome refs on the above: https://stackoverflow.com/questions/13974228/how-to-place-a-delimiter-in-a-networkstream-byte-array And yes, it doesn't seem like a trivial problem. One option is to have our own stream decoder like they do in rust-bitcoin : https://github.com/rust-bitcoin/rust-bitcoin/blob/master/src/network/stream_reader.rs Hmm, it's getting more complicated than I thought for byte encodings.. |
Like mentioned in the StackOverflow post you linked, prefix-length encoding is pretty useful for network encoding. A number of places in Bitcoin use minimally encoded integers for the length value, followed by that length of data. If I'm not wrong, this is what |
Thanks @GeneFerneau for the suggestion. I tried doing something like that. Have a For that I modified the async fn send_message(
socket_writer: &mut WriteHalf<'_>,
message: &MakerToTakerMessage,
) -> Result<(), Error> {
let mut message_bytes = vec![];
let len = message.net_serialize(&mut message_bytes)?;
let var_len = VarInt(len as u64);
let mut result = vec![];
var_len.consensus_encode(&mut result).map_err(|e| Error::Serialisation(e.into()))?;
result.extend_from_slice(&mut message_bytes);
socket_writer.write_all(&mut result).await?;
Ok(())
}
async fn read_message(reader: &mut BufReader<ReadHalf<'_>>) -> Result<TakerToMakerMessage, Error> {
let len = read_varint(reader).await?;
let mut buff = Vec::<u8>::with_capacity(len.0 as usize);
reader.read_exact(&mut buff).await?;
let message = TakerToMakerMessage::net_deserialize(&buff[..])?;
Ok(message)
} Along with it a custom async fn read_varint(reader: &mut BufReader<ReadHalf<'_>>) -> Result<VarInt, Error> {
let n = reader.read_u8().await?;
match n {
0xFF => {
let x = reader.read_u64().await?;
if x < 0x100000000 {
Err(self::Error::Protocol("Bad VarInt"))
} else {
Ok(VarInt(x))
}
}
0xFE => {
let x = reader.read_u32().await?;
if x < 0x10000 {
Err(self::Error::Protocol("Bad VarInt"))
} else {
Ok(VarInt(x as u64))
}
}
0xFD => {
let x = reader.read_u16().await?;
if x < 0xFD {
Err(self::Error::Protocol("Bad VarInt"))
} else {
Ok(VarInt(x as u64))
}
}
n => Ok(VarInt(n as u64))
}
} This is the generic scheme of But so far I am being hit by thread 'tokio-runtime-worker' panicked at 'called `Result::unwrap()` on an `Err` value: Serialisation(ConsensusEcode(Io(Error { kind: UnexpectedEof, message: "failed to fill whole buffer" })))buff data: []
', src/offerbook_sync.rs:82:86thread '
tokio-runtime-workernote: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
' panicked at 'called `Result::unwrap()` on an `Err` value: Serialisation(ConsensusEcode(Io(Error { kind: UnexpectedEof, message: "failed to fill whole buffer" })))', src/offerbook_sync.rs:82:86 maker error error reading from socket: Serialisation(ConsensusEcode(Io(Error { kind: UnexpectedEof, message: "failed to fill whole buffer" }))) So far it doesn't seem to be straight forward either. I suppose mainly because we are using async stuffs, and the |
Do you have a separate branch with those changes? I tried reproducing the errors you list, but every test passes locally with your latest changes.
Not sure what the right answer is here, since a sync function is being called within an async reader. May help to use the
Looks like maker_protocol::run and taker_protocol::read_message handle reading bytes off the wire using Maybe try |
So the approach was to use a I have pushed these changes in my branch here https://github.com/mojoX911/teleport-transactions/tree/encoding-trial Its rough, and I commented out previous read methods, but this should explain what I tried to do. Thanks, any suggestion on this would be very helpful. |
Thanks for posting your changes. I pulled them down, and think I've found the bug: you need to set the size of the buffer you pass to Something like: // either
let mut buff: Vec<u8> = Vec::with_capacity(len.0 as usize);
buff.resize(len.0 as usize, 0);
// or
let mut buff = vec![0; len.0 as usize];
reader.read_exact(&mut buff).await?; Think there is still a hang somewhere, but that fixes the |
@GeneFerneau Ah thanks a lot.. Yes that seems like the thing I am missing. Took me some time to get back to this.. Will give this a try and report back.. |
We need our custom serilization called `NetSerilizastion` trait, as bitcoin's consensus encoded cannot be extended directly to our modified coinswap structures. Though this trait heavily dependeds on bitcoin's `consensus_encode`, as mostly use it for primitive serialization derivation.
The custom serialization trait is implemented for all message variants. A new `TakerToMakerMessage::TestMessage` message is created for internal testing message (like killing a maker process from integration tests). Unit test cases are added for message encoding/decoding roundtrips.
This is required for message encoding unit tests.
Replace the existing json text encoding communications with byte encoded send and receive functions for maker protocol. VarInt size demarcation is used to specify length of the message in wire
same as maker protocol, use new byte encoding instead of json text. VarInt is used for message size specification on wire.
Offer book sync internal logic is modified to use byte encoded messages instead of json encoding.
Finally add `serialization.rs` as crate module, and update integration test's kill sitch.
I have finally managed to get to the bottom of various byte encoding issue. And it seems now all is working at a satisfactory level. Although I am seeing a little lag in standard coinswap process, that can be something internal to the So far basic coinswap is working with byte serialization and this PR is ready to have another round of review. I have restructured to commit to make review easier, but its still a lot of code as a whole. I was thinking about breaking it up into multiple PRs, but that might cause more headache than help. Right now its a complete change that implements and shift all the network level communication to byte encodings. Some utility method refactoring can be used, but I would like to do that in separate smaller PRs. This one is already big enough in its scope. Thanks @GeneFerneau for helping out with the suggestions. |
It seems for some reason codecov compilation is failing and it probably is a grcov upstream issue. Nothing related to this PR. https://github.com/bitcoin-teleport/teleport-transactions/runs/4082340098?check_suite_focus=true#step:10:20 I am investigating further. UpdateThis should be fixed after #41 |
5c77b31
to
46e29e5
Compare
This PR implements our own Encoding/Decoding using
bitcoin::consensus::encode::Encodable
traits.I decided to go with our custom traits because that gives us flexibility and we need encoding for stuffs that isn't covered by rust-bitcoin (like
Signature
,Publickey
etc).The commits look bulky, but its mostly straight forward change.
Related #19
There is a typo in function name
serialisable
, this is intentional to avoid conflict withserde::serializable
and will be fixed once we move away fromserde
completely.