-
Notifications
You must be signed in to change notification settings - Fork 344
Binary format proposal. #252
Binary format proposal. #252
Conversation
carlosalberto
commented
Jan 19, 2018
- Uses ByteBuffer to read/write data synchronously.
- write() is expected to read all passed data.
- read() may need many calls in case the buffer is not big enough.
- Adapter for InputStream/OutputStream.
- MockTracer BINARY propagator.
* Uses ByteBuffer to read/write data *synchronously*. * write() is expected to read all passed data. * read() may need many calls in case the buffer is not big enough. * Adapter for InputStream/OutputStream. * MockTracer BINARY propagator.
See #253 |
throw new RuntimeException("Corrupted state"); | ||
} finally { | ||
if (objStream != null) { | ||
try { objStream.close(); } catch (Exception e2) {} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is it OK do not print e2
?
} catch (IOException e) { | ||
} finally { | ||
if (objStream != null) { | ||
try { objStream.close(); } catch (Exception e2) {} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is it OK do not print e2
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's not important here - this is closing an object used within this method, and it's more of a cleanup task.
// the end of the stream or not. | ||
byte[] b = new byte[buffer.remaining()]; | ||
int available = inputStream.read(b); | ||
if (available < 0) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please add curly braces for if
body
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will update 👍
public interface Binary { | ||
/** | ||
* Writes a sequence of bytes to this carrier from the given buffer. | ||
* The internal buffer is expected to grow as more data is written. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what does this line refer to? Whose internal buffer? Seems we can remove it without loss.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Internal storage, basically (which would mean that many writes are possible). So based on your comment, it sounds like it definitely needs improvement (either by mentioning that multiple writes are allowed, or removing this line altogether).
} | ||
} | ||
} else { | ||
throw new IllegalArgumentException("Unknown carrier"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can do this at the top to reduce nesting
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, was following the style the TEXT_MAP
propagator has :) Anyhow, I can definitely change it to be top level as well.
@carlosalberto I haven't had time to review the whole discussion yet, but does this fully incorporate @raphw's feedback from before? |
@tylerbenson Some of it, but we are still not taking into account two things that were requested:
I'd expect, in turn:
|
I can only repeate my critique of this API:
None of these problems occurs by offering a stream-based approach:
But even with streams: if the server uses a different tracer implementation than the client, none of this will work to begin with. There is no serialization format defined and if I write the data as BSON but the server expects protobuf, this would result in garbage data being reported. Even worse, this could maybe turned into an attack vector if I find a way to exploit the servers deserialization approach, maybe by forcing it to overconsume data from the buffer what corrupts the remainder of the stream what can have unwanted effects. |
@raphw What would you then suggest? Maybe I'm missing something, but writing instrumentation to support propagation over a generic binary interface without breaking the protocol when only one side is instrumented seems to be a significant challenge. Perhaps it would help if we had some specific use cases. What protocols/frameworks are expected to use this binary format? (perhaps protobuf, thrift, grpc?) Maybe it would be better to define those propagation interfaces specifically like we do for http, instead of lumping everything into a binary format? Which formats/protocols are more common when working with NIO? What frameworks should be explored that would highlight these problems? (Forgive me for my ignorance. I've written a fair bit of instrumentation, but relatively little for systems without generic headers/metadata/parameters that can be used for propagation details.) |
@raphw @tylerbenson I'd love to see the requirements for binary format expressed as a set of tests/examples. This worked very well for designing context propagation, where the requirements were similarly nuanced. If we keep discussing in english, we'll probably never resolve this. :) |
Binary Protocols typically include such optional sections either by:
The first option requires to know the binary size in advance, the second option requires a way to manipulate any written byte before it is written to a buffer. My original suggestion using streams would allow for that. |
Hery everybody - sorry for the delay. Recently I took up on testing the current Please feel free to test it out. Cassandra itself takes a I have the impression most of the times, as the That being said, I do realize that @raphw usage may not be supported with the current approach, and that may need an additional format (maybe something like Let me know ;) |
Trying to re-take the discussion: I took a little look at how Netty works, and ended up gathering some overall issues that have been floating during the design of this format. I'm listing them here to help things get moving.
One thing I wonder is how much time will be spent writing/doing instrumentation for low-level stuff (such as the one exposed by Netty), and how other times it will be done for http or frameworks abstracting the transmission (for these ones we don't need the stream-based API). Thoughts? Waiting for your feedback @raphw |
the more I think of it, I believe that it would actually be the easiest to extract a |
@raphw Thanks for the feedback. Well, I'd say using OT for instrumenting Cassandra on the server side is already an actual case (albeit a simple one ;) ). @yurishkuro Any opinion on this? |
I agree that this is not a pressing matter (strictly speaking "binary" can be just a text_map serialized in a way specific to the protocol being instrumented, like Cassandra), but would be nice to resolve. |
Closing this PR in favor of #276 (which was just merged). |