Whats wrong with this buffer? (how to decode a protobuf buffer by hand) #55
Comments
If you do not have the .proto file, all you can do is reverse engineer it from the buffer, which is possible using the protobuf documentation: https://developers.google.com/protocol-buffers/docs/encoding Note that several data types map to the same wire type (especially wire type 2, and unsigned / signed integers), so you'll have to sort out which is the right one on your own. Because of this, there is no automated approach.
In this exact case:
0A hex = 1 | 010 bin which is the first tag constructed from two values (marked by "|") = wire type 2 (=010), id 1 (=1)
Wire type 2 is a length-delimited value, so 0D hex = 13 dec is the length (actually a single-byte varint, 0000 1101 bin, which is easy to decode by a simple binary-to-decimal conversion), and it covers the rest of the data. Assuming that this is an inner message with a length of 13 bytes, we get for its contents:
08 hex = 1 | 000 bin = wire type 0, id 1. Wire type 0 is a varint, which is a bit difficult to calculate by hand if it is built from multiple bytes. However, we are able to determine its length:
F9 hex = 1111 1001 bin with first bit set, continue
27 hex = 0010 0111 bin with first bit not set, end.
You have to determine whether it is 32 or 64 bit (assuming 64 bit will always work, as it works with 32 bit values, too) and whether it is unsigned (uint*), signed (int*) or zig-zag encoded (sint*).
12 hex = 10 | 010 bin = wire type 2, id 2. Wire type 2 is, as we already know, a length-delimited value, so
02 hex = 2 dec is the length
4F 4B hex = 0100 1111 0100 1011 bin - you have to determine what actual type this is (read as ASCII, these two bytes spell "OK", which hints at a string)
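18 hex = 11 | 000 bin = wire type 0, id 3, again a varint
8A hex = 1000 1010 bin with first bit set, continue
8C hex = 1000 1100 bin with first bit set, continue
06 hex = 0000 0110 bin with first bit not set, end.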
20 hex = 100 | 000 bin = wire type 0, id = 4 again a varint
4E hex = 0100 1110 bin with first bit not set, end.
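The hand decoding above can be reproduced with a few lines of plain JavaScript. This is only a minimal sketch that does not use protobuf.js: the names `readVarint`, `decodeRaw` and `zigzag` are made up for illustration, and treating field 1 as an embedded message is the same assumption made in the walk-through.

```javascript
// Read one base-128 varint starting at `pos`; returns the value and the next offset.
// BigInt is used so that 64 bit values do not lose precision.
function readVarint(bytes, pos) {
  let value = 0n;
  let shift = 0n;
  for (;;) {
    if (pos >= bytes.length) throw new RangeError("truncated varint");
    const b = bytes[pos++];
    value |= BigInt(b & 0x7f) << shift;          // low 7 bits are payload, least significant group first
    if ((b & 0x80) === 0) return { value, pos }; // first bit not set: end
    shift += 7n;
    if (shift > 63n) throw new RangeError("varint too long");
  }
}

// Split a buffer into raw fields: id, wire type and the still uninterpreted value.
function decodeRaw(bytes) {
  const fields = [];
  let pos = 0;
  while (pos < bytes.length) {
    const tag = readVarint(bytes, pos);
    pos = tag.pos;
    const id = Number(tag.value >> 3n);      // upper bits of the tag
    const wireType = Number(tag.value & 7n); // lowest three bits of the tag
    if (wireType === 0) {                    // varint
      const v = readVarint(bytes, pos);
      pos = v.pos;
      fields.push({ id, wireType, value: v.value });
    } else if (wireType === 2) {             // length-delimited
      const len = readVarint(bytes, pos);
      const end = len.pos + Number(len.value);
      if (end > bytes.length) throw new RangeError("length exceeds buffer");
      fields.push({ id, wireType, value: bytes.slice(len.pos, end) });
      pos = end;
    } else if (wireType === 1 || wireType === 5) { // fixed 64 bit / fixed 32 bit
      const size = wireType === 1 ? 8 : 4;
      if (pos + size > bytes.length) throw new RangeError("truncated fixed value");
      fields.push({ id, wireType, value: bytes.slice(pos, pos + size) });
      pos += size;
    } else {
      throw new Error("unsupported wire type " + wireType);
    }
  }
  return fields;
}

// Zig-zag decoding, only meaningful if the field was declared as sint32/sint64.
const zigzag = (n) => (n >> 1n) ^ -(n & 1n);

const buffer = Uint8Array.from([
  0x0a, 0x0d, 0x08, 0xf9, 0x27, 0x12, 0x02, 0x4f, 0x4b, 0x18, 0x8a, 0x8c, 0x06, 0x20, 0x4e,
]);
const outer = decodeRaw(buffer);         // a single field: id 1, wire type 2, 13 bytes
const inner = decodeRaw(outer[0].value); // assumption: the payload is an embedded message
for (const f of inner) console.log(f.id, f.wireType, f.value);
```

Read as plain unsigned varints, the three wire type 0 fields come out as 5113, 99850 and 78. If a field were actually declared as a sint type, `zigzag` would have to be applied on top, which is exactly the part the wire format cannot tell you.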
The buffer looks OK so far. The original .proto could, substituting the data types you already provided, look something like this:
Very nice... Thanks for your patience. It's very clear now...
exactly like
Why does the error persist?
I'd say yes, it is, unless I'm missing something. If so, it could be an encoding issue, for example the buffer being converted to a string somewhere and getting corrupted in the process. How are you obtaining the data / reading it into a ByteBuffer?
Do you know if this can happen when passing the buffer through several scripts? ... Currently I pass the buffer (as a parameter) through 3 different files.
If you pass it just as a function argument, it depends. Just passing it does not modify its type, but any of the functions involved could convert the buffer back and forth to some other data type, like a string, which might corrupt the data (for example when encoding to or decoding from UTF8, US-ASCII etc.). If you obtain it through HTTP and binaryType="arraybuffer", as with WebSockets or similar, isn't available, I'd suggest that you encode it to Base64 before transmitting it over any network connection, and decode it properly to bytes prior to putting it into a byte buffer.
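To illustrate the corruption described above, here is a small sketch that sends the buffer from this issue through a UTF-8 string and through Base64. It assumes Node.js, where `Buffer` is available as a global; in a browser you would need a different Base64 helper.

```javascript
// Assumption: running under Node.js, which provides the global Buffer class.
const original = Buffer.from([
  0x0a, 0x0d, 0x08, 0xf9, 0x27, 0x12, 0x02, 0x4f, 0x4b, 0x18, 0x8a, 0x8c, 0x06, 0x20, 0x4e,
]);

// Lossy: F9, 8A and 8C do not form valid UTF-8 here, so decoding replaces them
// with U+FFFD and the original byte values cannot be recovered afterwards.
const viaUtf8 = Buffer.from(original.toString("utf8"), "utf8");

// Lossless: Base64 maps any byte sequence to plain ASCII and back.
const base64 = original.toString("base64");
const viaBase64 = Buffer.from(base64, "base64");

console.log(original.equals(viaUtf8));   // false
console.log(original.equals(viaBase64)); // true
```

A decoder fed with `viaUtf8` sees different tags and lengths than the sender wrote, which can surface as errors like an illegal wire type.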
Thanks for your passion for code :)
It's a good example and I've linked it from the FAQ in the wiki :)
Another example: #143 (comment)
Since wire type 2 (length-delimited) represents strings, bytes, embedded messages and packed repeated fields, is there a way to differentiate between a string, a message and a repeated field?
Not from the raw message alone, but with the correct .proto file loaded, you have everything at hand to evaluate the reflection structure for what to expect. The other option is guessing.
Thanks for getting back so quickly. I am basically using Java to decode the raw protobuf and trying to generate the same result as protoc --decode_raw. It looks to me like there must be a way to differentiate, since the protoc command is doing it just by reading the raw protobuf.
Well, protoc then probably does some guessing for you. There is no other information on the type than "length delimited", as that's all a decoder needs (like with --decode_raw). Combined with the .proto definition, it becomes interpreted as the type it is.
I have a problem extracting string values from binary data. 4a 33 0a 15 31 30 36 36 37 31 38 32 39 39 33 32 32 34 30 34 36 34 38 33 36 22 1a 56 69 6a 61 79 61 6b 72 69 73 68 6e
Can anyone help me decode the binary data given in the comment above?
How do we differentiate length-delimited strings, bytes and sub-messages in binary data? Can anyone help me parse the binary data?
@venkatpathapati There is no way to differentiate a string from a submessage other than by its content, and even the content may be ambiguous. So if you don't have the schema, you can only try to guess what the message is and try to decode it.
I am getting an index out of range exception. Below is the JS snippet where the data is received, but when receiving continuously, protobufjs throws an index out of range exception at protobufjs\src\reader.js :13:12. Can anyone help me decode the messages that arrive continuously from the server?
When sending and receiving multiple messages, use length delimited messages, because otherwise the decoder doesn't know where one message ends and another one starts. protobuf.js provides |
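The idea of length-delimited messages can be sketched without any library: each encoded message is prefixed with its byte length as a varint, and the receiver only hands complete messages to the decoder. The helpers `writeVarint`, `frame` and `unframe` below are hypothetical names for illustration, not protobuf.js API.

```javascript
// Encode a non-negative integer as a base-128 varint.
function writeVarint(n) {
  const out = [];
  while (n > 0x7f) {
    out.push((n % 128) | 0x80); // set the first bit: more bytes follow
    n = Math.floor(n / 128);
  }
  out.push(n);
  return out;
}

// Prefix one already encoded message with its length.
function frame(message) {
  return Uint8Array.from([...writeVarint(message.length), ...message]);
}

// Split a received chunk into complete messages. Bytes belonging to an
// incomplete trailing message are returned in `rest`, to be prepended
// to the next chunk before calling unframe again.
function unframe(chunk) {
  const messages = [];
  let pos = 0;
  for (;;) {
    let len = 0, shift = 0, p = pos, complete = false;
    while (p < chunk.length) {
      const b = chunk[p++];
      len += (b & 0x7f) * 2 ** shift;
      shift += 7;
      if ((b & 0x80) === 0) { complete = true; break; }
    }
    if (!complete || p + len > chunk.length) break; // wait for more data
    messages.push(chunk.slice(p, p + len));
    pos = p + len;
  }
  return { messages, rest: chunk.slice(pos) };
}

const a = Uint8Array.from([0x08, 0x01]);
const b = Uint8Array.from([0x12, 0x02, 0x4f, 0x4b]);
// Two complete frames followed by the start of a third, 5-byte message.
const stream = Uint8Array.from([...frame(a), ...frame(b), 0x05, 0x08]);
const { messages, rest } = unframe(stream);
console.log(messages.length, rest.length); // 2 2
```

Feeding a decoder a chunk that ends in the middle of a message is a typical way to run past the end of the buffer, which is why the incomplete tail is kept back here.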
Trying to decode:
0a 0d 08 f9 27 12 02 4f 4b 18 8a 8c 06 20 4e
with this message:
Getting this error:
Error: Illegal wire type for field Message.Field.core.comm.message.int2s.PaymentResponseElement.messageCode: 2 (0 expected)