(Practically) remove message size limit #69

Closed
LivInTheLookingGlass opened this issue Jul 11, 2016 · 6 comments

@LivInTheLookingGlass
Collaborator

LivInTheLookingGlass commented Jul 11, 2016

Proposal: add 1 byte to size header which describes the length of the size header and packet header. New structure would be roughly:

  1. 1 byte describing header length
  2. X bytes describing message length
  3. X × n bytes describing packet layout (X bytes for each of the n packets)
  4. Packets

For automated messages this saves ~11 bytes each. It also expands the possible message size/packet length from 256^4 - 1 to 256^255 - 1.

The question is whether this adds enough complexity that it's not worth it. For instance, it means that struct will no longer work for the purpose, since the header fields no longer have a fixed width. This will affect how fast or slow the program becomes, and it may not be worth it.
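As a rough illustration of what replacing struct could look like, here is a minimal sketch that writes a length as a 1-byte width prefix followed by that many big-endian bytes, using Python's built-in `int.to_bytes` and `int.from_bytes`. The function names `encode_length` and `decode_length` are made up for this example and are not part of the project.

```python
from typing import Tuple


def encode_length(n: int) -> bytes:
    """Return a 1-byte width prefix followed by n as big-endian bytes."""
    if n < 0:
        raise ValueError("length must be non-negative")
    # Smallest number of bytes that can hold n (at least one byte).
    width = max(1, (n.bit_length() + 7) // 8)
    if width > 255:
        raise ValueError("length does not fit in 255 bytes")
    return bytes([width]) + n.to_bytes(width, "big")


def decode_length(data: bytes) -> Tuple[int, int]:
    """Return (value, bytes consumed) for data produced by encode_length."""
    width = data[0]
    if len(data) < 1 + width:
        raise ValueError("truncated length field")
    return int.from_bytes(data[1:1 + width], "big"), 1 + width


print(encode_length(5))            # b'\x01\x05'
print(encode_length(256**4 - 1))   # b'\x04\xff\xff\xff\xff'
```

The largest value this accepts is 256^255 - 1, which needs the full 255 bytes after the prefix.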

Strangely, this may make the Javascript side easier, as it forces me to remove a dependency.

@LivInTheLookingGlass
Collaborator Author

LivInTheLookingGlass commented Jul 14, 2016

Proposed formal definition:

Size prefix        - 1 byte defining X, or the size of each packet header
Size of message    - X (big-endian) bytes defining the size of the message
------------------------All below may be compressed------------------------
Size of packet 0   - X bytes defining the plaintext size of packet 0
Size of packet 1   - X bytes defining the plaintext size of packet 1
...
Size of packet n   - X bytes defining the plaintext size of packet n
---------------------------------End Header--------------------------------
Pathfinding header - [broadcast, waterfall, whisper, renegotiate]
Sender ID          - A base_58 SHA384-based ID for the sender
Message ID         - A base_58 SHA384-based ID for the message packets
Timestamp          - A base_58 unix UTC timestamp of initial broadcast
Payload packets
  Payload header   - [broadcast, whisper, handshake, peers, request, response]
  Payload contents
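The layout above could be sketched roughly as follows. This leaves out compression and makes two assumptions that the definition does not state: the "size of message" field counts every byte that follows it (the packet size table plus the packets), and a parser finds the end of the size table by reading entries until the table and the sizes read so far account for the whole message. `build_message` and `parse_message` are hypothetical names.

```python
from typing import List


def build_message(packets: List[bytes]) -> bytes:
    """Serialise packets with one shared width X for every size field."""
    payload = sum(len(p) for p in packets)
    # Find the smallest X such that the total (table + packets) fits in X bytes.
    x = 1
    while x * len(packets) + payload >= 256 ** x:
        x += 1
    total = x * len(packets) + payload
    header = bytes([x]) + total.to_bytes(x, "big")
    table = b"".join(len(p).to_bytes(x, "big") for p in packets)
    return header + table + b"".join(packets)


def parse_message(data: bytes) -> List[bytes]:
    """Inverse of build_message under the assumptions stated above."""
    x = data[0]
    total = int.from_bytes(data[1:1 + x], "big")
    body = data[1 + x:1 + x + total]
    if len(body) != total:
        raise ValueError("truncated message")
    sizes = []
    offset = 0      # bytes of the size table read so far
    claimed = 0     # sum of the packet sizes read so far
    while offset + claimed < total:
        size = int.from_bytes(body[offset:offset + x], "big")
        sizes.append(size)
        claimed += size
        offset += x
    if offset + claimed != total:
        raise ValueError("size table does not match message size")
    packets = []
    for size in sizes:
        packets.append(body[offset:offset + size])
        offset += size
    return packets


msg = build_message([b"broadcast", b"id", b"hello"])
print(msg[:5])             # b'\x01\x13\t\x02\x05'
print(parse_message(msg))  # [b'broadcast', b'id', b'hello']
```

A single 300-byte packet pushes X to 2 in this sketch, so every size field in that message becomes two bytes wide.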

@LivInTheLookingGlass
Collaborator Author

LivInTheLookingGlass commented Jul 14, 2016

Alternate, potentially more dense, more CPU-heavy proposal:

Main size prefix   - 1 byte defining the length of the main size header
Size of message    - X (big-endian) bytes defining the size of the message
------------------------All below may be compressed------------------------
Prefix of packet 0 - 1 byte defining the plaintext size of the packet 0 header
Size of packet 0   - X bytes defining the plaintext size of packet 0
Prefix of packet 1 - 1 byte defining the plaintext size of the packet 1 header
Size of packet 1   - X bytes defining the plaintext size of packet 1
...
Prefix of packet n - 1 byte defining the plaintext size of the packet n header
Size of packet n   - X bytes defining the plaintext size of packet n
---------------------------------End Header--------------------------------
Pathfinding header - [broadcast, waterfall, whisper, renegotiate]
Sender ID          - A base_58 SHA384-based ID for the sender
Message ID         - A base_58 SHA384-based ID for the message packets
Timestamp          - A base_58 unix UTC timestamp of initial broadcast
Payload packets
  Payload header   - [broadcast, whisper, handshake, peers, request, response]
  Payload contents

@LivInTheLookingGlass
Collaborator Author

LivInTheLookingGlass commented Jul 14, 2016

Proposal one saves ~11 bytes per standard message.

Proposal two saves ~13 bytes per standard message.

The key difference is that proposal two scales better. A message is denser under it if you send a single large packet alongside several small ones, because only the large packet pays for a wide size field. This is especially the case for things like the peers flag.
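To make the scaling argument concrete, the sketch below counts uncompressed header bytes for both layouts, read literally. It assumes the message size field covers the size table plus the packets, and it does not try to reproduce the ~11 and ~13 byte figures above, which depend on details not given here. All function names are hypothetical.

```python
from typing import List


def byte_width(n: int) -> int:
    """Smallest number of bytes that can hold n (at least one)."""
    return max(1, (n.bit_length() + 7) // 8)


def overhead_proposal_one(sizes: List[int]) -> int:
    """1 prefix byte + X bytes of message size + X bytes per packet."""
    x = 1
    while x * len(sizes) + sum(sizes) >= 256 ** x:
        x += 1
    return 1 + x + x * len(sizes)


def overhead_proposal_two(sizes: List[int]) -> int:
    """1 prefix byte + message size + (1 prefix byte + size) per packet."""
    table = sum(1 + byte_width(s) for s in sizes)
    total = table + sum(sizes)
    return 1 + byte_width(total) + table


# One large packet (70000 bytes) alongside four small ones.
sizes = [70000, 9, 2, 5, 8]
print(overhead_proposal_one(sizes))  # 19
print(overhead_proposal_two(sizes))  # 16
```

In the first layout the large packet forces X to 3 for every entry, while in the second only the large packet's own size field grows.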

@LivInTheLookingGlass
Collaborator Author

Proposal 3:

  1. Keep the 4GiB message limit
  2. Add a continuation flag, which, if at the beginning of a message, specifies a message ID that it is extending, or if at the end of a packet, specifies the message that will continue it
  3. Have the .string call return a list of strings if this occurs
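A very rough sketch of the continuation idea in the list above, under heavy assumptions: each part carries the ID of the message it extends (or None for the first part), IDs are plain hex SHA-384 digests rather than the base_58 IDs used elsewhere, and the reassembled result is a list of chunks, in the spirit of item 3. `Part`, `split_message` and `join_messages` are invented names.

```python
import hashlib
from typing import Dict, List, NamedTuple, Optional


class Part(NamedTuple):
    msg_id: str
    extends: Optional[str]  # ID of the message this one continues, if any
    data: bytes


def split_message(payload: bytes, limit: int) -> List[Part]:
    """Cut payload into chained parts of at most `limit` bytes each."""
    if limit <= 0:
        raise ValueError("limit must be positive")
    parts = []
    previous = None
    for start in range(0, len(payload), limit):
        chunk = payload[start:start + limit]
        # Hypothetical ID scheme: SHA-384 over the previous ID and the chunk.
        digest = hashlib.sha384((previous or "").encode() + chunk)
        msg_id = digest.hexdigest()
        parts.append(Part(msg_id, previous, chunk))
        previous = msg_id
    return parts


def join_messages(parts: List[Part]) -> List[bytes]:
    """Return the chunks in chain order, whatever order they arrived in."""
    by_parent: Dict[Optional[str], Part] = {p.extends: p for p in parts}
    ordered = []
    current = by_parent.get(None)  # the first part extends nothing
    while current is not None:
        ordered.append(current.data)
        current = by_parent.get(current.msg_id)
    return ordered


parts = split_message(b"abcdefghij", 4)
print(join_messages(parts[::-1]))  # [b'abcd', b'efgh', b'ij']
```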

One problem that this does not address is that some C and Javascript environments cannot reach even the 4GiB limit. For instance, the C99 standard only requires size_t to hold at least 16 bits, which is well short of the 32 that we need there.

Additionally, node.js has a maximum Buffer size of 2 GiB.

A proposal needs to be made which addresses these local implementation inconsistencies. As it stands now, the Python implementation (and possibly also the Golang one) is the only one that keeps this part of the standard valid across all environments. This proposal should be extendible to reach such a position, but I do not at the moment know how.

@LivInTheLookingGlass
Collaborator Author

That problem, outlined more clearly, is as follows.

C/C++/Obj C:

  1. The size of a string is limited by the environment's size_t type. This means that strings must be fed part-wise in order to parse correctly, and that individual packets (currently, in environments where size_t is only 16 bits wide) must be less than 2^16 bytes.
  2. Array sizes are limited by the same size_t type. This means that an array of packets cannot consistently contain more than 2^16 packets.

Javascript:

  1. The size of a Buffer cannot exceed 1GiB in some implementations, 2GiB in others. This means that packets cannot be larger than this, and that received messages cannot be larger than this.

My first thoughts are:

  1. These seem like they can be addressed at a data structure level (and receiving can be addressed at an implementation level)
  2. These seem like they won't happen for a very long time
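One possible shape for handling the receiving side at the implementation level is to feed data part-wise and hand packets out as soon as they are complete, so that no single buffer has to hold the whole message. The sketch below is written in Python for consistency with the rest of the thread, assumes the packet sizes have already been read from the header, and uses an assumed cap of 2^16 - 1 bytes per packet to mimic a 16-bit size_t. `iter_packets` and `MAX_PACKET` are hypothetical names.

```python
from itertools import chain
from typing import Iterable, Iterator, List

MAX_PACKET = 2**16 - 1  # assumed per-packet cap, mimicking a 16-bit size_t


def iter_packets(chunks: Iterable[bytes], sizes: List[int]) -> Iterator[bytes]:
    """Yield each packet as soon as enough bytes have arrived for it."""
    for size in sizes:
        if size > MAX_PACKET:
            raise ValueError("packet too large for this environment")
    pending = bytearray()
    index = 0
    # A final empty chunk lets trailing zero-length packets be emitted.
    for chunk in chain(chunks, (b"",)):
        pending.extend(chunk)
        while index < len(sizes) and len(pending) >= sizes[index]:
            size = sizes[index]
            yield bytes(pending[:size])
            del pending[:size]
            index += 1
    if index != len(sizes):
        raise ValueError("stream ended before all packets were complete")


stream = [b"ab", b"cde", b"f"]
print(list(iter_packets(stream, [3, 1, 2])))  # [b'abc', b'd', b'ef']
```

At any moment the generator buffers only the bytes that have arrived for the packet currently being assembled.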

@LivInTheLookingGlass
Collaborator Author

I'll be closing this issue to split it into its appropriate parts. They will link back to here.
