-
Notifications
You must be signed in to change notification settings - Fork 163
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Allow serializing SM state #239
Conversation
Thanks for the PR. Exposing library internals is against the self-set rules of this library and therefore not mergeable. By exposing the struct members, this struct is from then on part of the API&ABI and changing anything in it would require a major version increase. IMO it would be better if we were exporting everything for the user, maybe into a binary blob!? An API could look something like: /* This takes ownership of `sm_state`. It allocates a new blob buffer,
serializes all the data contained within the `sm_state` and returns
an opaque buffer `blob` of len `blen`.
After serialization it first zeroizes, then free's `sm_state` completely.
The data can IMO be in an arbitrary format of your choosing, feel
free to propose something. It should be simple enough and robust.
The only feature that format requires from my side is "versioning". */
int xmpp_sm_state_serialize(xmpp_sm_state_t *sm_state, void **blob, size_t *blen);
/* Reverse the serialization */
int xmpp_sm_state_deserialize(xmpp_ctx_t *ctx, const void *blob, size_t blen, xmpp_sm_state_t **sm_state); This doesn't mean that this blob must not be "text editor compatible", but my target is to take the burden away from the user to care about handling that struct. Doing that might be OK for you, but it won't be OK for others. Regarding the "protocol" of the exported data I'm not sure what we should do ... Maybe use the SSH wire format? c.f. RFC4251 Ch.5, it's simple enough to be re-implemented and pretty robust, also it supports all features we require if I'm not mistaken. IMO we don't have to care about low-level details like e.g. endianness, we should expect this data to only be read on the machine that it was written from. What do you think? Let's discuss this, before diving into another implementation. |
That's why I wrote a new struct instead of re-using any of the internals. As you say though, I don't really care how I get this data, so it doesn't have to be a struct. So long as I can get the 4 values I need: current value of sent counter, current value of received counter, current sm id, and currently queued stanzas. I guess what you're saying is that technically I could structure my code to not know about those 4 values. Looking at what I'm doing I guess that's true since I always just store them as-is and retrieve them as-is. I can't really think of a reason I would need to introspect except debugging so maybe that's fine.
I definitely need to be able to serialize without taking ownership, since strophe needs to keep ownership in order to stay connected and keep updating the state as it goes. I call this a lot (on every sm state change, though I don't have events for quite all of them in strophe yet but it's on my todo list).
This could work. I propose instead using CBOR (RFC8949) -- it will be a few more bytes, but not more complex to generated, and can be parsed by more tooling and has an official encoding for lists of strings, which a main part of the data here. We would of course only accept our own output as input and not a full arbitrary CBOR parser. So the format could be something like:
|
It looks like we can trim this down to an acceptable complexity.
Why would you limit to 32? I could understand to 24, but 32 requires to implement two types, From the cost of implementation complexity I would prefer to go with a "always non-preferred" way of encoding all required types as their Your example would then look as follows:
FYI I've corrected the type of the queue entry from
I'm not sure whether I like that idea. The current way of only allowing the retrieval of the Maybe @pasis wants to add something to that discussion? |
So I'll give you a better overview of what I'm doing (currently doing this in js with xmpp.js but planning to add a native target that uses libstrophe): Every time a stanza comes in, or goes into the queue, or gets confirmed by an ack, or is known to have failed after a resume, I snapshot the sm state and serialize it, storing in persistent storage. This allows me to hydrate the state and attempt a resume even after a full crash or other unceremonious termination. In most cases this allows me to resume saving lots of time and keeping the stream intact. In the worst case, something has diverged and the resume fails pushing me back to a vanilla reconnect that I would have needed anyway without this. I do understand that as a consumer of this API it's my job to make sure I snapshot every time there is a meaningful change, otherwise my serialized state will be useless. |
Thanks for the explanation. I guess we have to give up something for your use-case then, and I thought whether it wouldn't be better if it were more interactive? I'm thinking of a callback mechanism you can register which gets called each time we update our SM state. My first idea looked like follows (NB: I'm not sure whether that's complete, I may have missed something :) ) enum xmpp_event {
/* `h` == 0 -> enabled
* `h` != 0 -> resumed
* `data` == NULL
*/
XMPP_SM_ENABLE,
/* `h` == `sm_handled_nr`
* `data` == NULL
*/
XMPP_UPDATE_H,
/* `h` == 0
* `data` == stanza
*/
XMPP_Q_ADD,
/* `h` == 0
* `data` equals to `data` of a previous `Q_ADD` event
*/
XMPP_Q_DROP,
/* `h` != 0
* `data` equals to `data` of a previous `Q_ADD` event
*/
XMPP_SM_Q_MOVE,
/* `h` != 0
* `data` equals to `data` of a previous `SM_Q_MOVE` event
*/
XMPP_SM_Q_DROP,
};
typedef int (*xmpp_sm_callback)(xmpp_conn_t *conn, void *ctx, const char *id, enum xmpp_event ev, uint32_t h, const char *data);
int xmpp_sm_state_set_callback(xmpp_conn_t *conn, xmpp_sm_callback cb, void *ctx);
int xmpp_sm_state_restore(xmpp_conn_t *conn, enum xmpp_event ev, uint32_t h, const char *data); The library calls the callback each time it changes the SM state with what happened. This would be bandwidth friendly, but handling the data and restoring is less user friendly. The complexity of the handling&restore, even though it's not so super complex, brought me back to what you basically proposed. Handing over the complete state via the callback, but in a serialized form. typedef int (*xmpp_sm_callback)(xmpp_conn_t *conn, void *ctx, const void *sm_state, size_t sm_state_len);
int xmpp_sm_state_set_callback(xmpp_conn_t *conn, xmpp_sm_callback cb, void *ctx);
int xmpp_sm_state_restore(xmpp_conn_t *conn, const void *sm_state, size_t sm_state_len); The library calls the callback each time it changes the SM state passing over the serialized state. What do you think? Another question: what do you currently do with the "regular send queue"? |
Since that kind of callback is exactly what I need to simulate on my side either way, it seems pretty much ideal to me if we can get it as you say.
I'm not sure what you mean here. Is there a second queue in use that is not tied to SM state? |
I guess the question should've been clearer then :D What do you think of the two approaches? Could you live with the second one?
Yes, there's the regular send queue which buffers stanzas before they get sent out.
There's a mechanism of stanza counting involved to determine what was received on the server side
You can retrieve the regular send queue elements via FYI I've found two issues after re-reviewing the implementation. This only matters if you drop send queue elements and I will provide a fix soon. |
The second one being where there is an event that fires every time the SM state changes and it includes the serialized payload? Yes, this is ideal for me.
Hmm, that's unexpected. I would consider this part of the SM state really (on my side) since in the event of a reconnect these are still things that need to get re-tried (even if they were never tried once). So I guess if I save what you think of as SM state without this there will be a data loss race condition. Would it be crazy from your side to include the contents of this queue in the new "state changed" event and restore it with the same restore call? |
No, I also think this should be included in the serialized data. The serialized format could then look like this:
Where the Am I missing something? |
I'm not sure why |
Because we need to keep track of the |
We could also calculate the |
I'm happy to do it as a map if that's easier. From outside I just get a blob and hand it back so doesn't matter too much to me |
0ae4a17
to
f5f4916
Compare
I have pushed a new draft based on our discussion, see if this matches what you thought |
f5f4916
to
2e3757a
Compare
3b489cd
to
4f7972b
Compare
I took the liberty to work a bit on this. I hope you're fine with the refactor!? Just FTR: One can run a (failing) GH action locally with
|
@sjaeckel sure, seems likely to have equivalent functionality and still passes the test 👍 |
It is useful in some applications to get a snapshot of the stream management state for storage outside the process, in order to recover from crashes and other things. The get_sm_state already present is not suitable for this because it returns a live object tied in to the current context and such, and containing much unneeded internal-api data. Introduce xmpp_sm_state_set_callback which is called every time the state changes with a serialized state, and a dual xmpp_sm_state_restore which takes in this serialized state and sets up a new sm_state based on that. The serialization is considered opaque from the PoV of the API, but is based on CBOR to facilitate easy debugging.
* Refactor `xmpp_sm_state_restore()` Add new internal functions to load u32 and string values. This also brings length checks in order to verify we don't read more sm_state than there potentially is and type checks to ensure we have the correct CBOR types. Each element is added to the queue right after creation, so it won't leak in case creating its content fails. * Refactor `sm_state_serialize()` * Store & restore ints as/from big endian, CBOR requires that * Bring API names in line with the usual naming * Auto-format sources Signed-off-by: Steffen Jaeckel <[email protected]>
`xmpp_run_once()` uses `FD_SET()` to determine which sockets to `select()` on. The socket gets initialized to `INVALID_SOCKET = -1` inside `xmpp_conn_new()` which leads to a buffer overflow once `FD_SET()` is called. This is only exposed when enabling a higher optimization level, which was only done in the `release-test` CI job. * Fix this testcase by initializing the socket to a possible value. * Build the Valgrind CI jobs with `-O2`. * Make the output of the `release-test` more verbose. Signed-off-by: Steffen Jaeckel <[email protected]>
a6b21fd
to
e3d734c
Compare
It is useful in some applications to get a snapshot of the stream management state for storage outside the process, in order to recover from crashes and other things. The get_sm_state already present is not suitable for this because it returns a live object tied in to the current context and such, and containing much unneeded internal-api data.
Introduce a new get_sm_serializable_state that creates instead a fixed snapshot, handing full memory ownership to the caller, and a dual set_sm_serializable_state which takes in the bare minimum needed values in order to resume and sets up a new sm_state based on these values.