flux_respond(h, msg, 0, NULL) results in protocol error? #495

trws · 2015-12-15T00:47:53Z

Perhaps this is intentional, but I did not expect it. When I send a response with a null payload, the receiver gets an EPROTO from the flux_rpc_get call. Clearly I can get around this by allocating an empty JSON object and passing that in, but somehow that feels wrong...

garlick · 2015-12-15T01:05:48Z

For better or worse, what happens is if the flux_rpc_get() call supplies a payload pointer it "expects" there to be a payload, and a missing one is an EPROTO. This is documented in flux_rpc(3)

json_out, if non-NULL, is assigned a string containing valid serialized
JSON from the response payload. It is a protocol error if a payload
that is expected (signified by a non-NULL json_out) does not arrive;
similarly, it is a protocol error if an unexpected payload (signified
by a NULL json_out) arrives. The storage associated with json_out
belongs to the flux_rpc_t object and is invalidated when that object is
destroyed.

Is that weird?

trws · 2015-12-15T01:09:21Z

I can see how that makes sense. I came upon this because the current version of the python bindings always pass a payload to be populated in order to return a response object. The interface currently has no way for the user to say whether there should or should not be a payload. I suppose I can add one, but I had assumed that the result would either be an empty json object or that *payload would equal NULL for a NULL payload.

trws · 2015-12-15T02:37:41Z

I was thinking about this on the way home, and I think I know why this surprised me as much as it did. The payload is explicitly noted as being optional, and is in fact ignored if the error is non-0. So, if a protocol normally responds with a payload, flux_respond(h, msg, !0, NULL) is valid, but flux_respond(h, msg, 0, NULL) is not. By the same token, a get without a payload will happen to be valid for one that has one so long as it is receiving an error. Am I thinking too far into this or does this seem counter-intuitive to anyone else?

lipari · 2015-12-15T16:01:11Z

Letting you know that I read the above discussion, but don't have a strong opinion one way or the other. As it sits, the onus is on the flux_rpc_get() caller to know whether or not a response is expected. If a response is expected and the json_out is NULL, then you know an error occurred. I guess I could live with that even if it meant passing an empty JSON object to every response, and finding it empty after the get meant that no response was provided.

grondo · 2015-12-15T16:56:13Z

By the same token, a get without a payload will happen to be valid for one that has one so long as it is receiving an error. Am I thinking too far into this or does this seem counter-intuitive to anyone else?

Without using the interface yet, I can say I agree with @trws that the current behavior seems pretty awkward and possibly violates principle of least surprise. I think it would be much improved if empty payload is allowed (setting json_ptr to NULL if provided) and will save lots of frustration for future developers...

garlick · 2015-12-15T17:32:49Z

Is the automatic EPROTO on extra/missing payload, or the "dual function" of flux_respond() (handling both error and non-error response cases) the problem, or both?

I feel a little guilty about both because clearly they do make simple functions more complicated, but I proposed them that way because I found myself repeating the same little chunks of code over and over, e.g.

if (rc < 0) {
    if (flux_respond_errnum (h, msg, errno) < 0)
        flux_log_error (h, "...");
} else {
    if (flux_respond (h, msg, json_str) < 0)
        flux_log_error ("...");
}

versus

if flux_respond (h, msg, rc < 0 ? errnum : 0, json_str) < 0)
    flux_log_error (h, "...");

or

if (flux_response_decode (msg, &json_str) < 0)
    goto done;
if (json_str == NULL) {
    errno = EPROTO;
    goto done;
}

versus without the block testing for NULL.

grondo · 2015-12-15T17:51:12Z

IMO automatic EPROTO on missing payload is the awkward bit here. I predict it will be surprising to 3rd party users that pass in a pointer to JSON object to get EPROTO error. It also hides the fact that the error was due to missing but presumably expected payload, vs server side EPROTO which could be due to malformed input JSON payload.

To save duplicated code for in-tree users, it would make more sense to have a util wrapper in libflux-internal or macro in macros.h. Then you're specifically saying you know what you are doing.

So far I think the flux_respond usage is nice as you note above.

trws · 2015-12-15T17:54:10Z

I like the flux_respond usage overall, but I would actually prefer that always allow a payload to be either empty or full. In the case of an error, I can understand ignoring it, but it might be useful to send back a payload with further information on the reason for the failure, or application specific information to help in recovery. By the same token, I agree that a missing payload when one is expected is a potential protocol violation, but it isn't an error in the RPC protocol, rather it would be one in the user's protocol that is being defined on top of RPCs, right?

Overall, I'm agreeing with @grondo here that it's the EPROTO for presence or lack of payload, but as a side issue, ignoring payloads on error code seems a bit of a missed opportunity for richer exceptions (unless there's an underlying RPC protocol issue here I'm unaware of that would make it hard).

grondo · 2015-12-15T17:57:00Z

Well I think of the errnum as a low-level error. Your protocol on top could always define its own JSON error response that includes, say an error string. But for low level RPCs this is maybe overkill...

Good comments all around though!

garlick · 2015-12-15T17:59:51Z

Right, I was just going to say the same thing. An error response is defined in RFC 3 as numeric only. That doesn't prevent a service from defining its own structured error response, encoded as a regular response payload. Obviously these things aren't set in stone, but that one's quite a bit harder to change than the EPROTO behavior.

garlick · 2015-12-15T18:02:49Z

It should be noted that the EPROTO change is simple but will involve touching quite a lot of code (like every request handler and every RPC user). I'm OK with doing it and soon if people agree it should change though.

trws · 2015-12-15T18:04:24Z

I would be for it, but why so much fallout? It would now be valid to pass NULL in places where one couldn't before, but is that a sufficiently common pattern to cause a lot of fallout, or is there breakage I just missed?

garlick · 2015-12-15T18:07:39Z

It's the decode paths that are the problem. One would need to start checking for NULL before passing a payload to json_tokener_parse() or equivalent. It applies to request decoding (server side) and response decoding (client side).

grondo · 2015-12-15T18:13:09Z

Could you define a new macro or libflux-internal function that wraps the old behavior, then search-replace existing users?

trws · 2015-12-15T18:26:20Z

Maybe something like flux_rpc_get_assert_payload? Or add a flag to flux_rpc? I realize the latter doesn't actually save any changes, but it would give calling code a way to explicitly say whether they care or not.

garlick · 2015-12-15T18:36:17Z

Flags and/or variadic decode functions might be the way to go. Macros/internal functions might make a transition to a new idiom a bit easier.

Or will variadics cause problems for bindings?

trws · 2015-12-15T18:43:56Z

What would you want it to be variadic for? Do you have a design in mind? The bindings tend not to like them, so I would prefer to avoid it, but if it's sufficiently useful I can always write a wrapper.

This is where having c++-style default arguments would be nice. C99 macro-based default arguments might be a nice compromise if that's the case.

garlick · 2015-12-15T18:50:42Z

I was thinking of something like

// fmt: n=nodeid t=topic string j=json payload r=raw payload l=payload length
flux_rpc_getf (flux_rpc_t *rpc, const char *fmt, ...);

That would allow the raw and normal interfaces to be collapsed and could provide a bit more expressivity for what the caller expects (viz a viz EPROTO).

garlick · 2015-12-15T18:51:31Z

Now that I look at it it says "meh" to me.

trws · 2015-12-15T19:02:30Z

How about something like this?

enum flux_rpc_flags {
  FLUX_RPC_NO_FLAGS = 0,
  FLUX_RPC_PAYLOAD_UNEXPECTED = 1<<0,
  FLUX_RPC_PAYLOAD_RAW = 1<<1,
  FLUX_RPC_PAYLOAD_REQUIRED = 1<<2,
  FLUX_RPC_PAYLOAD_LENGTH = 1<<3
};
flux_rpc_getf (flux_rpc_t *rpc, int *errnum, size_t *payload_length, const char ** payload, flux_rpc_flags flags);

garlick · 2015-12-15T19:21:34Z

errnum is mutually exclusive with payload in the RFC 3 definition of a response message. The idiom we're using is to return -1 and set errno=errnum if the remote sends one back. That's harder to change as noted above.

There is a flags arg already in flux_rpc(). Maybe we could simply add

FLUX_RPC_GET_PAYLOAD_RAW
FLUX_RPC_GET_PAYLOAD_JSON
FLUX_RPC_GET_PAYLOAD_NONE

If any of those are set, you get EPROTO if the response doesn't have the expected payload. If none of these flags are set, you get the behavior that you anticipated when you opened the bug?

garlick · 2015-12-15T19:24:04Z

And maybe similar flags for flux_request_decode()?

trws · 2015-12-15T19:26:16Z

That certainly works for me. Extra points if the flux_respond call can inform the responder of the expectation with an error if they try to misuse it on that end.

trws · 2015-12-15T19:28:46Z

Scratch that last. Your solution looks good, but don't make respond potentially fail because of it. Giving the client the ability to cause an error return against a service just seems like a bad idea...

garlick · 2015-12-15T19:46:09Z

No flux_respond() itself only fails if something's gone wrong at a lower level.

grondo · 2015-12-15T21:05:40Z

This sounds good to me as well. @garlick, I'd be willing to help with grunt work if you'd like since you are already focused on the content service.

garlick · 2015-12-15T22:09:48Z

Sure help would be great!

Incidentally, you may already know this, but I have changes pending in PR #493 (commit bca843d) to disable the EPROTO handling in flux_request_decode_raw, and flux_response_decode_raw.

grondo · 2015-12-15T22:44:41Z

I did see that, but didn't put 2 and 2 together yet. I will base any work I do on top of #493, as that one looks like it will be ready for merge pretty soon.

grondo · 2015-12-17T19:16:31Z

Ok, I started looking into implementing the suggestions here and I'm quickly getting lost.
Here are some observations we might want to discuss and think about

It would be easy and not too disruptive to add flags to flux_rpc_t as described above, but as flux_rpc_get uses flux_response_decode() we still need to "fix" or offer alternative for flux_response_decode().
There exists *_raw functions for flux_rpc_get and flux_response_decode to get "raw" (binary data with encoded length), but flux_msg_set_payload() takes "raw" data argument and there is flux_msg_get/set_payload_json for JSON specifically. This asymmetry bothered me and we might want to fix it.
Should "JSON string" payload be simplified to just "NUL terminated string" -- since we now allow "raw" binary messages, should we also allow plain string (non-json) payload? It seems like string payload is special case of raw payload, and JSON payload is special case of string payload. Is there perhaps some clever way to handle this so that different payload types can inherit common methods from their subclass?

At this point I'm not sure that adding int flags argument to flux_response_decode() is going to be a real fix. @garlick had suggested perhaps now is a good time to step back and design good high level interfaces (perhaps it won't be too far off from what we have now). That is probably a great idea, but where to start?

BTW, my inclination right now would be to focus on flux_msg_t abstraction and build enough methods on top of that to make all the other convenience functions like flux_response_decode etc. easily implemented. This might mean flux_rpc_get() returns a flux_msg_t (another @garlick suggestion).

I don't feel confident enough to say this is the right way to go though.

garlick · 2015-12-17T19:34:24Z

Here's a thought. How about the following low level changes:

payload is mandatory
payload can be zero length
just drop the JSON flag

At the high level (rpc, request/response encode/decode):

drop the json_str functions and drop _raw from the raw functions

While that seems like a cleaner foundation to build upon, it will increase repeated code for users. We should take a look at that and see what we can do about it.

grondo · 2015-12-17T21:24:25Z

Yes, that sounds like a pretty clever way to handle the payload/no payload thing. For duplicated code, I think some kind of convenience functions for "special" payload types may be useful. (This is what I was attempting to get at with the extensible message types I ineloquently hinted at above). We could end up back near where we are now, but with a more solid foundation and (perhaps?) a methodology for adding perhaps other payload types (even for users outside flux-core?) [maybe getting too far off track here.]

trws · 2016-12-20T23:47:53Z

I just happened across this issue randomly, thought it was addressed some time ago but don't see any backreferences. Can we close?

garlick · 2016-12-21T18:34:49Z

Not fixed.

flux_request_decode() and flux_response_decode() still return EPROTO if a non-NULL const char **json_str argument is passed in but the message contains no JSON payload.

flux_rpc_get() calls flux_response_decode() so it too has this behavior.

Now that we have flux_rpc_getf() and flux_request_decodef() that clearly should return EPROTO in this case, I wonder if we could just convert the majority of the users of the above functions to the new ones, then change the behavior as proposed without too much fallout?

chu11 · 2017-03-01T16:44:33Z

payload is mandatory
payload can be zero length

@garlick, in the above do you mean to say flux_respond() (and some related functions) should no longer be able to take a NULL for json_str and users should minimally have to pass in a "{}"?

garlick · 2017-03-01T17:39:14Z

I think the api has evolved a bit since this discussion began, and I'm not too sure that comment stands as the best approach. What I as thinking in the dec 21 2016 comment was that we could just relax the decode function (including flux_rpc_get) handling of optional payload pointer arg so

if payload arg but no payload in msg, assign NULL to arg, not an error
if no payload arg but payload, not an error

This would mean it falls to callers to detect missing payload before attempting to decode it. However, my point above was that some of those may productively migrate to flux_rpc_getf() or similar which since the decoding is internal, must issue EPROTO if it is missing.

Does that make sense?

chu11 · 2017-03-01T22:43:04Z

@garlick Sounds good!

chu11 · 2017-03-03T16:48:46Z

As an aside, yesterday I tried to begin cleanup by replacing flux_rpc_get (r, NULL) calls with flux_rpc_getf (r, "{ ! }"). It took me awhile to realize this doesn't work, as internally flux_response_decode() would always be passed a non-NULL json_str argument.

So another fallout from this EPROTO behavior is in some scenarios flux_rpc() cannot be paired with flux_rpc_getf() seemlessly, atleast in this "obvious" scenario where you know no payload is going to be returned.

garlick · 2017-03-03T17:40:20Z

Yeah it doesn't make much sense to call the variadic version when there is nothing to parse. Going forward I was thinking it could continue to return eproto if there is no payload while relaxing the non variadic version, if that makes sense. On Mar 3, 2017 8:48 AM, "Al Chu" <[email protected]> wrote: As an aside, yesterday I tried to begin cleanup by replacing flux_rpc_get (r, NULL) calls with flux_rpc_getf (r, "{ ! }"). It took me awhile to realize this doesn't work, as internally flux_response_decode() would always be passed a non-NULL json_str argument. So another fallout from this EPROTO behavior is flux_rpc() cannot be paired with flux_rpc_getf() seemlessly, atleast in this "obvious" scenario where you know no payload is going to be returned. — You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub <#495 (comment)>, or mute the thread <https://github.com/notifications/unsubscribe-auth/AAKX2791b6F4msQ-ExJxyyyBvsiZT0DIks5riERugaJpZM4G1QUr> .

chu11 · 2017-03-03T19:54:41Z

Yeah, I originally was trying to replace as much usage of flux_rpc_get() as possible, but it seems best to only modify those that actually expect a payload back.

Rearchitect flux_request_decode. User can now pass in a NULL or non-NULL json_str pointer. The function will succeed regardless if a payload is available or not. The function will no longer return EPROTO under a payload & pointer mismatch. User is responsible to check for NULL or non-NULL per their expectations. Fixes flux-framework#495

garlick added this to the release 0.7.0 milestone Jan 4, 2017

chu11 self-assigned this Feb 28, 2017

chu11 mentioned this issue Mar 1, 2017

Add tests for Issue 495 #996

Closed

This was referenced Mar 10, 2017

Issue 495 - misc cleanup #1001

Merged

Issue495 - Cleanup #2 #1008

Merged

garlick closed this as completed in 66b5cd6 Mar 21, 2017

flux_respond(h, msg, 0, NULL) results in protocol error? #495

flux_respond(h, msg, 0, NULL) results in protocol error? #495

Comments

trws commented Dec 15, 2015

garlick commented Dec 15, 2015 • edited Loading

trws commented Dec 15, 2015

trws commented Dec 15, 2015

lipari commented Dec 15, 2015

grondo commented Dec 15, 2015

garlick commented Dec 15, 2015

grondo commented Dec 15, 2015

trws commented Dec 15, 2015

grondo commented Dec 15, 2015

garlick commented Dec 15, 2015

garlick commented Dec 15, 2015

trws commented Dec 15, 2015

garlick commented Dec 15, 2015

grondo commented Dec 15, 2015

trws commented Dec 15, 2015

garlick commented Dec 15, 2015

trws commented Dec 15, 2015

garlick commented Dec 15, 2015

garlick commented Dec 15, 2015

trws commented Dec 15, 2015

garlick commented Dec 15, 2015

garlick commented Dec 15, 2015

trws commented Dec 15, 2015

trws commented Dec 15, 2015

garlick commented Dec 15, 2015

grondo commented Dec 15, 2015

garlick commented Dec 15, 2015

grondo commented Dec 15, 2015

grondo commented Dec 17, 2015

garlick commented Dec 17, 2015

grondo commented Dec 17, 2015

trws commented Dec 20, 2016

garlick commented Dec 21, 2016 • edited Loading

chu11 commented Mar 1, 2017

garlick commented Mar 1, 2017

chu11 commented Mar 1, 2017

chu11 commented Mar 3, 2017 • edited Loading

garlick commented Mar 3, 2017 via email

chu11 commented Mar 3, 2017

garlick commented Dec 15, 2015 •

edited

Loading

garlick commented Dec 21, 2016 •

edited

Loading

chu11 commented Mar 3, 2017 •

edited

Loading