Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

eliminate czmq dependency in public api #219

Merged
merged 16 commits into from
Jun 12, 2015

Conversation

garlick
Copy link
Member

@garlick garlick commented Jun 7, 2015

As discussed in #168, this PR replaces zmsg_t * in the Flux public API with flux_msg_t. This PR is for feedback not for merging.

Right now, flux_msg_t and zmsg_t * are interchangeable, both defined in terms of the same incomplete type struct _zmsg_t. This makes transitioning a bit easier. Although internally there are still a lot of zmsgs, I believe after this PR libflux-core should no longer require users to link against czmq/zeromq.

Freed from following czmq's API design so much, some public Flux interfaces were also changed (improved?)

There is one new message function:

void flux_msg_destroy (flux_msg_t msg); // to be used instead of zmsg_destroy

There are new low-level send/recv functions:

int flux_send (flux_t h, const flux_msg_t msg, int flags); // flags: unused
flux_msg_t flux_recv (flux_t h, flux_match_t match, int flags); // flags: FLUX_O_NONBLOCK
int flux_requeue (flux_t h, const flux_msg_t msg, int flags); // flags: FLUX_RQ_HEAD or _TAIL

These replace the following functions (still available during transition):

int flux_sendmsg (flux_t h, zmsg_t **zmsg); // free on success
zmsg_t *flux_recvmsg (flux_t h, bool nonblock);
zmsg_t *flux_recvmsg_match (flux_t h, flux_match_t match, bool nonblock);
int flux_putmsg (flux_t h, zmsg_t **zmsg); // requeue to tail, free on success
int flux_pushmsg (flux_t h, zmsg_t **zmsg);  // requeue to head, free on success

@garlick garlick added the review label Jun 7, 2015
@grondo
Copy link
Contributor

grondo commented Jun 8, 2015

Wow, large amount of work for a weekend!

Before a thorough review, a couple things come to mind:

  • is memory management the same for flux_msg_t ? Free on send? (Sorry if this was already discussed, just wondered if this came with a destroy change)
  • Why all the changes to function signatures to "avoid casting flux_free_f" ?

@garlick
Copy link
Member Author

garlick commented Jun 8, 2015

Sorry, it looks like more work that it was because I didn't squash down the small commits.

Memory management is changed: send takes a const flux_msg_t and does not destroy it. My goal is to eliminate that destroy-on-send behavior as I think we agreed was better. I haven't yet made it to the reactor message handler function prototype though, which is sort of based on that.

I renamed FluxFreeFn to flux_free_f and would have had to touch each of those places where there was a cast. Since I had to go there anyway, I just got rid of the cast which was unnecessary (look at any one of those commits and you'll see what I mean).

@garlick
Copy link
Member Author

garlick commented Jun 9, 2015

I had thought I should take care of the reactor prototypes in this PR as well, but that is shaping up to be a big job, so I think it might be better to carry that over into another PR.

Here are some todos for this PR.

  • The newly added FLUX_O_NONBLOCK only works when passed to flux_recv(). It has no effect when passed to flux_open(). It should work like flags in open(2), read(2).
  • Although exposed interfaces were converted from zmsg_t * to flux_msg_t, libflux internals now contain a mix of the two types. This should be made consistent internally
  • Add serialize/deserialize functions for flux_msg_t that Initially wrapzmsg_encode() / zmsg_decode(). Use this directly in api module/local connector and drop the libutil/zfd class, which is now sitting in the wrong part of the source tree and is kind of an abomination anyway.
  • Perhaps some loop tests could be added for the basic handle ops, e.g. send/recv/requeue.

@garlick
Copy link
Member Author

garlick commented Jun 9, 2015

Argh, it seems that const is not working here the way I thought it did with opaque pointer-to-struct types. This is how I define flux_msg_t:

typedef struct _zmsg_t *flux_msg_t;

I noticed that in a function defined like this:

int flux_msg_get_type (const flux_msg_t msg, int &type);

I can pass msg to a function that drops the const without the compiler complaining about it. Maybe it would be better to define it the same way zmsg_t is defined e.g.

typedef struct _zmsg_t flux_msg_t;

and change all those flux_msg_t arguments to flux_msg_t *. In this stack overflow post there's a small rant about never doing things the way I did.

I advocated for this style in RFC 7.

Any thoughts?

@grondo
Copy link
Contributor

grondo commented Jun 9, 2015

heh. I personally prefer the first answer in that post and tend to just pass struct foo * around. I don't think dropping the "struct" has enough benefit in a language like C to warrant potential confusion. However, I don't feel as strongly as that commenter in this case, and assume there are benefits to typedef struct _foo * foo_t of which I'm not aware (so I'd defer to other project members on this one)

I'm not fond of the change to

typedef struct _zmsg_t flux_msg_t

because it doesn't seem consistent with other things we've done (though there is already the odd case of flux_match_t so perhaps it doesn't matter?)

In this case my personal preference would be to keep typedef struct _zmsg_t * flux_msg_t and do without the const declarations because C can't really do what we want. If we really need the const I'd prefer using const struct flux_msg * or something (i.e. not dropping struct just because)

Again, I'm just expressing my thoughts here and could easily be swayed

@lipari
Copy link
Contributor

lipari commented Jun 9, 2015

I personally agree with the stackoverflow post for the reasons stated. But I'm an even bigger fan of consistency. So, given that you're basing flux_msg_t off zmsg_t, I would agree that flux_msg_t probably should match zmsg_t. You can then update RFC 7.

@garlick
Copy link
Member Author

garlick commented Jun 9, 2015

For the record, I think flux_msg_t will only be based on zmsg_t during transition.

I hate to sacrifice const when we could keep it, and get the same effect of abstracting theflux_msg_t implementation by defining as an alias for an incomplete struct, e.g.

struct flux_msg_struct; // or struct _zmsg_t during transition
typedef struct flux_msg_struct flux_msg_t;

and then making arguments const flux_msg_t * or flux_msg_t * and getting compiler support for catching modifications in those functions. For example, the new reactor message handler might take a const flux_msg_t * and then it becomes abundantly clear that you're not to free or modify the message.

@grondo
Copy link
Contributor

grondo commented Jun 9, 2015

@garlick, that seems reasonable to me. I would just be a little careful about having some typedef types (*_t) defined as pointers to struct and others defined as aliases for actual structs. I think it will cause a little bit of confusion if there is mixed usage within a single API. I would also be be fine with using struct flux_msg * in the api as well, though I do see the benefit of flux_msg_t as a hint that we're dealing with an opaque object.

@garlick
Copy link
Member Author

garlick commented Jun 9, 2015

We can use const struct flux_msg_struct *msg internally (where type is not opaque) and const flux_msg_t msg externally and then at least we will get const enforcement internally and documentation externally. Maybe that's good enough?

@grondo
Copy link
Contributor

grondo commented Jun 9, 2015

probably getting out of bounds here, but could you declare a second type

typedef const struct flux_msg_struct * flux_msg_const_t;

To use in reactor callbacks where flux_msg shall not be modified? flux_msg_const_t can only be passed to other functions that take flux_msg_const_t and a flux_msg_t could be cast to flux_msg_const_t (perhaps automatically by compiler?)

I don't know, just another idea to throw out there for enforcing const-ness for API users (I like your idea above internally)

note: Uhhh, I didn't even try this before I proposed it, so it may not work

@grondo
Copy link
Contributor

grondo commented Jun 9, 2015

btw, on why this doesn't "just work" as expected with flux_msg_t-- this stackoverflow answer seems applicable.

@garlick
Copy link
Member Author

garlick commented Jun 9, 2015

flux_msg_const_t seems like the best and maybe the only way to get const enforcement (such as it is) with an abstract type...

@garlick
Copy link
Member Author

garlick commented Jun 9, 2015

Just pushed some changes for flux_msg_const_t and used this type for all read-only messages parameters in the message functions. The const protection is good, and it doesn't actually seem all that ugly (though conceptually it irritates me a bit).

Have a look and see what you think. We can certainly change it. I may be over-valuing the const protection since I anticipate converting a lot of code from the destroy-on-send idiom, which will all be over soon, and then we'll be stuck with people using this.

@garlick
Copy link
Member Author

garlick commented Jun 9, 2015

One preemptive comment re: the casts to (zmsg_t *) in message.c: these are not the droids you're looking for, that's an artifact of transition.

@trws
Copy link
Member

trws commented Jun 9, 2015

I just saw this significant number of posts, what is the reasoning for preferring a typedef that looks like a struct but is actually a pointer over passing pointers to actual structs? Just as a personal preference, I tend to want any pointer to be visibly a pointer and get rid of the unnecessary struct keyword on the chance that at some point I may decide to make it a union of structs or a basic value type under the hood. At least for me the reasons are that it doesn't give the potentially erroneous impression of having its memory managed by the stack, and users don't get surprised by a pass-by-value function turning out to be pass-by-reference because of a typedef. It also avoids the struct lock-in and verbosity of using struct everywhere.

Is there a quantifiable benefit to hiding the pointer?

@garlick
Copy link
Member Author

garlick commented Jun 9, 2015

Just to be clear, you are arguing for something like:

typedef struct flux_msg_struct flux_msg_t;

and arguments of flux_msg_t * or const flux_msg_t *?

@trws
Copy link
Member

trws commented Jun 9, 2015

Correct. It's the way all of the czmq types are handled, and while I
realize that flux_t already hides a pointer, having it that way for
messages seems more stylistically troublesome to me, since passing
"flux_msg_t" implies to me that a pass-by-value copy of the structure is
being made, which it is not.

On 9 Jun 2015, at 15:40, Jim Garlick wrote:

Just to be clear, you are arguing for something like:

typedef struct flux_msg_struct flux_msg_t;

and arguments of flux_msg_t * or const flux_msg_t *?


Reply to this email directly or view it on GitHub:
#219 (comment)

@garlick
Copy link
Member Author

garlick commented Jun 9, 2015

To be honest I hadn't considered that obscuring the pointer would cause confusion, but you make some good points, and the flux_msg_const_t contortion should probably tell us we're going down a rat hole. What do others think? I don't mind reworking.

@grondo
Copy link
Contributor

grondo commented Jun 9, 2015

Hm, as previously stated I tend to agree about not hiding pointers. I think hiding the struct keyword is also overrated though I never though about the fact that does allow you to change the underlying type (but I'd argue the actual likelihood of this is very low to nil)

If you are already using typedef just to hide the fact that you're using an underlying struct I'm kind of ambivalent about also hiding the pointer -- seems like you're already hiding valuable information from the user, and it implies you'll be passing the object to methods only, so it should be irrelevant to a user if they're passing by copy or by reference.

In this case, I kind of dislike flux_msg_const_t (I apologize for having brought it up), and I'm willing to concede that if you really really hate using struct then typedef struct flux_msg_struct flux_msg_t; is the least confusing way to go. (Though I hesitate to actually recommend anything just because it is used in czmq). I'd be fine with eventually changing other use cases as well to be consistent.

As an aside, I would more strongly argue for not using typedef for non-opaque structures like in the case of flux_match_t. (This way we can say a something_t is an opaque type in our api..)

@garlick
Copy link
Member Author

garlick commented Jun 9, 2015

No reason not to drop the typedef on flux_match_t. I can do that now.

The flux_msg_t typedef is useful as an alias when transitioning from struct _zmsg_t to struct flux_msg (or whatever). I'd rather not have to change everything at once.

@grondo
Copy link
Contributor

grondo commented Jun 9, 2015

Sorry, yeah good points. I kind of realize my last comment was a bit more strongly worded than I intended. I meant to give my personal preference, and I'm fine with flux_msg_t typedef as long as we're consistent. I'm also coming around to the idea that hiding the pointer is probably pretty bad in most cases.

@garlick
Copy link
Member Author

garlick commented Jun 9, 2015

Yeah me too after the rude awakening on const. OK, lets not hide the pointer but keep the typedef.

We're not consistent with some other flux_foo_t types but we can circle back and fix that to be consistent. I'll update the RFC also. Thanks everybody for the input, I was in the weeds on this one, sadly.

@garlick
Copy link
Member Author

garlick commented Jun 10, 2015

I just forced an update on this PR with the following changes:

  • flux_msg_t no longer hides the fact that its a pointer
  • dropped flux_msg_const_t typedef
  • dropped flux_match_t typedef
  • added first try at a new reactor message handler in the last commit.
  • squashed down tiny commits

The interface proposed for reactor message handlers is implemented along side the old one (actually the old is implemented in terms of the new). Here's the gist of the interface:

typedef struct flux_actor flux_actor_t;

typedef void (*flux_msg_f)(flux_t h, const flux_actor_t *actor,
                           const flux_msg_t *msg, void *arg);

flux_actor_t *flux_actor_add (flux_t h, struct flux_match match,
                              flux_msg_f cb, void *arg);
void flux_actor_remove (flux_t h, flux_actor_t *actor);

The changes from FluxMsgHandler are:

  • returns void
  • message is a const
  • dropped little used type arg (there is flux_msg_get_type() if needed)
  • handler gets a flux_actor_t that it can use to cancel itself

I'm not attached to the naming. To avoid upheaval I needed to leave the existing registration functions alone and needed to invent a name for the thing to be used as a handle for the registration, and actor seemed to fit the role of a message handler.

This is just a first cut - I haven't really used this yet anywhere.

@garlick
Copy link
Member Author

garlick commented Jun 10, 2015

I think "actor" probably implies an agent operating in its own thread so I'm looking for a different word here besides "msghandler".

@trws
Copy link
Member

trws commented Jun 10, 2015

I would agree that actor has implications of concurrency. Maybe "responder?" That has some baggage too, but it seems apt somehow.

@grondo
Copy link
Contributor

grondo commented Jun 10, 2015

The message callbacks might also be handling a response message or an event message, so "responder" seems too specific to me.

What about just naming it a callback or "cb" -- flux_msg_cb_t and flux_msg_cb_add?

I also think it would be helpful to look at the reactor interface as a whole, instead of just message handling. I'm assuming there will be replacements for fd and timeout handlers as well? (for each of these, "callback" might work as well. I would feel better if we had a plan for whole reactor api.

Is there some idiom of construction and naming that we could borrow from libev?

@grondo
Copy link
Contributor

grondo commented Jun 10, 2015

might also want to review dicussion in #124

@garlick
Copy link
Member Author

garlick commented Jun 10, 2015

I was thinking we could repeat the same design for fd's, zeromq sockets, and timers. Callbacks would have a similar signature, e.g. say we go with the "cb" idea,

typedef void (*flux_msg_cb_f)(flux_t h, flux_msg_cb_t *r,
                           const flux_msg_t *msg, void *arg);
typedef void (*flux_fd_cb_f)(flux_t h, flux_fd_cb_t *r,
                           int fd, int revents, void *arg);
typedef void (*flux_zmq_cb_f)(flux_t h, flux_zmq_cb_t *r,
                           void *zmq_socket, int revents, void *arg);
typedef void (*flux_timer_cb_f)(flux_t h, flux_timer_cb_t *,
                           void *arg);

Reg/unreg functions would follow suit

flux_msg_cb_t *flux_msg_cb_add (flux_t h, struct flux_match match,
                              flux_msg_cb_f cb, void *arg);
void flux_msg_cb_remove (flux_t h, flux_msg_cb_t *r);
flux_fd_cb_t *flux_fd_cb_add (flux_t h, int fd, int events,
                              flux_fd_cb_f cb, void *arg);
void flux_fd_cb_remove (flux_t h, flux_fd_cb_t *r);
flux_zmq_cb_t *flux_zmq_cb_add (flux_t h, void *zsock, int events,
                              flux_zmq_cb_f cb, void *arg);
void flux_zmq_cb_remove (flux_t h, flux_zmq_cb_t *r);
flux_timer_cb_t *flux_timer_cb_add (flux_t h, double seconds,
                              flux_timer_cb_f cb, void *arg);
void flux_timer_cb_remove (flux_t h, flux_zmq_timer_t *r);

This is fairly similar to the libev "watchers", in that there is a different (though similar) callback signature for each watcher type, and a type specific handle that is passed to the callback and can be used for cancellation. There are two aspects of libev that I sort of like that aren't used here and maybe should be considered?

  • the equivalents of the flux_msg_cb_t style objects are user allocated (might be on the stack), and can be registered/unregistered with the event loop repeatedly without destroying
  • the divergence of the callback signature for different types of callbacks is minimized in libev by giving you access to some of them via the flux_msg_cb_t style thing that is passed in. For example if you want the file descriptor in an ev_io watcher, you have to pull it out of the ev_io struct that was passed to the handler.

garlick added 16 commits June 11, 2015 10:29
Instead of (zctx_t *) use (struct _zctx *) with struct _zctx,
an incomplete type.
Rename these types:
  red_t -> flux_red_t
  FluxRedFn -> flux_red_f
  FluxSinkFn -> flux_sink_f
Add a new type as the arg to flux_red_f instead of zlist_t
  flux_redstack_t
Add simple stack interface for flux_redstack_t
  flux_redstack_push()
  flux_redstack_pop()
  flux_redstack_count()
Also drop unnecessary (FluxFreeFn) cast from
  cmd/flux-jstat
  modules/api
  modules/kvs
  modules/libjstat
  modules/live
  modules/barrier
  modules/wreck
  test/*
  libflux/flux_log
This seemed better than changing the cast to the new name.
define flux_msg_t in terms of incomplete struct _zmsg_t.
Convert all public libflux headers to use flux_msg_t not
(zmsg_t *) and drop <czmq.h>.

Add flux_msg_destroy().
Replace flux_putmsg() and flux_pushmsg() with flux_requeue().
Also replace the corresponding handle implemention methods.

Update users in:
  libflux/rpc
  libflux/reactor
  connectors/loop
  connectors/local
  broker/modhandle
  test/request
Update users in
  connectors/local
  modules/api
  test/tenc
flux_send(), which does not destroy the message on success, will replace
flux_sendmsg().  (flux_sendmsg() will be available during transition).
Define corresponding handle implementation methods.

Update these users:
  connectors/local
  connectors/loop
  broker
  broker/modhandle
  libflux/response
  libflux/rpc
Add flux_recv(), which will replace flux_recvmsg().
The main difference is it accets flags rather than a boolean 'nonblock'
argument.  flux_recvmsg() remains during transition, implemented in
terms of flux_recv().  Add corresponding handle implementation methods.

Define FLUX_O_NONBLOCK flag.

Update the following users
  broker/modhandle
  connectors/loop
  connectors/local
  libflux/rpc
  flux/reactor
Flags supplied to flux_open() at handle creation are logical
or'ed with flags supplied to flux_send() and flux_recv().
Rather than defining the flux_msg_t type as a pointer to an
incomplete struct, define it directly to the incomplete struct.
This allows const to work properly and avoids obscuring the
fact that we are dealing with a pointer.

Add const qualifier to function parameters where appropriate,
especially in message.[ch], request.[ch], response.[ch], event.[ch].

Update users:  broker module handle, and connectors.
Use 'struct flux_match' directly instead.
Update various users.
We had some code that enables the reactor code to be split out
of handle.c that crept into the public API.

* drop flux_ prefix and move function prototypes to non-installed
  reactor_impl.h:  reactor_create(), reactor_destroy(), reactor_get()

* drop unnecessary abstract reactor_t type for "struct reactor".

* rearrange function order in reactor.h for clarity, and
  drop "tmout" prototypes for functions long ago removed
@garlick
Copy link
Member Author

garlick commented Jun 11, 2015

I think this is probably mergable (my $0.02).

@grondo
Copy link
Contributor

grondo commented Jun 12, 2015

Ok, I'll merge this. Is there a corresponding update for flux-sched as well? (Or is that not needed in this case?)

@garlick
Copy link
Member Author

garlick commented Jun 12, 2015

Wasn't supposed to need one, but it turns out a minor one is needed since FluxFreeFn was renamed to flux_free_f and it is used in a cast. I'll submit a PR on that.

@grondo
Copy link
Contributor

grondo commented Jun 12, 2015

Ok, then merging this one so your flux-sched PR passes travis

grondo added a commit that referenced this pull request Jun 12, 2015
eliminate czmq dependency in public api
@grondo grondo merged commit ad54d7d into flux-framework:master Jun 12, 2015
@grondo grondo removed the review label Jun 12, 2015
@garlick
Copy link
Member Author

garlick commented Jun 12, 2015

Thanks @grondo

I should comment that actually we will want to do a flux-sched PR to update its API usage. However since json-c stuff remains in libflux-internal.la, and the old flux_sendmsg() et al functions remained in place for now, we can wait until public API stablizes, then do it once.

@trws
Copy link
Member

trws commented Jun 12, 2015

How much use of libflux-internal is there in flux-sched? Ideally it would be nice to shift it entirely to the external API to simplify building it.

@garlick
Copy link
Member Author

garlick commented Jun 12, 2015

There is not a large amount of API usage in there. The bigger problem is reliance on convenience stuff in libutil and libjsonc if I recall.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants