Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Python bindings, python module, etc. #236

Merged
merged 5 commits into from
Jul 17, 2015
Merged

Conversation

trws
Copy link
Member

@trws trws commented Jun 30, 2015

So, this has ended up being a pretty humongous patch apparently. Here's the summary.

Python bindings now live in src/bindings/python. The make_bindings.py script scrapes header files, no parsing, just comment and directive removal and following only " includes, to create scripts that use cffi to generate C-based python binding modules. Because of this, every single externally visible function in our header files is explicitly linked to by the python bindings. That's why some of the header file changes were made, a couple of functions were misspelled, or didn't exist, and only broke when the python bindings tried to link to them, so this ends up being a lint for the headers in an odd kind of way.

Anyway, the extension modules are then wrapped by python modules in the flux package. All of the bindings are in pure python, and most of them have been wrapped so they're reasonably usable. Those that haven't can be called directly from python just by circumventing the high-level bindings, for example the flux_event_subscribe() function has no binding, but f.event_subscribe("evt") works, because the wrapper searches the library, finds the definition, and wraps the c primitives to accept and return python types. It also converts the setting of errno to non-zero into EnvironmentError exceptions, so no errors that use the normal flux conventions can slip by.

The pymod module can load arbitrary python modules as flux modules, there's an example echo module that implements mecho's functionality in the directory with it.

Tests are in two forms, ones that need a flux wrapping them and ones that do not. The test/ directory has the former, test_commands/ the latter. Now that sideflux.py can run an external flux and connect to it based on its FLUX_TMPDIR these can be merged, but that's not done yet. All tests use the python unittest module as an execution framework, and pycotap to highjack its reporting and replace it with TAP.

The python support requires python 2.7 and the cffi module, and is on by default for now, but can be disabled with the usual --disable-python flag to configure.

Also there are some patches to module loading for error reporting and a couple of other places that were tweaked for debugability.

@garlick garlick added the review label Jun 30, 2015
@garlick
Copy link
Member

garlick commented Jul 2, 2015

This is pretty awesome, but not being a python guy I'm a little lost and probably not capable of reviewing beyond superficially. If there are other python people around (@SteVwonder?) could they weigh in and perhaps kick the tires?

My main question is what action is required to maintain this binding as our APIs evolve as they are doing rather quickly right now?

Superficial but it'd be useful to have changes to non-python space broken out into separate commits, and maybe rebase with better commit names considering the broad scope of the project, and squash down fixes to code in earlier commits in the same PR. I assume you'll get to that as the PR gets closer to merge ready.

I hate to raise this question since I suggested that you submit this as a flux-core PR and didn't think of it then, but should we make bindings separate projects in flux-framework, similar to the way the zeromq community does it? I recall discussing this with @grondo back when he was working on lua and we decided that lua should be included so that we could leverage it within flux-core, as we have done. I'm sure a similar argument could be made for python; on the other hand we may make it hard for any one person to change anything in flux-core if we take that too far. Bindings as separate projects does seem to work for zeromq people; there are many bindings, some of which are very out of date, but the ones that get used tend to get pushed forward by the people who use them. Having good processes and API documentation in place for libzmq helps.

@trws
Copy link
Member Author

trws commented Jul 2, 2015

We have some options as far as maintenance is concerned. The downside to this approach, compared to explicitly writing a C extension module, is that at compile time there will not be failures in C code that will make it clear that API breakage has happened. If I leave it as is, where it's generating bindings dynamically on each run, it also wont fail to link when expected functions disappear, it will only fail when unit tests fail. It would not be all that hard to generate a set of unit tests that exercise every function used by the system, I could probably make a reflection test in an hour or so that would do that.

Alternately, the bindings could be generated once then checked in. Since they would keep the API snapshotted, link failures would list the API functions that had changed or moved, and possibly make it easier to update. I actually think this is the way to go in the longer term, but as the API is moving pretty quickly, it's nice to have every addition just show up automatically.

I agree about breaking out the non-python changes, I'll put up separate pull requests for those later today and figure out how to re-factor this.

The python bindings could be built entirely separately from the rest of flux, there are no dependencies on flux-internals or anything else that isn't exported. It would have to be tweaked to have the make-bindings script work with install paths, but that's not a serious issue. As you note, the downside to that would be that nothing in core could depend on the bindings in that case, so it comes down to whether we want to use them for something in the main project. Since we have Lua, I'm not sure I can make a strong enough case for that. For me, it would be nice to have a 'batteries-included' scripting option for simple things like command front-ends and prototyping. On that side, I tried writing up a test for the races we discussed yesterday in Lua, but ran aground of the fact that the luaposix on LC doesn't support the pipe() function of all things... Frustrating, but is it enough to motivate pulling in the python bindings? I don't know.

@trws
Copy link
Member Author

trws commented Jul 2, 2015

On a related note, the recent merge broke both sets of bindings pretty thoroughly...

@grondo
Copy link
Contributor

grondo commented Jul 2, 2015

FWIW, my opinion is that at this point in the project the python bindings would be extremely useful. I was hoping the Lua bindings would offer a rapid prototyping tool during the early phase of the project, but that never really materialized (except for me), and as Tom pointed out Lua isn't always the most canned solution for building utilities.

Once we have a production system, perhaps bindings could be broken out, but given the early state of the project, and the extremely high popularity of python relative to Lua, my argument woud be that providing the python bindings in-project at this point will really make it easy for more people to try stuff.

That being said, Jim's comments about maintainability are valid (and I probably didn't make things easy for him with the way the Lua bindings are done). Maybe if there is a script to generate static bindings during the build as Tom suggests, and a suite to test the bindings, that problem could be made tractable.

That is just my 2 cents.

ps. what were you trying to do with pipe(2) from Lua? Sorry about the out-of-date luaposix on our systems...

@trws
Copy link
Member Author

trws commented Jul 2, 2015

I needed some way to have two communicating sequential processes to make the race occur deterministically. Since Lua is pretty close to the metal API-wise I reached for the old standard of fork() + pipe() thinking that would be the easiest way to get that done, but while fork worked fine there is no pipe() in the version we've got. The closest option was to use popen to re-launch the script or another script, but that's unidirectional (only either stdin or stdout/err, not both) for reasons that apparently make sense to the developers of Lua...

This is a funny thing, I really like the way that Lua handles coprocesses and lambdas and so-forth, but the inability to use regular expressions or low level APIs without libraries is frustrating.

If the maintainability of the current version is an issue, there are a number of different ways to make it more or less welded to the C API, to the point of making them directly in C like the Lua bindings, or even to build the python bindings on top of the Lua bindings. If we were to take it down to C we lose PyPy compatibility, cool sidenote BTW we support PyPy right now. The other thing that is sub-optimal is that I wrote this for python 2.7 because that's what we have on LC. It really should be using python 3+, since python 2 is pretty thoroughly deprecated now, but it's a dependency I don't have a good way to pull in just now.

@trws
Copy link
Member Author

trws commented Jul 2, 2015

I just looked back over that last message, apparently I'm a little more loopy than I thought, if any of this makes no sense feel free to say so...

@SteVwonder
Copy link
Member

I echo Jim's sentiments, this is pretty awesome! Everything looks sane, and from what I can tell, the bindings look pretty easy to use. My goal for the weekend is to write a module that uses these bindings and communicates with my simulator. So I can get back to you on Monday with any issues I run into.

@trws
Copy link
Member Author

trws commented Jul 2, 2015

Wait a couple hours, or grab a version a bit back from the current head. The watcher API change has invalidated some of the API. By the way, if anything seems funny or non-pythonic, it's early enough that all of this is pretty easy to shift around.

@SteVwonder
Copy link
Member

What changes need to be made to make it compatible with Python 3 besides the easy stuff like changing prints to have parentheses? Does Unicode creep in and make the interface w/ C a little more complicated (I remember seeing something about that in the cffi readthedocs)?

@grondo
Copy link
Contributor

grondo commented Jul 2, 2015

@trws there are some *compat bindings for the watchers, you may be able to just pull in the proper header.

@trws
Copy link
Member Author

trws commented Jul 2, 2015

Actually the unicode string vs bytes() thing is the biggest part of
it. That said, I designed around making it pretty easy to abstract
that. Nearly all of the calls go through the wrapper generator, so I
can put in an extra line that does an isinstance(arg, unicodestring)
and encodes it as a bytes() automatically then places it back in the
argument list to pass through to the actual C code.

Clearly this is not performance optimal, but it makes it really easy
to evolve this interface quickly.

On 2 Jul 2015, at 10:29, Stephen Herbein wrote:

What changes need to be made to make it compatible with Python 3
besides the easy stuff like changing prints to have parentheses? Does
Unicode creep in and make the interface w/ C a little more complicated
(I remember seeing something about that in the cffi readthedocs)?


Reply to this email directly or view it on GitHub:
#236 (comment)

@trws
Copy link
Member Author

trws commented Jul 2, 2015

@grondo I saw that, but it's not quite so simple because they're define-based rather than linkable symbols. I'm most of the way done re-working it to the new API already anyway I think.

@trws
Copy link
Member Author

trws commented Jul 2, 2015

Ok, the new watcher interface is represented in the bindings now.

@trws
Copy link
Member Author

trws commented Jul 2, 2015

The non-python changes have been factored out, but I'm going to wait on removing them from here because reverting the header changes will cause immediate breakage.

@SteVwonder
Copy link
Member

It looks like that echo.py is still using the old msghandler_add rather than the new watcher API

@SteVwonder
Copy link
Member

There also seems to be a few issues with the new MessageWatcher class. Below is a version that works for me:

match = ffi.new('struct flux_match *', {
      'typemask' : type_mask,
      'matchtag' : match_tag,
      'bsize' : bsize,
      'topic_glob' : ffi.new("char[]", topic_glob),
      })
self.handle = raw.flux_msg_watcher_create(match[0], MsgHandlerWrapper, wargs)

I'm not very familiar with PR's on github other than the basics. So what is the best way for me to integrate the above changes into this PR?

@garlick
Copy link
Member

garlick commented Jul 6, 2015

Submit your own PR against @trws's python_mod branch?

@trws
Copy link
Member Author

trws commented Jul 6, 2015

Thanks for pointing those out @SteVwonder, I'll re-work those parts. So you know, in case you need to use some un-wrapped cffi APIs, that topic_glob needs to be held in something other than the struct at the top level or it may get collected as garbage. cffi structs don't get searched for liveness unfortunately.

@SteVwonder
Copy link
Member

@trws - three other issues that I'm running into with the binding:

  • I successfully subscribed to an event, but when i 'flux event pub ...' from the command line, my python module spits out:

  • for ctype 'struct _zmsg_t *' must be a pointer to same type, not cdata 'struct _zmsg_t *'" in <bound method InnerWrapper.__del__ of <flux.message.InnerWrapper object at 0x2aaaeb4de450>> ignored

  • Not sure what is happening here since the type signatures match exactly and the del method looks sane

  • When I remove the pymod module from the command line, I get the following:

    bash-4.1$ flux module remove pymod    
    lt-flux-broker: [1436222793.944028] broker.info[0] rmmod pymod
    Traceback (most recent call last):
    File "/g/g17/herbein1/local/flux-core/lib/python2.7/site-packages/flux/core.py", line 229, in mod_main_trampoline
      user_mod.mod_main(flux_instance, **args)
    File "/g/g17/herbein1/local/flux-core/lib/python2.7/site-packages/flux/modules/complete.py", line 18, in mod_main
      if h.reactor_start() < 0:
    File "/g/g17/herbein1/local/flux-core/lib/python2.7/site-packages/flux/wrapper.py", line 98, in NoneWrapper
      raise EnvironmentError(err, os.strerror(err))
    EnvironmentError: [Errno 9] Bad file descriptor
    
    • This occurs even with a extremely simple python module that only calls h.reactor_start()
  • When I try to access the 'topic' property of a message, I get: AttributeError: Flux Wrapper object masks function get_topic type: <ctype 'int(*)(struct _zmsg_t *, char * *)'> match: <ctype 'struct _zmsg_t *'>

    • This one seems to be a problem with comparing ctypes. I again cannot tell the difference between the first argument of the get_topic type signature and the match type

@trws
Copy link
Member Author

trws commented Jul 7, 2015

@SteVwonder:

  • This usually means one module got re-built without rebuilding one that depends on it. That should be prevented by the build system, but that's the most likely culprit since the result is the two modules having types of the same name but different underlying values.
  • The error on unload is because it doesn't receive the shutdown message and stop the reactor manually. Because the reactor gets a bad file descriptor, and sets errno, on a forced removal, that's what the error message is. I could probably have a shutdown handler added by default, but since the C API doesn't, I wasn't sure that would be reasonable.
  • This should be the same as the first, a clean and rebuild in the bindings should take care of it, but I'll take a look into how they can get out of sync like that.

@SteVwonder
Copy link
Member

@trws, the modules not being rebuilt makes a lot of sense. (My bad)

How are you currently using the python echo module? Whenever I load it and publish an event from the command line (flux event pub mrpc.mecho), I get a seg fault that I assume is due to a malformed event (since it looks like MRPC.from_event is trying to extract information from JSON that doesn't exist).

I tried making a simpler module that just listens for an event (e.g. "sim.complete") and then prints a static message, but this results in a seg fault caused by flux_msg_destroy. Not sure why this is happening (and I can't make much sense of its stack trace), but I assume its related to the above.

Sorry for all these questions. I'm excited for the possibility of python modules, but haven't quite figured out the bindings/all of the wrappers to the point where I can debug well. Would you prefer me to keep posting questions here in the PR or to just email you directly?

@trws
Copy link
Member Author

trws commented Jul 7, 2015

@SteVwonder: I'm testing it as a drop-in replacement for mecho, meaning that I literally do flux module remove mecho and flux module load pymod --module=echo then run an mping test like those in t/t1003-mecho.t.

Would you mind posting the code for the module you're trying? There may be an issue in my callback wrapper that my tests haven't exposed, it might help me diagnose it.

As to where, here seems best since it will be archived with the changeset.

@SteVwonder
Copy link
Member

Here is the code:

import syslog
import flux
from flux import core

def handle_complete(h, typemask, message, arg):
    h.log(syslog.LOG_INFO, "Received sim.complete event")
    return 0

def mod_main(h, **arg_dict):
    if h.event_subscribe("sim.complete") < 0:
        h.fatal_error("event subscription failed")
    with core.MessageWatcher(h,
                        type_mask=flux.FLUX_MSGTYPE_EVENT,
                        callback=handle_complete,
                    topic_glob="sim.complete") as msg_watch:
        if h.reactor_start() < 0:
            h.fatal_error( "reactor start failed!")
    h.log(syslog.LOG_INFO, "pcomplete unloading")

and here is what I ran & the result:

ipa15@herbein1:flux module load pymod --module=complete
flux-broker: [1436309041.890813] broker.info[0] insmod pymod
flux-broker: [1436309041.939703] pymod.info[0] loading python module named: complete
ipa15@herbein1:flux event pub "sim.complete"
flux-broker: src/zmsg.c:79: zmsg_destroy: Assertion `zmsg_is (self)' failed.
flux-broker: [1436309052.748733] pymod.info[0] Received sim.complete event
ipa15@herbein1:flux-start: 0 (pid 142943) Aborted

I get the same assert failure (not a seg fault, my bad) with echo and mping:

ipa15@herbein1:flux start
flux-broker: [1436309261.596240] broker.info[0] nodeset: 0 (complete)
flux-broker: [1436309261.596352] broker.info[0] starting shell
ipa15@herbein1:flux module remove mecho
flux-broker: [1436309274.165650] broker.info[0] rmmod mecho
ipa15@herbein1:flux module load pymod --module=echo
flux-broker: [1436309287.637247] broker.info[0] insmod pymod
flux-broker: [1436309287.697374] pymod.info[0] loading python module named: echo
ipa15@herbein1:flux mping --count 1 0 
got here, I swear. topic: mrpc.mecho, payload_str: { "dest": "0", "vers": 9, "sender": 0, "path": "mrpc.2270328330064317BDCFAAB14316067E" }
flux-broker: [1436309314.951760] pymod.info[0] in impl, args:(<flux.core.MessageWatcher object at 0x2aaabe1db610>, <flux.message.Message object at 0x2aaabe1db710>, None)
<flux.mrpc.MRPC object at 0x2aaabe1dba90>
flux-mping: mecho: pad=0 seq=0 time=7.485 ms
ipa15@herbein1:flux mping --count 1 0
got here, I swear. topic: mrpc.mecho, payload_str: { "dest": "0", "vers": 12, "sender": 0, "path": "mrpc.27A434E68F824D80AEDC91A4DDD4DF7B" }
flux-broker: [1436309317.156361] pymod.info[0] in impl, args:(<flux.core.MessageWatcher object at 0x2aaabe1db610>, <flux.message.Message object at 0x2aaabe1db710>, None)
<flux.mrpc.MRPC object at 0x2aaabe1e60d0>
flux-broker: src/zmsg.c:79: zmsg_destroy: Assertion `zmsg_is (self)' failed.
flux-mping: 0: no response
flux-start: 0 (pid 143781) Aborted

For the sake of completeness, here is the stack trace from GDB for the echo/mping assert failure:

Core was generated by `/g/g17/herbein1/local/flux-core-twrs/libexec/flux/flux-broker --size=1 --rank=0'.
Program terminated with signal 6, Aborted.
#0  0x00002aaaac54a625 in raise (sig=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:64
64        return INLINE_SYSCALL (tgkill, 3, pid, selftid, sig);
Missing separate debuginfos, use: debuginfo-install czmq-3.0.0.rc1-1.ch5.3.x86_64 json-c-0.11-12.ch5.3.x86_64 libgcc-4.4.7-11.el6.x86_64 libsodium-0.4.5-3.el6.x86_64 libstdc++-4.4.7-11.el6.x86_64 libuuid-2.17.2-12.18.el6.x86_64 munge-libs-0.5.11-1.ch5.1.1.x86_64 openpgm-5.1.118-3.el6.x86_64 zeromq4-4.0.5-2.ch5.3.x86_64
(gdb) bt
#0  0x00002aaaac54a625 in raise (sig=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:64
#1  0x00002aaaac54be05 in abort () at abort.c:92
#2  0x00002aaaac54374e in __assert_fail_base (fmt=<value optimized out>, assertion=0x2aaaab83faea "zmsg_is (self)", file=0x2aaaab83fadf "src/zmsg.c", line=<value optimized out>, function=<value optimized out>) at assert.c:96
#3  0x00002aaaac543810 in __assert_fail (assertion=0x2aaaab83faea "zmsg_is (self)", file=0x2aaaab83fadf "src/zmsg.c", line=79, function=0x2aaaab840840 "zmsg_destroy") at assert.c:105
#4  0x00002aaaab80c93a in zmsg_destroy () from /usr/lib64/libczmq.so.1
#5  0x00002aaaaaf12c93 in flux_msg_destroy (msg=0x2aaac41d4010) at message.c:201
#6  0x00002aaaaaf10a40 in handle_cb (loop=<value optimized out>, hw=<value optimized out>, revents=<value optimized out>) at reactor.c:381
#7  0x00002aaaaaf2b27d in ev_invoke_pending (loop=0x2aaac4008250) at ev.c:3088
#8  0x00002aaaaaf2f8aa in ev_run (loop=0x2aaac4008250, flags=0) at ev.c:3488
#9  0x00002aaaaaf10d5b in flux_reactor_start (h=<value optimized out>) at reactor.c:127
#10 0x00002aaabe404e69 in _cffi_f_flux_reactor_start (self=<value optimized out>, arg0=<value optimized out>) at _core.c:6578
#11 0x00002aaaafc2c0ec in ext_do_call (f=<value optimized out>, throwflag=<value optimized out>) at Python/ceval.c:4345

@trws
Copy link
Member Author

trws commented Jul 7, 2015

That makes sense now. I made a mistake updating to the new interface, where I had the message wrapper destroying the message passed into the callback when its destructor ran. It was valid behavior for the old handler interface, but not for the new one. Assuming this version passes travis, give it a try. Your module should work with it now.

@@ -30,9 +30,18 @@ flux_broker_SOURCES = \
shutdown.h \
shutdown.c

cmbd_LDFLAGS = -export-dynamic
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Automake is complaining about this line:

src/broker/Makefile.am:33: variable `cmbd_LDFLAGS' is defined but no program or
src/broker/Makefile.am:33: library has `cmbd' as canonical name (possible typo)

Should this be under broker_LDFLAGS (or something similar)?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, that line can just be removed now. It was originally added when the bindings operated on only dynamic symbols. Now that they link directly to the external interfaces it just doesn't matter anymore.

@trws
Copy link
Member Author

trws commented Jul 16, 2015

Something that doesn't have to be done now, but that I have been thinking about especially since I put in the test coverage checks, is it would probably be good to decide on a single model of access to this set of bindings. What I mean is that it could be all object based, where you always have to call on a flux object or kvs object or whatever for all actions, or a free-function interface. No matter what we do there will always be a mix of these, since some things just don't make sense as one or the other, but it came to mind because we have almost a complete duplication of functionality in the KVS API between both types.

The kvs_* functions basically all take a flux_t and do something on the kvs directly, where the kvsdir_* functions take a kvsdir but otherwise work similarly. I've been thinking I should probably just pick one of these and remove the other, but I would rather like to have your opinions on how this should look. The free functions, ideally, would be of the form flux.<function> or flux.kvs.<function> where the object style would be to create a class that holds the handle and just has all of the related functions on it as methods. Where this bugs me is that we have some functions that don't take the handle, and exposing them sensibly in an object-only model is not entirely straightforward... Perhaps this is something we'll tackle as we go. At present the ones in core are accessed as methods on a Flux() object as a side-effect of the way the wrapper works, and not wanting to have to have a class instance masquerade as a module (which would do what we want, but at the cost of being a very advanced thing to debug if something goes wrong).

@grondo
Copy link
Contributor

grondo commented Jul 16, 2015

Is there any kind of option for "verbose on failure?" It's possible that there was some errant test print or something in my version that mucked it up, but it should always fail in that case. This is making me wonder now...

I wish there was.. I think I could add this to specific tests, or we could enhance sharness' test_cmp shell function (or maybe write our own in sharness.d/flux-sharness.sh)

Failed tests keep their log files around in the trash directory-*, maybe the best thing for now would be to have an on_failure: for travis that post-processes logfiles on make check failure (though this will be a bit tricky because we are doing make distcheck)

@garlick
Copy link
Member

garlick commented Jul 16, 2015

Remember the KVS interfaces will be changing. In #76 we were discussing mapping kvs operations to flux_rpc_t, and having returned values including kvsdir_t owned by the flux_rpc_t (destroyed by flux_rpc_destroy()). Does that affect your thinking at all on how these should be packaged up in the python binding?

In reworking the kvs API should we try some variant of #176, where module interfaces look much more like objects, kvs->get() and the like?

@trws
Copy link
Member Author

trws commented Jul 16, 2015

Quite so, that's part of why I'm leaving it to later, but as a general policy it would be good to straighten out before there are too many users of the interface.

Seeing #76, which had slipped by me, explains some of the misinterpretations I had in #271 also. Combining #176 ideas with @grondo's last comment in #76 could make a whole lot of sense, having rpc as the base type for a kvs type that would then expose a set of functions, or even just be passed to free functions, would be pretty ideal for me. Having a generic RPC object used for it would be OK, but it would make the internal usage in the bindings a little odd type-wise. Perhaps doing a little object-oriented C would take care of that though, defining a kvs object as something like this?

typedef struct{
  // vtable?
  flux_rpc_t rpc;
  kvs_specific_cache/data?
} flux_kvs_rpc_t;
int kvs_get(flux_kvs_rpc_t *rpc...){
  flux_rpc_get(rpc...);
}

It would have the advantage of being polymorphic with RPCs, so then and even destroy could be used just the same way though destroy would require a vtable, as long as we ensure there is no type slicing. Having either a set of free functions that take a type as the first argument or a struct containing a set of function pointers would wrapping them with the bindings pretty much trivial. The function pointers would be a little more complicated, because my current wrapper generator doesn't have functionality to detect and wrap these things, but CFFI has native function pointer support so it wouldn't be all that hard to add.

@garlick
Copy link
Member

garlick commented Jul 17, 2015

I just hit the same wreckrun hang in PR #273 which is doc only so it would seem that python bindings are likely not to blame :-)

Any objections if we merge this?

@trws
Copy link
Member Author

trws commented Jul 17, 2015

I'm good with it at this point. Not to say that there's no work to do on it going forward, but it's reasonably stable and I think it's where it needs to be to be usable.

@garlick
Copy link
Member

garlick commented Jul 17, 2015

Just had a thought - is flux-python required or could we just prepend the path to the configured python interpreter to PATH in the flux command driver and then flux python would work and so would #!/usr/bin/env python?

@grondo
Copy link
Contributor

grondo commented Jul 17, 2015

@garlick, good idea, can't python PATHs be handled similar to Lua path fixups in the flux cmd driver? (There might be some python specific reason why not, which is why I'm asking)

In general for this pull request it would be nice if it were cleaned up a bit more so that all the refactoring was squashed together, unless there is a good reason to keep that in the history for all time (e.g. @SteVwonder's commits should stay his). For example, broker/Makefile.am is changed in the first commit then that change is reverted later. This is probably noise in your working history at this point and could be removed.

At this point, though, I don't think it is quite as important to maintain clean pull requests as it will be in the future, and I know you've already put quite a lot of work into cleaning up the history here. So if you do feel the commits are cleaned up enough I'd be fine with merging.

@trws
Copy link
Member Author

trws commented Jul 17, 2015

The PYTHONPATH can certainly be set in the cmd driver, an in-interpreter version of this is in py_mod already actually, but I thought it might be useful to have a way to invoke the python with which flux built its bindings. In many cases that may not matter, but since our connector modules are compiled they may not work reliably with another version of the interpreter than the one they were compiled with.

Thomas R. W. Scogland and others added 3 commits July 17, 2015 10:40
bindings/python:python module development

This version works, up to the point of trying to call msg_decode.  I can't
seem to find a way to get ctypes to address the flux_msg_decode function even
if it's loaded in an extra shared library.  The symbol must be masked off in
  loading by a version in a previously loaded binary, but that symbol doesn't
  appear with dlsym.  Hopefully we can fix this with a documented external
  API or by compiling some real python bindings.

slightly closer, event decode is still not a visible symbol however, which worries me

python module now working with new flux core stuff, preparing for rebase

working on python module stability

full commit for python bindings

Small tweak to fix distcheck

adjusting travis configuration to install easy_install

removing errant test print

tweaks to dependency resolution to fix distcheck from a clean clone

adding a transitive dependency that was not evident until testing against travis

fixed a mis-spelled make variable

fixing up shebang line issue in python tests to use non-redhat path for env

fixing up some error checking on the kvsdir constructor

removed import of alternate tap module from unit-tests

re-worked to target the new watcher API

Fixed watcher and KVSWatch issues related to the switch to the new API.  More
watchers still need to be added, but they are handled more as objects now.

adding a check to ensure the C destroy is only called once

bindings/python:Fixes based on @SteVwonder's feedback

adding some docstrings

removed spurious code from Watcher

fixes for module load path, kvs interactions, wrapper performance regression, and the example module

bindings/python:refactoring and build fixes

Re-factoring in the python interface

The external interface has not changed noticeably, but the internal organization
has been re-worked to keep files from getting unwieldly.

Adding the new sub-directory of package files

adding missing barrier file

Stopped destructing un-owned object

The message watcher was destructing the message, as the old interface
expected, causing the new interface to crash.

adding in jsc.py

adding in sec.py

fixing wrong module import all in core

bindings/python:Removing cmbd_LDFLAGS line

The `-export-dynamic` flag is no longer necessary now that the python bindings link explicitly to the libraries rather than using runtime symbol information.

bindings/python:fixed error check for isdir and added kvs.walk to walk the full system
fixed relative kvs directory acceses

kvs walk works from relative paths now

added a success_code to kvs calls

calls that pick up the spurious EBADF can now "safely" ignore it based
on the return value == the success code

Addressing error-checking issue from flux-framework#253

This version checks the return from integer and pointer returning functions
and ensures that the value is a potential error before raising an exception on
errno.  To make the actual wrapper more extensible it is now a callable object
masquerading as a method instead of a closure.  This makes each a bit easier
to reason about.
bindings/python:fixing pre-refactor argument, adding walk test

bindings/python:Added specific-type gets to KVSDir

If you know what the target is, it's better to get it directly than have the
interface try and guess.

bindings/python:cleanup and formatting to address some of the more obvious pylint issues

bindings/python:updates to the message wrapper class

Handling of messages in echo was a little off.  Added a property to get/set
the message type and to get the type as a human-readable string, which helps
with debugging messages.

test/python:testing coverage, isolation and ergonomics improvements

This adds a coverage checking target, a large number of tests, and some general
ergonomics improvements to testing all over the place.  It also fixes some bugs
exposed by the new tests.  This version is sitting at 84% coverage of the
bindings, by number of statements.

altering check setup for python module detection

bindings/python:added a flux-python command

The new command invokes the python interpreter that flux was built with after
adding the flux installations exec prefix to the PYTHONPATH.  Nothing terribly
fancy, but it can be used to ensure that a script or command built on the flux
bindings will find them.

bindings/python: test fixups and re-factor

general cleanup
@trws
Copy link
Member Author

trws commented Jul 17, 2015

Ok, they're squashed down about as far as they'll go. I'm going to add PYTHONPATH setup to the flux driver, should I also remove the command?

@garlick
Copy link
Member

garlick commented Jul 17, 2015

Not if you think it's useful.

trws added 2 commits July 17, 2015 11:38
Along with the path setup, the path configuration has been refactored a bit.
@trws
Copy link
Member Author

trws commented Jul 17, 2015

Really there should be a better way to do this... maybe a flux command to launch something in the appropriate environment given a PID? I could probably throw something together for that.

@grondo
Copy link
Contributor

grondo commented Jul 17, 2015

Should this problem have its own issue opened?
I'm not sure I fully understand the need being met here yet. (Though in
general I like @trws idea of a flux command to perform the work)
Also, what flux-start allowed for "named" sessions (think tmux -s) That
would of course only work for local instances, but it seems like testing is
the target environment here?

On Fri, Jul 17, 2015 at 11:57 AM, Tom Scogland [email protected]
wrote:

Really there should be a better way to do this... maybe a flux command to
launch something in the appropriate environment given a PID? I could
probably throw something together for that.


Reply to this email directly or view it on GitHub
#236 (comment)
.

@trws
Copy link
Member Author

trws commented Jul 17, 2015

That's part of it. In a more general sense we'll need a way to communicate with a target onstage at some point, but for now just being able to construct named sessions would be a big step forward. It's largely orthogonal to the bindings, but since they contain a workaround to do this for their tests it makes sense that it came up here.

Sent with Good (www.good.com)


From: Mark Grondona
Sent: Friday, July 17, 2015 1:32:23 PM
To: flux-framework/flux-core
Cc: Scogland, Thomas Richard William
Subject: Re: [flux-core] Python bindings, python module, etc. (#236)

Should this problem have its own issue opened?
I'm not sure I fully understand the need being met here yet. (Though in
general I like @trws idea of a flux command to perform the work)
Also, what flux-start allowed for "named" sessions (think tmux -s) That
would of course only work for local instances, but it seems like testing is
the target environment here?

On Fri, Jul 17, 2015 at 11:57 AM, Tom Scogland [email protected]
wrote:

Really there should be a better way to do this... maybe a flux command to
launch something in the appropriate environment given a PID? I could
probably throw something together for that.


Reply to this email directly or view it on GitHub
#236 (comment)
.


Reply to this email directly or view it on GitHubhttps://github.com//pull/236#issuecomment-122404334.

@grondo
Copy link
Contributor

grondo commented Jul 17, 2015

Ok, I opened #274. I'm not familiar with the terminology "communicate with a target onstage" so hopefully I've captured the short term need you're talking about.

@trws
Copy link
Member Author

trws commented Jul 17, 2015

Ah, I meant to type "target instance," and apparently missed an unfortunate autocorrect on my phone.

@garlick
Copy link
Member

garlick commented Jul 17, 2015

Beer beckons. Merge?

@trws
Copy link
Member Author

trws commented Jul 17, 2015

I think I addressed the remaining points. @grondo?

@grondo
Copy link
Contributor

grondo commented Jul 17, 2015

Yeah. I ACK

On Fri, Jul 17, 2015, 16:13 Tom Scogland [email protected] wrote:

I think I addressed the remaining points. @grondo
https://github.com/grondo?


Reply to this email directly or view it on GitHub
#236 (comment)
.

garlick added a commit that referenced this pull request Jul 17, 2015
Python bindings, python module, etc.
@garlick garlick merged commit 1dfedac into flux-framework:master Jul 17, 2015
@garlick garlick removed the review label Jul 17, 2015
@garlick
Copy link
Member

garlick commented Jul 17, 2015

Very nice work @trws.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants