Python bindings, python module, etc. #236
Conversation
This is pretty awesome, but not being a python guy I'm a little lost and probably not capable of reviewing beyond superficially. If there are other python people around (@SteVwonder?) could they weigh in and perhaps kick the tires? My main question is what action is required to maintain this binding as our APIs evolve, as they are doing rather quickly right now?

Superficial, but it'd be useful to have changes to non-python space broken out into separate commits, and maybe rebase with better commit names considering the broad scope of the project, and squash down fixes to code in earlier commits in the same PR. I assume you'll get to that as the PR gets closer to merge ready.

I hate to raise this question since I suggested that you submit this as a flux-core PR and didn't think of it then, but should we make bindings separate projects in flux-framework, similar to the way the zeromq community does it? I recall discussing this with @grondo back when he was working on lua, and we decided that lua should be included so that we could leverage it within flux-core, as we have done. I'm sure a similar argument could be made for python; on the other hand, we may make it hard for any one person to change anything in flux-core if we take that too far. Bindings as separate projects does seem to work for zeromq people; there are many bindings, some of which are very out of date, but the ones that get used tend to get pushed forward by the people who use them. Having good processes and API documentation in place for libzmq helps.
We have some options as far as maintenance is concerned. The downside to this approach, compared to explicitly writing a C extension module, is that at compile time there will not be failures in C code that make it clear that API breakage has happened. If I leave it as is, where it's generating bindings dynamically on each run, it also won't fail to link when expected functions disappear; it will only fail when unit tests fail. It would not be all that hard to generate a set of unit tests that exercise every function used by the system; I could probably make a reflection test in an hour or so that would do that.

Alternately, the bindings could be generated once and then checked in. Since they would keep the API snapshotted, link failures would list the API functions that had changed or moved, and possibly make it easier to update. I actually think this is the way to go in the longer term, but as the API is moving pretty quickly, it's nice to have every addition just show up automatically.

I agree about breaking out the non-python changes. I'll put up separate pull requests for those later today and figure out how to re-factor this.

The python bindings could be built entirely separately from the rest of flux; there are no dependencies on flux internals or anything else that isn't exported. It would have to be tweaked to have the make-bindings script work with install paths, but that's not a serious issue. As you note, the downside to that would be that nothing in core could depend on the bindings in that case, so it comes down to whether we want to use them for something in the main project. Since we have Lua, I'm not sure I can make a strong enough case for that. For me, it would be nice to have a 'batteries-included' scripting option for simple things like command front-ends and prototyping.

On that side, I tried writing up a test for the races we discussed yesterday in Lua, but ran aground of the fact that the luaposix on LC doesn't support the
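As an aside, a minimal sketch of the "reflection test" idea mentioned above might look like the following. It assumes the generated cffi binding is importable as a module exposing a `lib` object (the `flux.core.inner` path is an assumption, not the actual layout), and the symbol list here is just a few example names rather than one scraped from the headers.

```python
import unittest

from flux.core.inner import lib  # assumed location of the generated cffi binding

# In a real generator this list would be scraped from the same headers that
# make_bindings.py processes; these names are examples from this thread.
WRAPPED_SYMBOLS = [
    'flux_event_subscribe',
    'flux_event_unsubscribe',
    'flux_msg_watcher_create',
]


class TestAPISurface(unittest.TestCase):
    def test_every_wrapped_symbol_resolves(self):
        # With a dlopen()-style cffi binding, looking up a symbol that has
        # disappeared from the library fails at access time, so touching each
        # name turns silent API breakage into an explicit test failure.
        for name in WRAPPED_SYMBOLS:
            self.assertTrue(hasattr(lib, name), "missing C symbol: %s" % name)


if __name__ == '__main__':
    unittest.main()
```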
On a related note, the recent merge broke both sets of bindings pretty thoroughly...
FWIW, my opinion is that at this point in the project the python bindings would be extremely useful. I was hoping the Lua bindings would offer a rapid prototyping tool during the early phase of the project, but that never really materialized (except for me), and as Tom pointed out, Lua isn't always the most canned solution for building utilities. Once we have a production system, perhaps bindings could be broken out, but given the early state of the project, and the extremely high popularity of python relative to Lua, my argument would be that providing the python bindings in-project at this point will really make it easy for more people to try stuff.

That being said, Jim's comments about maintainability are valid (and I probably didn't make things easy for him with the way the Lua bindings are done). Maybe if there is a script to generate static bindings during the build as Tom suggests, and a suite to test the bindings, that problem could be made tractable. That is just my 2 cents.

ps. what were you trying to do with
I needed some way to have two communicating sequential processes to make the race occur deterministically. Since Lua is pretty close to the metal API-wise, I reached for the old standard of

This is a funny thing: I really like the way that Lua handles coprocesses and lambdas and so forth, but the inability to use regular expressions or low-level APIs without libraries is frustrating.

If the maintainability of the current version is an issue, there are a number of different ways to make it more or less welded to the C API, to the point of making the bindings directly in C like the Lua bindings, or even building the python bindings on top of the Lua bindings. If we were to take it down to C we lose PyPy compatibility (cool sidenote, BTW: we support PyPy right now).

The other thing that is sub-optimal is that I wrote this for python 2.7 because that's what we have on LC. It really should be using python 3+, since python 2 is pretty thoroughly deprecated now, but it's a dependency I don't have a good way to pull in just now.
I just looked back over that last message; apparently I'm a little more loopy than I thought. If any of this makes no sense, feel free to say so...

I echo Jim's sentiments, this is pretty awesome! Everything looks sane, and from what I can tell, the bindings look pretty easy to use. My goal for the weekend is to write a module that uses these bindings and communicates with my simulator, so I can get back to you on Monday with any issues I run into.

Wait a couple hours, or grab a version a bit back from the current head. The watcher API change has invalidated some of the API. By the way, if anything seems funny or non-pythonic, it's early enough that all of this is pretty easy to shift around.

What changes need to be made to make it compatible with Python 3 besides the easy stuff like changing prints to have parentheses? Does Unicode creep in and make the interface w/ C a little more complicated (I remember seeing something about that in the cffi readthedocs)?

@trws there are some *compat bindings for the watchers; you may be able to just pull in the proper header.
Actually the unicode string vs

Clearly this is not performance optimal, but it makes it really easy
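For reference on the Python 3 question, the boundary conversion being alluded to might look roughly like the following. This is a hedged sketch, not the binding's actual code; the helper names are placeholders, and it assumes strings cross the cffi boundary as `char *`.

```python
# Hypothetical helpers: encode text to bytes on the way into C, decode
# char* results on the way back out.

def to_c_string(ffi, s, encoding='utf-8'):
    """Return a cffi char[] suitable for passing as const char *."""
    if isinstance(s, bytes):
        return ffi.new('char[]', s)
    return ffi.new('char[]', s.encode(encoding))


def from_c_string(ffi, c_str, encoding='utf-8'):
    """Copy a char * result back out as a Python text string."""
    if c_str == ffi.NULL:
        return None
    return ffi.string(c_str).decode(encoding)
```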
@grondo I saw that, but it's not quite so simple because they're define-based rather than linkable symbols. I'm most of the way done re-working it to the new API already anyway, I think.

Ok, the new watcher interface is represented in the bindings now.

The non-python changes have been factored out, but I'm going to wait on removing them from here because reverting the header changes will cause immediate breakage.

It looks like echo.py is still using the old msghandler_add rather than the new watcher API.
There also seem to be a few issues with the new MessageWatcher class. Below is a version that works for me:

```python
match = ffi.new('struct flux_match *', {
    'typemask': type_mask,
    'matchtag': match_tag,
    'bsize': bsize,
    'topic_glob': ffi.new("char[]", topic_glob),
})
self.handle = raw.flux_msg_watcher_create(match[0], MsgHandlerWrapper, wargs)
```

I'm not very familiar with PRs on GitHub other than the basics, so what is the best way for me to integrate the above changes into this PR?
Submit your own PR against @trws's python_mod branch?

Thanks for pointing those out @SteVwonder, I'll re-work those parts. So you know, in case you need to use some un-wrapped cffi APIs: that topic_glob needs to be held in something other than the struct at the top level, or it may get collected as garbage. cffi structs don't get searched for liveness, unfortunately.
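To illustrate that caveat, here is a small sketch of the keep-alive pattern being described; the class and attribute names are placeholders rather than the binding's real code.

```python
# Keep the char[] buffer referenced from Python: cffi does not trace pointers
# stored inside a struct, so without a separate reference the buffer backing
# topic_glob could be garbage collected while the struct still points at it.
class WatchMatch(object):  # illustrative name
    def __init__(self, ffi, type_mask, match_tag, bsize, topic_glob):
        self._topic_buf = ffi.new('char[]', topic_glob)  # held for the object's lifetime
        self.match = ffi.new('struct flux_match *', {
            'typemask': type_mask,
            'matchtag': match_tag,
            'bsize': bsize,
            'topic_glob': self._topic_buf,
        })
```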
@trws - three other issues that I'm running into with the binding:
@trws, the modules not being rebuilt makes a lot of sense. (My bad)

How are you currently using the python echo module? Whenever I load it and publish an event from the command line (

I tried making a simpler module that just listens for an event (e.g. "sim.complete") and then prints a static message, but this results in a seg fault caused by

Sorry for all these questions. I'm excited for the possibility of python modules, but haven't quite figured out the bindings/all of the wrappers to the point where I can debug well. Would you prefer me to keep posting questions here in the PR or to just email you directly?
@SteVwonder: I'm testing it as a drop-in replacement for mecho, meaning that I literally do

Would you mind posting the code for the module you're trying? There may be an issue in my callback wrapper that my tests haven't exposed, and it might help me diagnose it. As to where, here seems best since it will be archived with the changeset.
Here is the code:

```python
import syslog
import flux
from flux import core


def handle_complete(h, typemask, message, arg):
    h.log(syslog.LOG_INFO, "Received sim.complete event")
    return 0


def mod_main(h, **arg_dict):
    if h.event_subscribe("sim.complete") < 0:
        h.fatal_error("event subscription failed")
    with core.MessageWatcher(h,
                             type_mask=flux.FLUX_MSGTYPE_EVENT,
                             callback=handle_complete,
                             topic_glob="sim.complete") as msg_watch:
        if h.reactor_start() < 0:
            h.fatal_error("reactor start failed!")
    h.log(syslog.LOG_INFO, "pcomplete unloading")
```

and here is what I ran & the result:
I get the same assert failure (not a seg fault, my bad) with echo and mping:
For the sake of completeness, here is the stack trace from GDB for the echo/mping assert failure:
That makes sense now. I made a mistake updating to the new interface, where I had the message wrapper destroying the message passed into the callback when its destructor ran. It was valid behavior for the old handler interface, but not for the new one. Assuming this version passes travis, give it a try. Your module should work with it now.
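For clarity, here is a rough illustration of the ownership rule behind that fix; the class and names are placeholders and this is not the wrapper code from the PR.

```python
class MessageHandle(object):
    """Placeholder wrapper: destroy() only frees messages this wrapper owns."""

    def __init__(self, msg_handle, owned, destroy_fn):
        self.handle = msg_handle
        self._owned = owned          # True only for messages we created ourselves,
                                     # never for one the reactor handed to a callback
        self._destroy = destroy_fn   # e.g. the raw flux_msg_destroy binding

    def destroy(self):
        if self._owned and self.handle is not None:
            self._destroy(self.handle)
            self.handle = None       # guard against a double free

    def __del__(self):
        self.destroy()
```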
```
@@ -30,9 +30,18 @@ flux_broker_SOURCES = \
	shutdown.h \
	shutdown.c

cmbd_LDFLAGS = -export-dynamic
```
Automake is complaining about this line:

```
src/broker/Makefile.am:33: variable `cmbd_LDFLAGS' is defined but no program or
src/broker/Makefile.am:33: library has `cmbd' as canonical name (possible typo)
```

Should this be under broker_LDFLAGS (or something similar)?
Actually, that line can just be removed now. It was originally added when the bindings operated on only dynamic symbols. Now that they link directly to the external interfaces it just doesn't matter anymore.
Something that doesn't have to be done now, but that I have been thinking about especially since I put in the test coverage checks, is that it would probably be good to decide on a single model of access to this set of bindings. What I mean is that it could be all object based, where you always have to call on a flux object or kvs object or whatever for all actions, or a free-function interface. No matter what we do there will always be a mix of these, since some things just don't make sense as one or the other, but it came to mind because we have almost a complete duplication of functionality in the KVS API between both types. The
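Purely as an illustration of the two access models being weighed, a contrast might look like the following; none of these names are the binding's actual API, and the calls are hypothetical.

```python
# Free-function style: module-level calls that take the handle explicitly.
def kvs_get(handle, key):
    return handle.get(key)          # hypothetical underlying call


# Object style: a directory object carries the handle and current path.
class KVSDirView(object):           # illustrative name
    def __init__(self, handle, path='.'):
        self.handle = handle
        self.path = path

    def __getitem__(self, key):
        # the same operation as kvs_get(), reached through the object instead
        return kvs_get(self.handle, '%s.%s' % (self.path, key))
```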
I wish there was... I think I could add this to specific tests, or we could enhance sharness'

Failed tests keep their log files around in the
Remember the KVS interfaces will be changing. In #76 we were discussing mapping kvs operations to

In reworking the kvs API, should we try some variant of #176, where module interfaces look much more like objects,
Quite so, that's part of why I'm leaving it to later, but as a general policy it would be good to straighten out before there are too many users of the interface. Seeing #76, which had slipped by me, explains some of the misinterpretations I had in #271 also.

Combining #176 ideas with @grondo's last comment in #76 could make a whole lot of sense: having rpc as the base type for a kvs type that would then expose a set of functions, or even just be passed to free functions, would be pretty ideal for me. Having a generic RPC object used for it would be OK, but it would make the internal usage in the bindings a little odd type-wise. Perhaps doing a little object-oriented C would take care of that though, defining a kvs object as something like this?

```c
typedef struct {
    // vtable?
    flux_rpc_t rpc;
    // kvs_specific_cache/data?
} flux_kvs_rpc_t;

int kvs_get(flux_kvs_rpc_t *rpc, ...)
{
    flux_rpc_get(rpc, ...);
}
```

It would have the advantage of being polymorphic with RPCs, so
I just hit the same wreckrun hang in PR #273, which is doc-only, so it would seem the python bindings are likely not to blame :-)

Any objections if we merge this?
I'm good with it at this point. Not to say that there's no work to do on it going forward, but it's reasonably stable and I think it's where it needs to be to be usable.
Just had a thought - is
@garlick, good idea, can't python PATHs be handled similar to Lua path fixups in the

In general for this pull request it would be nice if it were cleaned up a bit more so that all the refactoring was squashed together, unless there is a good reason to keep that in the history for all time (e.g. @SteVwonder's commits should stay his). For example,

At this point, though, I don't think it is quite as important to maintain clean pull requests as it will be in the future, and I know you've already put quite a lot of work into cleaning up the history here. So if you do feel the commits are cleaned up enough I'd be fine with merging.
The PYTHONPATH can certainly be set in the cmd driver (an in-interpreter version of this is in py_mod already, actually), but I thought it might be useful to have a way to invoke the python with which flux built its bindings. In many cases that may not matter, but since our connector modules are compiled, they may not work reliably with a different version of the interpreter than the one they were compiled with.
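As a rough sketch of what such a launcher could do (this is not the script actually added in the PR, and the two constants are placeholders for whatever values the build substitutes):

```python
#!/usr/bin/env python
import os
import sys

FLUX_PYTHON = '/usr/bin/python2.7'            # interpreter flux was built with (placeholder)
FLUX_PYTHON_PREFIX = '/usr/lib/flux/python'   # where the bindings are installed (placeholder)


def main():
    env = dict(os.environ)
    existing = env.get('PYTHONPATH', '')
    env['PYTHONPATH'] = (FLUX_PYTHON_PREFIX + os.pathsep + existing
                         if existing else FLUX_PYTHON_PREFIX)
    # Replace this process with the build-time interpreter, passing along
    # whatever script and arguments the user supplied.
    os.execve(FLUX_PYTHON, [FLUX_PYTHON] + sys.argv[1:], env)


if __name__ == '__main__':
    main()
```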
- bindings/python: python module development. This version works, up to the point of trying to call msg_decode. I can't seem to find a way to get ctypes to address the flux_msg_decode function even if it's loaded in an extra shared library. The symbol must be masked off in loading by a version in a previously loaded binary, but that symbol doesn't appear with dlsym. Hopefully we can fix this with a documented external API or by compiling some real python bindings.
- slightly closer; event decode is still not a visible symbol however, which worries me
- python module now working with new flux core stuff, preparing for rebase
- working on python module stability
- full commit for python bindings
- Small tweak to fix distcheck
- adjusting travis configuration to install easy_install
- removing errant test print
- tweaks to dependency resolution to fix distcheck from a clean clone
- adding a transitive dependency that was not evident until testing against travis
- fixed a mis-spelled make variable
- fixing up shebang line issue in python tests to use non-redhat path for env
- fixing up some error checking on the kvsdir constructor
- removed import of alternate tap module from unit-tests
- re-worked to target the new watcher API
- Fixed watcher and KVSWatch issues related to the switch to the new API. More watchers still need to be added, but they are handled more as objects now.
- adding a check to ensure the C destroy is only called once
- bindings/python: Fixes based on @SteVwonder's feedback. adding some docstrings; removed spurious code from Watcher; fixes for module load path, kvs interactions, wrapper performance regression, and the example module
- bindings/python: refactoring and build fixes. Re-factoring in the python interface: the external interface has not changed noticeably, but the internal organization has been re-worked to keep files from getting unwieldy.
- Adding the new sub-directory of package files
- adding missing barrier file
- Stopped destructing un-owned object. The message watcher was destructing the message, as the old interface expected, causing the new interface to crash.
- adding in jsc.py
- adding in sec.py
- fixing wrong module import all in core
- bindings/python: Removing cmbd_LDFLAGS line. The `-export-dynamic` flag is no longer necessary now that the python bindings link explicitly to the libraries rather than using runtime symbol information.
- bindings/python: fixed error check for isdir and added kvs.walk to walk the full system
- fixed relative kvs directory accesses: kvs walk works from relative paths now
- added a success_code to kvs calls: calls that pick up the spurious EBADF can now "safely" ignore it based on the return value == the success code
- Addressing error-checking issue from flux-framework#253. This version checks the return from integer and pointer returning functions and ensures that the value is a potential error before raising an exception on errno. To make the actual wrapper more extensible it is now a callable object masquerading as a method instead of a closure. This makes each a bit easier to reason about.
- bindings/python: fixing pre-refactor argument, adding walk test
- bindings/python: Added specific-type gets to KVSDir. If you know what the target is, it's better to get it directly than have the interface try and guess.
- bindings/python: cleanup and formatting to address some of the more obvious pylint issues
- bindings/python: updates to the message wrapper class. Handling of messages in echo was a little off. Added a property to get/set the message type and to get the type as a human-readable string, which helps with debugging messages.
- test/python: testing coverage, isolation and ergonomics improvements. This adds a coverage checking target, a large number of tests, and some general ergonomics improvements to testing all over the place. It also fixes some bugs exposed by the new tests. This version is sitting at 84% coverage of the bindings, by number of statements.
- altering check setup for python module detection
- bindings/python: added a flux-python command. The new command invokes the python interpreter that flux was built with after adding the flux installation's exec prefix to the PYTHONPATH. Nothing terribly fancy, but it can be used to ensure that a script or command built on the flux bindings will find them.
- bindings/python: test fixups and re-factor general cleanup
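An illustrative sketch (not the binding's actual code) of the error-checking idea described in the commits above: a callable object wraps each C function, checks integer and pointer returns, and raises an `EnvironmentError` from errno only when the return value looks like a real failure. The class name and the assumption that every non-integer return is a cdata pointer are mine.

```python
import os


class ErrorCheckedCall(object):
    def __init__(self, ffi, fun, success_code=0):
        self.ffi = ffi
        self.fun = fun
        self.success_code = success_code   # lets callers treat e.g. a spurious EBADF as success

    def __call__(self, *args):
        result = self.fun(*args)
        if isinstance(result, int):
            failed = result < 0 and result != self.success_code
        else:
            # assume a cdata pointer return: NULL signals failure
            failed = result == self.ffi.NULL
        if failed:
            err = self.ffi.errno
            raise EnvironmentError(err, os.strerror(err))
        return result
```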
Ok, they're squashed down about as far as they'll go. I'm going to add PYTHONPATH setup to the flux driver; should I also remove the command?
Not if you think it's useful.
Along with the path setup, the path configuration has been refactored a bit.
Really there should be a better way to do this... maybe a flux command to launch something in the appropriate environment given a PID? I could probably throw something together for that.
Should this problem have its own issue opened?
That's part of it. In a more general sense we'll need a way to communicate with a target onstage at some point, but for now just being able to construct named sessions would be a big step forward. It's largely orthogonal to the bindings, but since they contain a workaround to do this for their tests it makes sense that it came up here.
Ok, I opened #274. I'm not familiar with the terminology "communicate with a target onstage" so hopefully I've captured the short term need you're talking about.

Ah, I meant to type "target instance," and apparently missed an unfortunate autocorrect on my phone.

Beer beckons. Merge?

I think I addressed the remaining points. @grondo?
Yeah. I ACK.
Python bindings, python module, etc.
Very nice work @trws.
So, this has ended up being a pretty humongous patch apparently. Here's the summary.
Python bindings now live in `src/bindings/python`. The `make_bindings.py` script scrapes header files (no parsing, just comment and directive removal, following only `"` includes) to create scripts that use cffi to generate C-based python binding modules. Because of this, every single externally visible function in our header files is explicitly linked to by the python bindings. That's why some of the header file changes were made: a couple of functions were misspelled, or didn't exist, and only broke when the python bindings tried to link to them, so this ends up being a lint for the headers in an odd kind of way.

Anyway, the extension modules are then wrapped by python modules in the `flux` package. All of the bindings are in pure python, and most of them have been wrapped so they're reasonably usable. Those that haven't can be called directly from python just by circumventing the high-level bindings; for example, the `flux_event_subscribe()` function has no binding, but `f.event_subscribe("evt")` works, because the wrapper searches the library, finds the definition, and wraps the C primitives to accept and return python types. It also converts the setting of errno to non-zero into `EnvironmentError` exceptions, so no errors that use the normal flux conventions can slip by.

The `pymod` module can load arbitrary python modules as flux modules; there's an example echo module that implements mecho's functionality in the directory with it.

Tests are in two forms: ones that need a flux wrapping them and ones that do not. The `test/` directory has the former, `test_commands/` the latter. Now that `sideflux.py` can run an external flux and connect to it based on its `FLUX_TMPDIR`, these can be merged, but that's not done yet. All tests use the python unittest module as an execution framework, and `pycotap` to hijack its reporting and replace it with TAP.

The python support requires python 2.7 and the cffi module, and is on by default for now, but can be disabled with the usual `--disable-python` flag to configure.

Also there are some patches to module loading for error reporting and a couple of other places that were tweaked for debuggability.
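A hedged usage sketch based on the description above; `flux.Flux()` as the handle constructor is an assumption about the API surface, so treat this as illustrative rather than canonical.

```python
import flux

f = flux.Flux()                    # assumed handle constructor
try:
    # No explicit binding exists for flux_event_subscribe(); the wrapper
    # resolves it from the library and converts the argument types.
    f.event_subscribe("evt")
except EnvironmentError as e:
    # errno-style failures surface as exceptions rather than return codes
    print("subscribe failed: %s" % e)
```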