process substitution / anonymous named pipes #66
Comments
I kind of follow what you're getting at. Could you write up a few pbs example use-cases here of how you envision it? It will become more clear to me then. Are you thinking of something like this?
import pbs
pbs.tail(pbs.cat("/tmp/test", _out="fifo"))
Okay, so suppose for example I want to do something like this in bash (this was my original use case):
git config -f <(curl http://raw.github.com/heavenlyhash/projectWhatever/master/.gitmodules) -l
Now I want to do that in PBS, and it's a little tough, but what I ended up hacking into being was this:
import urllib
from contextlib import closing
from pbs import git
# githubRawUrl is assumed to hold the project's raw.github.com base URL
with closing(urllib.urlopen(githubRawUrl + "/.gitmodules")) as f:
    remoteModulesStr = f.read()
git.config("-f", "/dev/fd/0", "-l", _in=remoteModulesStr)
And that works, because /dev/fd/0 is already a magic file on my system that is a fifo that will read from the standard input of that process. In the more general case though, what if stdin is already used by that process for something special? Or I want to do
diff <(curl http://thingy.com/resource1) <(curl http://thingy.com/resource2)
Now that trick with stdin won't work; I need other channels, or several of them. To see what bash is doing here, you can do something like this:
diff <(tail -f /dev/null) <(tail -f /dev/null) &
ps -f | grep diff
...and you'll see something like "diff /dev/fd/63 /dev/fd/62". Possibly exactly that. So, the most direct way to expose this from pbs might look like this:
pbs.diff("/dev/fd/63", "/dev/fd/62", __63=inMemStrA, __62=inMemStrB)
That's a little ugly. Cooler would be maybe more like...
pbs.diff(pbs.stream(inMemStrA), pbs.stream(inMemStrB))
Actually, coolest might be somewhere in the middle. Gimme a syntax to pass arbitrarily numbered channels in and out (and then _in, _out, and _err become mere special cases of that system and are synonymous to __0, __1, and __2), just in case I'm interfacing with some crazy program that uses the higher numbers. Then also have a wrapper object that makes the pbs command invocation step aware that there's something here that should be shunted via an anonymous pipe, and there hide all the numbers (and more importantly, the /dev/ shenanigans) from the library user.
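For illustration, here is a rough sketch of the "wrapper object" idea using only the Python 3 standard library rather than pbs. The names `Stream` and `run` are hypothetical and are not part of pbs; the sketch assumes a system that exposes descriptors under /dev/fd and relies on the `pass_fds` argument of `subprocess.Popen` to keep the pipe open in the child.

```python
import os
import subprocess
import threading


class Stream:
    """Hypothetical marker: an argument whose bytes should arrive via a pipe."""

    def __init__(self, data):
        self.data = data


def _feed(write_fd, data):
    # Closing the write end (on leaving the with-block) signals EOF to the child.
    try:
        with os.fdopen(write_fd, "wb") as pipe_in:
            pipe_in.write(data)
    except BrokenPipeError:
        pass  # the child stopped reading early


def run(*args):
    """Run a command, replacing each Stream argument with a /dev/fd/N path."""
    argv, read_fds, writers = [], [], []
    for arg in args:
        if isinstance(arg, Stream):
            read_fd, write_fd = os.pipe()
            read_fds.append(read_fd)
            writers.append(
                threading.Thread(target=_feed, args=(write_fd, arg.data), daemon=True)
            )
            argv.append("/dev/fd/%d" % read_fd)  # assumes /dev/fd exists
        else:
            argv.append(arg)
    try:
        # pass_fds keeps the read ends open, under the same numbers, in the child.
        proc = subprocess.Popen(
            argv, stdout=subprocess.PIPE, pass_fds=tuple(read_fds)
        )
    finally:
        # The parent no longer needs the read ends once the child has them.
        for fd in read_fds:
            os.close(fd)
    for writer in writers:
        writer.start()
    stdout, _ = proc.communicate()
    for writer in writers:
        writer.join()
    return proc.returncode, stdout
```

With this, something like `run("diff", Stream(a), Stream(b))` would hand diff two /dev/fd paths without the caller ever seeing the numbers. One writer thread per stream avoids a deadlock when the child reads its inputs in an order the caller cannot predict. The sketch does not clean up the write ends if the command cannot be started.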
I think I'm going to hold off on this one for right now, but it will go on the roadmap. The dev branch desperately needs to get finished up and merged to master, and it has a complete rewrite of subprocess.Popen in it. So doing this feature might be easier on the dev branch.
Is there a newer way to do process substitution since this was originally issued?
@brentp negative
for the record, what [...]
pipe([3, 4]) = 0
fcntl(63, F_GETFD) = -1 EBADF (Bad file descriptor)
dup2(3, 63) = 63
close(3) = 0
[...]
clone(child_stack=0, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7fe4425f59d0) = 16235
[...]
execve("/usr/bin/tail", ["tail", "-f", "/dev/fd/63"], [/* 40 vars */]) = 0
and on the
dup2(4, 1) = 1
[...]
write(1, "generated\n", 10) = 10
So it's the same as setting up a pipe (
@amoffat This is a really old suggestion and hasn't seen any support from others for the last many years. Maybe we should just close it without fixing?
I for one would still be interested in this. Granted, my use case is currently handled by just using sh to call a script that handles the named pipes, but in the interest of feature completeness I think it'd be useful to at least keep this on the roadmap.
@ecederstrand I think I tend to agree with you. It is a cool idea, and it seems like when people need it, it would be very, very convenient, but it also seems like people don't need it very often. I'll close it and we can re-open if more momentum builds behind it.
I have a situation where I want to call one program which can only accept a certain kind of input via a file which must be named in the arguments, but I want to feed it content generated from another program.
In other words, I have a situation that would be expressed in bash with a process substitution like this:
(Relevant: https://en.wikipedia.org/wiki/Process_substitution )
In python, I can solve this with a tempfile fairly easily.
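A minimal sketch of the tempfile approach, assuming Python 3.7+ and the standard `subprocess` module rather than pbs. The helper name `run_with_tempfile` is made up for this example, and `cat` stands in for the program that insists on a file name.

```python
import os
import subprocess
import tempfile


def run_with_tempfile(argv_prefix, content):
    """Write content to a temporary file and pass its path as the last argument."""
    # delete=False so the file can be closed (and flushed) before the child opens it.
    with tempfile.NamedTemporaryFile("w", delete=False) as tmp:
        tmp.write(content)
        path = tmp.name
    try:
        result = subprocess.run(
            argv_prefix + [path], capture_output=True, text=True, check=True
        )
        return result.stdout
    finally:
        # Cleanup is our job; a SIGKILL before this point leaves the file behind.
        os.unlink(path)


print(run_with_tempfile(["cat"], "generated\n"), end="")
```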
A step better: I can also solve it with a named pipe with a mkfifo call fairly easily, which gives me the joys of in-memory rather than actually hitting the filesystem needlessly.
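A sketch of the named-pipe variant, again assuming Python 3.7+ on a POSIX system and using `subprocess` rather than pbs. The helper name `run_with_fifo` and the file name `input.fifo` are illustrative only. Creating the FIFO inside a private temporary directory sidesteps picking a unique name, though the directory is still left behind if the process is killed outright.

```python
import os
import subprocess
import tempfile
import threading


def run_with_fifo(argv_prefix, content):
    """Feed content to a command through a named pipe passed as its last argument."""
    with tempfile.TemporaryDirectory() as tmpdir:
        path = os.path.join(tmpdir, "input.fifo")
        os.mkfifo(path, 0o600)

        def feed():
            # open() blocks until the child opens the read end of the FIFO.
            with open(path, "w") as fifo:
                fifo.write(content)

        # Daemon thread: if the child never opens the FIFO, the writer stays
        # blocked, and it should not keep the interpreter alive in that case.
        writer = threading.Thread(target=feed, daemon=True)
        writer.start()
        result = subprocess.run(
            argv_prefix + [path], capture_output=True, text=True, check=True
        )
        writer.join(timeout=5)
        return result.stdout
```

The writer runs in its own thread because opening a FIFO for writing blocks until a reader shows up, and the reader here is the child process that has not been started yet.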
However, that still leaves something to be desired; I have to pick a name for my fifo, and I have to remove it again when I'm done. If I get SIGKILL, I leave a dangling fifo hanging around on my filesystem. What would really be excellent is if I could tap into the magic stuff in the /proc/$pid/fd and /dev/fd/$fd areas common in a linux world... that would give me a system where the kernel itself is functioning as my cleanup.
That example of process substitution in bash up above does something clever like that. If you run that example and then look at what actually happened with ps, you'll see something like this:
Bash created a fifo somewhere where I don't have to worry about it (I think it's somewhere under /proc/ so it just goes away when the processes die?); stdout of the echo writes into the fifo and the reading end of the fifo is made into file descriptor 63 for tail. And then the "/dev/fd/63" part is magic that happens to be a name for the fifo that is fd 63 to the current process.
What would really be excellent is if I could tap into the same level of magic up in the python world.
In the course of writing this, I ended up realizing that I can use "/dev/fd/0" as an argument to get a program to read its own standard in as a file, and since I don't happen to be using stdin already in my current case, this solves my immediate problem. A more general solution would still be excellent, though, and for that we would need the ability to pass arbitrarily numbered file descriptors into child processes, instead of being limited to stdin/stdout/stderr aka 0/1/2.
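Outside of pbs, Python 3's `subprocess` module can already hand a child an extra descriptor through the `pass_fds` argument of `Popen`. The following is a minimal sketch under that assumption; `run_with_inherited_pipe` is a made-up name, `cat` stands in for the real program, and the code assumes the system provides /dev/fd.

```python
import os
import subprocess
import threading


def run_with_inherited_pipe(argv_prefix, data):
    """Pass bytes to a command via an anonymous pipe named as /dev/fd/N."""
    read_fd, write_fd = os.pipe()
    try:
        proc = subprocess.Popen(
            argv_prefix + ["/dev/fd/%d" % read_fd],
            stdout=subprocess.PIPE,
            pass_fds=(read_fd,),  # keep this descriptor open in the child
        )
    except BaseException:
        os.close(write_fd)
        raise
    finally:
        # The child has its own copy; the parent only needs the write end.
        os.close(read_fd)

    def feed():
        try:
            # Closing the write end is what signals EOF to the child.
            with os.fdopen(write_fd, "wb") as pipe_in:
                pipe_in.write(data)
        except BrokenPipeError:
            pass  # the child exited without reading everything

    writer = threading.Thread(target=feed, daemon=True)
    writer.start()
    stdout, _ = proc.communicate()
    writer.join()
    return proc.returncode, stdout
```

Nothing is created on the filesystem here, so the kernel reclaims the pipe when both processes are gone, and stdin, stdout, and stderr stay free for other uses.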
Also, I'm not sure how portable the "/dev/fd/$fd" stuff is; I feel a little uncomfortable hardcoding that in, and bash takes care of it for me, but I have no idea how I'd go about finding out in a cross platform way what the location is for the magic filenames-to-selfprocess-file-descriptors.
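I'm not aware of a single standard-library call that answers this, so one hedged option is to probe for the usual locations at runtime. The sketch below checks /dev/fd first and then the Linux-specific /proc/self/fd; the helper name `fd_path` is hypothetical, and the list of candidate directories is an assumption that may not cover every platform.

```python
import os


def fd_path(fd):
    """Return a filesystem name for an open descriptor of this process, or None."""
    # Candidate directories are an assumption; extend the tuple for other systems.
    for base in ("/dev/fd", "/proc/self/fd"):
        candidate = "%s/%d" % (base, fd)
        if os.path.exists(candidate):
            return candidate
    return None
```

A caller can fall back to a temporary file or a named pipe when this returns None.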