Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[core/process] Fix/Implement redirections <>, 5>&-, 6>&5-, {fd}>file, etc. #672

Merged
merged 17 commits into from
Mar 21, 2020

Conversation

akinomyoga
Copy link
Collaborator

@akinomyoga akinomyoga commented Mar 21, 2020

1. 328e636 Support redirections 5<>, 6>&5- and 6>&-

2. 7ea281a Update test

The arguments don't match with the function declaration? I guess the test is old. If this is not the right fix, please let me know.

3. 8dd5a02 Add tests for redirections

This demonstrates bugs related to the subsequent fixes (4., 5., and 6.) and also includes tests for 20> file and a Bash bug.

4. 10136ce Fix "Bad file descriptor" on 3>&3

This is a bug fix. The file descriptor was broken with echo 3>&3. The following test case (in spec/redirect.test.sh) had been failing due to this bug.

#### : 3>&3 (OSH regression)
: 3>&3
echo hello
## stdout: hello
## BUG mksh stdout-json: ""
## BUG mksh status: 1

5. 17e2f05 Fix fd-restoring order in FdState.Pop

This is another fix. The order of restoring original fds in saved and need_close were mixed. The ordering should be preserved. The following test case (in spec/redirect.test.sh) had failed due to this problem:

#### : 3>&- << EOF (OSH regression: fail to restore fds)
exec 3> "$TMP/fd.txt"
echo hello 3>&- << EOF
EOF
echo world >&3
exec 3>&-
cat "$TMP/fd.txt"
## STDOUT:
hello
world
## END

6. f5c2178 Fix fd-leak on : 5> file

See the following test case. The file descriptor 9 was still alive after the end of the command.

#### : 9> fdleak (OSH regression)
: 9> "$TMP/fd.txt"
( echo world >&9 )
cat "$TMP/fd.txt"
## stdout-json: ""

7. 0af7781 Fix fd-leak on exec 5>&1

The saved fds are not released on FdState.MakePermanent(). See the following example. In Oil, there were remaining fds as many as the invokations of exec 5>&1.

bash$ echo /proc/self/fd/* | wc -w
9
bash$ for i in {1..100}; do exec 5>&1; done
bash$ echo /proc/self/fd/* | wc -w
10

osh$ echo /proc/self/fd/* | wc -w
8
osh$ for i in {1..100}; do exec 5>&1; done
osh$ echo /proc/self/fd/* | wc -w
108

I didn't add a test to spec/redirect.test.sh because I don't know how to make it portable (/proc/self/fd is available in Linux but not portable). I tried to use ulimit -n 100 for this, but it turned out that the limit cannot be changed in Oil (see below). Maybe I could have implemented the test case to generate fds as many as $(ulimit -n), but I'm afraid that $(ulimit -n) might be extremely large in some system (but I don't know actually).

osh$ ulimit -n 100
osh$ ulimit -n
1024

8. da16c45 Support redirection {fd}>, {fd}<, etc.

Also, it includes the support of more-than-single-digit redirections such as 20> file. Actually, I'm not familiar with ASDL, so maybe I'm doing weird things although it works as expected superficially. I would appreciate it if you could check this commit in detail.

Copy link
Contributor

@andychu andychu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is great! It probably fixes some leaks like #503, and #223 is related. This part of the code definitely needed some attention. As mentioned I don't really use these constructs and I have trouble reading them.

As for the schema, I think a sum type like dest = Name(Token name) | Num(int fd) might be cleaner, but I think that can wait for another commit. Right now it may be tricky to share types between two different ASDL files -- I don't remember exactly.

Lately I tend to preserve the Token rather than using string, because it gives better syntax and runtime errors, but that's also a small issue.

I made a couple minor comments and then I think it's ready to merge and we can keep testing more stuff on top of this.

osh/cmd_exec.py Outdated
fd = consts.RedirDefaultFd(n.op.id) if n.fd == runtime.NO_SPID else n.fd
fd = n.fd
fd_name = n.fd_name
if fd == runtime.NO_SPID and not fd_name:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

NO_SPID is supposed to only be for span_id. Maybe make another constant NO_FD ? Or -1 might be OK too.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you! I defined process.NO_FD 1caf56d. If you think NO_FD should be defined in a different place, please let me know.

@@ -303,6 +303,17 @@ def IsKeyword(name):
R(r'[0-9]*<>', Id.Redir_LessGreat),
R(r'[0-9]*>\|', Id.Redir_Clobber),

R(r'\{[_a-zA-Z][_a-zA-Z0-9]*\}<', Id.Redir_Less),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is identical to VAR_NAME_RE right? If so let's use it, maybe with an intermediate like:

FD_VAR_NAME = '\{' + VAR_NAME_RE + '\}'

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you! I defined FD_VAR_NAME cc491ff.

@andychu
Copy link
Contributor

andychu commented Mar 21, 2020

Second thought: maybe we should just have string fd for both 1 and {fd} cases, in just syntax.asdl, or in both files ?

And then in the evaluator can just check isdigit().

But if you don't think that will simplify the code, it's not that important.

@akinomyoga akinomyoga force-pushed the impl-redirections branch 2 times, most recently from e5e82fe to 57324c9 Compare March 21, 2020 14:26
@akinomyoga
Copy link
Collaborator Author

As for the schema, I think a sum type like dest = Name(Token name) | Num(int fd) might be cleaner, but I think that can wait for another commit.

Thank you for all your review! Actually initially I thought about a similar one int | string but was not sure if it is a good idea or not.

Second thought: maybe we should just have string fd for both 1 and {fd} cases, in just syntax.asdl, or in both files?

Thank you for the suggestion! I created commits for this suggestion (74f9d28 for syntax.asdl and 0f2b96b for runtime.asdl). You can have a look on these commits. If you don't like them, I can drop these commits. Of course, additional suggestions are also welcome.


I updated the default fd for <> b81d34e. Also, I applied other cosmetic changes suggested in #671 [58fb6c8 for type annotations, 668d2bd for .tag_(), and 0a096e6 for spaces in default arguments].

Copy link
Contributor

@andychu andychu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK there is one issue about FDs over 9 that I didn't pick up in the original review. Let me know what you think.

else:
fd = runtime.NO_SPID
index = 0
while index < len(op_tok.val) and op_tok.val[index].isdigit():
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Important issue: I think we need a limit on user descriptors so they don't collide with descriptors the shell uses? Mentioned here:

https://www.aosabook.org/en/bash.html

The other complication is one bash brought on itself. Historical versions of the Bourne shell allowed the user to manipulate only file descriptors 0-9, reserving descriptors 10 and above for the shell's internal use. Bash relaxed this restriction, allowing a user to manipulate any descriptor up to the process's open file limit. This means that bash has to keep track of its own internal file descriptors, including those opened by external libraries and not directly by the shell, and be prepared to move them around on demand. This requires a lot of bookkeeping, some heuristics involving the close-on-exec flag, and yet another list of redirections to be maintained for the duration of a command and then either processed or discarded.

Are you using descriptors over 9? I haven't seen shell scripts using more, and POSIX only guarantees up to 9 I think.

Maybe we can limit it to 99 instead of 9 ? But if there's no usage I would keep it at 9 until we encounter it.

core/process.py Outdated


class FdState(object):
"""This is for the current process, as opposed to child processes.

For example, you can do 'myfunc > out.txt' without forking.
"""
def __init__(self, errfmt, job_state):
# type: (ErrorFormatter, JobState) -> None
def __init__(self, errfmt, job_state, mem = None):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

mem=None here

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you! I'm sorry, this was an oversight. Fixed 704ba1e.

@andychu
Copy link
Contributor

andychu commented Mar 21, 2020

One thing that is annoying is that the use of unrestricted descriptors means you can't link arbitrary libraries into the shell that do I/O! Like simply fopen() by a library could break shell scripts. I found that out the hard way ... I believe GNU readline has to take a lot of care about this.

For example, I thought it would be cool to link in libevent / libev / libuv for some kind of event-driven programming / coroutines / state machines in shell. But I think you can't do that if they use low-numbered file descriptors themselves, and open() defaults to the lowest FD if I recall correctly.

@andychu
Copy link
Contributor

andychu commented Mar 21, 2020

Thinking about this more: I would say if ble.sh does not use descriptors over 9, let's just keep it at a single digit, and add a TODO to revisit that code later. The other changes are more important.

If it does use descriptors over 9, we can discuss what the right restriction is.

@akinomyoga
Copy link
Collaborator Author

akinomyoga commented Mar 21, 2020

I think we need a limit on user descriptors so they don't collide with descriptors the shell uses?

The other complication is one bash brought on itself. Historical versions of the Bourne shell allowed the user to manipulate only file descriptors 0-9, reserving descriptors 10 and above for the shell's internal use. Bash relaxed this restriction, allowing a user to manipulate any descriptor up to the process's open file limit. This means that bash has to keep track of its own internal file descriptors, including those opened by external libraries and not directly by the shell, and be prepared to move them around on demand. This requires a lot of bookkeeping, some heuristics involving the close-on-exec flag, and yet another list of redirections to be maintained for the duration of a command and then either processed or discarded.

You don't have to worry about the collisions because Oil has already implemented above mentioned "a lot of bookkeeping, some heuristics involving the close-on-exec flag, and yet another list of redirections to be maintained for the duration of a command and then either processed or discarded" completely, and it should work without problems now. The above explanation is written in a way that it looks like a highly non-trivial matter, but that's an exaggeration. If you implemented it carefully, there is no problem.

Are you using descriptors over 9?

This is a good question. Actually ble.sh does use file descriptors starting from 30 by default for its internal usage in Bash 4.0 and below. From Bash 4.1, ble.sh uses the construct {fd}> to let Bash determine the new file descriptors. The reason why it doesn't use 0-9 is that ble.sh want to keep this range (0-9) of file descriptors for users. The reason why the default value is starting from 30 is that I want to have some margins from the file descriptors 10 which Bash itself uses for its internal usage. In addition, the starting file descriptor of ble.sh internal usage can be configured by the shell variable bleopt_openat_base in case it conflicts with another shell framework.

I think shell libraries/frameworks inevitably use larger values of file descriptors (as far as they don't require the redirection of the form {fd}>) because they need to leave 0-9 for applications. If you restrict the file descriptor range, the problem becomes more severe because an application and shell frameworks have to work with only 7 (excluding 1 and 2) file descriptors. Oil already manages file descriptors properly, so I don't see a reason to restrict the range to 0-9, 0-99 or whatever.

For example, I thought it would be cool to link in libevent / libev / libuv for some kind of event-driven programming / coroutines / state machines in shell.

Ah, OK... It's non-trivial. But can we control the range of the file descriptors used in those libraries? If we can't control it, the problem wouldn't be changed even if we restrict the available range to 0-9. For safe programming, I think one should always use the form {fd}>, etc. (although it is only available from Bash 4.1), and should not use the form INT>, etc.

@andychu
Copy link
Contributor

andychu commented Mar 21, 2020

OK yeah I guess you're right, as long as the program always uses fd_state.Open() then everything works, and third party libs are going to have a problem regardless.

I had to learn things about colliding file descriptors the hard way as described here [1], even though I think I had already read that bit about bash.

I agree that {fd} is the preferred construct for descriptors other than 0, 1, and 2.

I think it would be nice to have a better syntax for this, e.g. basically expose fd_state.Open() in Oil. I will file a bug about that.

[1] http://www.oilshell.org/blog/2017/07/02.html

@andychu andychu merged commit 9cdc686 into oils-for-unix:master Mar 21, 2020
andychu pushed a commit that referenced this pull request Mar 21, 2020
I don't know exactly why I need this, but I noticed that Travis' fd
state is not clean, or at least not the same as the bash on my desktop.

Related to PR #672.
@andychu
Copy link
Contributor

andychu commented Mar 21, 2020

FWIW this test passed on my machine, but on Travis bash did NOT fail, so the test failed.

b627479

I guess this is the same problem where file descriptors are hidden global state that is not specified when opening a program. In Oil I would deprecate it for #673 and that's another reason to avoid direct FD manipulation in shell (again http://www.oilshell.org/blog/2017/08/12.html).

(Although obviously in this case we need it for the test)

@andychu
Copy link
Contributor

andychu commented Mar 21, 2020

Only other problem: this JSON test case is now failing, and I'm able to reproduce it on my machine:

http://travis-ci.oilshell.org/jobs/2020-03-21__19-53-14.wwz/_tmp/spec/oil-json.html

Error closing descriptor 0: [Errno 9] Bad file descriptor
osh I/O error: Bad file descriptor

Looking into it ...


Oh it has to be of this, not sure exactly why but I'll look at it later...

https://github.com/oilshell/oil/blob/master/oil_lang/builtin_oil.py#L288

@akinomyoga akinomyoga deleted the impl-redirections branch March 22, 2020 00:33
@akinomyoga
Copy link
Collaborator Author

Oh it has to be of this, not sure exactly why but I'll look at it later...

I guess fclose(stdin) [ which performs fflush(stdin) and close(fileno(stdin)) ] is missing. So the data buffered in stdin will be output after the file descriptor is closed by the shell redirection.

@andychu
Copy link
Contributor

andychu commented Mar 22, 2020

OK that was a quicker fix than I expected. The issue was that the local file returned by posix.fdopen() is closed when it goes out of scope due to CPython reference counting, but the redirect should be responsible for closing it.

This kind of thing shouldn't be an issue when it's translated to C++, and is another good reason to translate it.

BTW I'm not sure if core/process.py should be translated or rewritten. It uses a lot of I/O which is hard to translate. For example I extracted osh/bool_stat.py which also does a lot of I/O for test -f, etc. That one is very easy to rewrite in C++ so there's a stub in cpp/osh_bool_stat.h. We'll have see about core/process.

I think it's easier to reason about the edge conditions and errno handling in C/C++ too.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants