Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Parsing options like 'shopt -u expand_aliases' shouldn't be restricted upon 'source' #1628

Closed
andychu opened this issue May 22, 2023 · 2 comments

Comments

@andychu
Copy link
Contributor

andychu commented May 22, 2023

43. BUG: shopt -u expand_aliases fails in scripts sourced with arguments

The problem doesn't arise when the script is sourced without arguments. It happens only when the script is sourced with at least one argument.

$ cat test-t4-s1.sh
#!/usr/bin/env bash

shopt -u expand_aliases
$ cat test-t4.sh
#!/usr/bin/env bash

echo '<A>'
source test-t4-s1.sh
echo '</A>'

echo '<B>'
source test-t4-s1.sh
echo '</B>'

echo '<C>'
source test-t4-s1.sh a
echo '</C>'

echo '<D>'
shopt -u expand_aliases
echo '</D>'

echo '<E>'
set -- a
shopt -u expand_aliases
echo '</E>'

$ osh test-t4.sh
<A>
</A>
<B>
</B>
<C>
  shopt -u expand_aliases
  ^~~~~
test-t4-s1.sh:3: fatal: Syntax options must be set at the top level (outside any function)
</C>
<D>
</D>
<E>
</E>

I'm not sure what is the expected behavior since osh seems to try to change the behavior of Bash here, but at least the current behavior doesn't match with the error message, and it seems also inconsistent that the behavior changes depending on the number of arguments.


The ble.sh implementation of read -e (i.e. ble/builtin/read -e) still doesn't work.

Full tests running

Instead, the full tests are now running. The full tests include the tests for the Bash-parser module and the completion module. Actually, three years ago, osh couldn't parse the Bash-parser module, but I didn't report the problems because the "Bash"-parser module doesn't seem to be useful in Osh as not being fully compatible with the "Osh/Oil" syntax. The related tests were also excluded then. Now I tried to run the full tests and realized that osh can now parse the two modules (17k LoC in total) without any problems and workarounds! This is the test summary.

# Result of osh out/ble.osh --test

 84.2% [section] ble/main: 16/19 (3 fail, 0 crash, 0 skip)
 89.3% [section] ble/util: 1078/1206 (128 fail, 0 crash, 28 skip)
 94.1% [section] ble/canvas/trace (relative:confine:measure-bbox): 16/17 (1 fail, 0 crash, 0 skip)
100.0% [section] ble/canvas/trace (cfuncs): 18/18 (0 fail, 0 crash, 0 skip)
 93.5% [section] ble/canvas/trace (justify): 29/31 (2 fail, 0 crash, -1 skip)
 36.3% [section] ble/canvas/trace-text: 4/11 (7 fail, 0 crash, 0 skip)
100.0% [section] ble/textmap#update: 5/5 (0 fail, 0 crash, 0 skip)
 71.4% [section] ble/unicode/GraphemeCluster/c2break: 55/77 (22 fail, 0 crash, 0 skip)
 76.3% [section] ble/unicode/GraphemeCluster/c2break (GraphemeBreakTest.txt): 2481/3251 (770 fail, 0 crash, 0 skip)
100.0% [section] ble/decode: 33/33 (0 fail, 0 crash, 0 skip)
100.0% [section] ble/edit: 2/2 (0 fail, 0 crash, 0 skip)
100.0% [section] ble/syntax: 22/22 (0 fail, 0 crash, 0 skip)
100.0% [section] ble/complete: 7/7 (0 fail, 0 crash, 0 skip)

For comparison, here is the results within Bash:

# Result of bash out/ble.sh --test

100.0% [section] ble/main: 19/19 (0 fail, 0 crash, 0 skip)
100.0% [section] ble/util: 1228/1228 (0 fail, 0 crash, 6 skip)
100.0% [section] ble/canvas/trace (relative:confine:measure-bbox): 17/17 (0 fail, 0 crash, 0 skip)
100.0% [section] ble/canvas/trace (cfuncs): 18/18 (0 fail, 0 crash, 0 skip)
100.0% [section] ble/canvas/trace (justify): 30/30 (0 fail, 0 crash, 0 skip)
100.0% [section] ble/canvas/trace-text: 11/11 (0 fail, 0 crash, 0 skip)
100.0% [section] ble/textmap#update: 5/5 (0 fail, 0 crash, 0 skip)
100.0% [section] ble/unicode/GraphemeCluster/c2break: 77/77 (0 fail, 0 crash, 0 skip)
100.0% [section] ble/unicode/GraphemeCluster/c2break (GraphemeBreakTest.txt): 3251/3251 (0 fail, 0 crash, 0 skip)
100.0% [section] ble/decode: 33/33 (0 fail, 0 crash, 0 skip)
100.0% [section] ble/edit: 2/2 (0 fail, 0 crash, 0 skip)
100.0% [section] ble/syntax: 22/22 (0 fail, 0 crash, 0 skip)
100.0% [section] ble/complete: 7/7 (0 fail, 0 crash, 0 skip)

Speed: 30-40x slower than Bash

I run those tests on a virtual machine of Ubuntu 20.04 LTS, where 2 GB RAM is assigned. I noticed that the C++ version of osh is still slower than Bash by an order of magnitude. While the tests take 12 sec in Bash, the same tests take 8 min in the C++ version of osh. It is forty times slower. The time roughly doubles in the Python version of osh. This means that even though the C++ version reduces the execution time by half compared to the Python version, it is still much slower than Bash.

The speed of the C++ version of osh seems to be much more similar to the Python version of osh than Bash. Though the optimization of the GC seems to be undergoing, is what makes this large performance degradation known? I'm not sure how the translation to C++ is designed, nor I haven't checked the generated C++ codes, but I hardly think it is just an issue of a GC. Or in other words, it is unlikely that the GC occupies 97% (= 39/40) of the total execution time. This is just my random thinking without looking at any codes, but I suspect that the translated version succeeds some aspects of the object/execution model of Python, e.g., an object is generated for every single operation, etc.

Footprint: 10x larger than Bash

I also checked the memory footprints with oshrc.test-util, though I haven't measured it for the full tests, Bash uses about 21MB but the C++ version of osh uses about 230MB (with cache on disk for some data). When there is no cache on disk, ble.sh runs an additional initialization where osh consumes an extra 90MB (i.e., about 320MB in total). On the other hand, Bash doesn't seem to change the footprint by the additional initialization.

As osh seems to use a GC, the footprint difference could be understandable because I can imagine that the GC wouldn't collect the garbages at all as far as there is sufficient free space.

Originally posted by @akinomyoga in #1069 (comment)

@andychu andychu changed the title shopt -u expand_aliases behavior Parsing options like 'shopt -u expand_aliases' are restricted upon 'source' May 22, 2023
@andychu andychu changed the title Parsing options like 'shopt -u expand_aliases' are restricted upon 'source' Parsing options like 'shopt -u expand_aliases' shouldn't be restricted upon 'source' May 22, 2023
@andychu
Copy link
Contributor Author

andychu commented May 22, 2023

Great bug! This is an error in our core/state.py logic

andychu pushed a commit that referenced this issue May 24, 2023
This is #1628.

There are non-proc calls like 'source foo.sh a b c' and 'FOO=bar' that
push the stacks in state.Mem().  So we have more subtle logic.

It would be nice if mem.debug_stack had more structured data.  It could
tell us about 'source' vs function calls.  What they have in common is
the argv_stack.
@andychu
Copy link
Contributor Author

andychu commented Jun 29, 2023

Released a few weeks ago: https://www.oilshell.org/blog/2023/06/release-0.16.0.html

@andychu andychu closed this as completed Jun 29, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

1 participant