Parsing options like 'shopt -u expand_aliases' shouldn't be restricted upon 'source' #1628

andychu · 2023-05-22T23:46:38Z

43. BUG: `shopt -u expand_aliases` fails in scripts sourced with arguments

The problem doesn't arise when the script is sourced without arguments. It happens only when the script is sourced with at least one argument.

$ cat test-t4-s1.sh
#!/usr/bin/env bash

shopt -u expand_aliases
$ cat test-t4.sh
#!/usr/bin/env bash

echo '<A>'
source test-t4-s1.sh
echo '</A>'

echo '<B>'
source test-t4-s1.sh
echo '</B>'

echo '<C>'
source test-t4-s1.sh a
echo '</C>'

echo '<D>'
shopt -u expand_aliases
echo '</D>'

echo '<E>'
set -- a
shopt -u expand_aliases
echo '</E>'

$ osh test-t4.sh
<A>
</A>
<B>
</B>
<C>
  shopt -u expand_aliases
  ^~~~~
test-t4-s1.sh:3: fatal: Syntax options must be set at the top level (outside any function)
</C>
<D>
</D>
<E>
</E>

I'm not sure what is the expected behavior since osh seems to try to change the behavior of Bash here, but at least the current behavior doesn't match with the error message, and it seems also inconsistent that the behavior changes depending on the number of arguments.

The ble.sh implementation of read -e (i.e. ble/builtin/read -e) still doesn't work.

Full tests running

Instead, the full tests are now running. The full tests include the tests for the Bash-parser module and the completion module. Actually, three years ago, osh couldn't parse the Bash-parser module, but I didn't report the problems because the "Bash"-parser module doesn't seem to be useful in Osh as not being fully compatible with the "Osh/Oil" syntax. The related tests were also excluded then. Now I tried to run the full tests and realized that osh can now parse the two modules (17k LoC in total) without any problems and workarounds! This is the test summary.

# Result of osh out/ble.osh --test

 84.2% [section] ble/main: 16/19 (3 fail, 0 crash, 0 skip)
 89.3% [section] ble/util: 1078/1206 (128 fail, 0 crash, 28 skip)
 94.1% [section] ble/canvas/trace (relative:confine:measure-bbox): 16/17 (1 fail, 0 crash, 0 skip)
100.0% [section] ble/canvas/trace (cfuncs): 18/18 (0 fail, 0 crash, 0 skip)
 93.5% [section] ble/canvas/trace (justify): 29/31 (2 fail, 0 crash, -1 skip)
 36.3% [section] ble/canvas/trace-text: 4/11 (7 fail, 0 crash, 0 skip)
100.0% [section] ble/textmap#update: 5/5 (0 fail, 0 crash, 0 skip)
 71.4% [section] ble/unicode/GraphemeCluster/c2break: 55/77 (22 fail, 0 crash, 0 skip)
 76.3% [section] ble/unicode/GraphemeCluster/c2break (GraphemeBreakTest.txt): 2481/3251 (770 fail, 0 crash, 0 skip)
100.0% [section] ble/decode: 33/33 (0 fail, 0 crash, 0 skip)
100.0% [section] ble/edit: 2/2 (0 fail, 0 crash, 0 skip)
100.0% [section] ble/syntax: 22/22 (0 fail, 0 crash, 0 skip)
100.0% [section] ble/complete: 7/7 (0 fail, 0 crash, 0 skip)

For comparison, here is the results within Bash:

# Result of bash out/ble.sh --test

100.0% [section] ble/main: 19/19 (0 fail, 0 crash, 0 skip)
100.0% [section] ble/util: 1228/1228 (0 fail, 0 crash, 6 skip)
100.0% [section] ble/canvas/trace (relative:confine:measure-bbox): 17/17 (0 fail, 0 crash, 0 skip)
100.0% [section] ble/canvas/trace (cfuncs): 18/18 (0 fail, 0 crash, 0 skip)
100.0% [section] ble/canvas/trace (justify): 30/30 (0 fail, 0 crash, 0 skip)
100.0% [section] ble/canvas/trace-text: 11/11 (0 fail, 0 crash, 0 skip)
100.0% [section] ble/textmap#update: 5/5 (0 fail, 0 crash, 0 skip)
100.0% [section] ble/unicode/GraphemeCluster/c2break: 77/77 (0 fail, 0 crash, 0 skip)
100.0% [section] ble/unicode/GraphemeCluster/c2break (GraphemeBreakTest.txt): 3251/3251 (0 fail, 0 crash, 0 skip)
100.0% [section] ble/decode: 33/33 (0 fail, 0 crash, 0 skip)
100.0% [section] ble/edit: 2/2 (0 fail, 0 crash, 0 skip)
100.0% [section] ble/syntax: 22/22 (0 fail, 0 crash, 0 skip)
100.0% [section] ble/complete: 7/7 (0 fail, 0 crash, 0 skip)

Speed: 30-40x slower than Bash

I run those tests on a virtual machine of Ubuntu 20.04 LTS, where 2 GB RAM is assigned. I noticed that the C++ version of osh is still slower than Bash by an order of magnitude. While the tests take 12 sec in Bash, the same tests take 8 min in the C++ version of osh. It is forty times slower. The time roughly doubles in the Python version of osh. This means that even though the C++ version reduces the execution time by half compared to the Python version, it is still much slower than Bash.

The speed of the C++ version of osh seems to be much more similar to the Python version of osh than Bash. Though the optimization of the GC seems to be undergoing, is what makes this large performance degradation known? I'm not sure how the translation to C++ is designed, nor I haven't checked the generated C++ codes, but I hardly think it is just an issue of a GC. Or in other words, it is unlikely that the GC occupies 97% (= 39/40) of the total execution time. This is just my random thinking without looking at any codes, but I suspect that the translated version succeeds some aspects of the object/execution model of Python, e.g., an object is generated for every single operation, etc.

Footprint: 10x larger than Bash

I also checked the memory footprints with oshrc.test-util, though I haven't measured it for the full tests, Bash uses about 21MB but the C++ version of osh uses about 230MB (with cache on disk for some data). When there is no cache on disk, ble.sh runs an additional initialization where osh consumes an extra 90MB (i.e., about 320MB in total). On the other hand, Bash doesn't seem to change the footprint by the additional initialization.

As osh seems to use a GC, the footprint difference could be understandable because I can imagine that the GC wouldn't collect the garbages at all as far as there is sufficient free space.

Originally posted by @akinomyoga in #1069 (comment)

The text was updated successfully, but these errors were encountered:

andychu · 2023-05-22T23:49:08Z

Great bug! This is an error in our core/state.py logic

This is #1628. There are non-proc calls like 'source foo.sh a b c' and 'FOO=bar' that push the stacks in state.Mem(). So we have more subtle logic. It would be nice if mem.debug_stack had more structured data. It could tell us about 'source' vs function calls. What they have in common is the argv_stack.

andychu · 2023-06-29T03:28:27Z

Released a few weeks ago: https://www.oilshell.org/blog/2023/06/release-0.16.0.html

andychu changed the title ~~shopt -u expand_aliases behavior~~ Parsing options like 'shopt -u expand_aliases' are restricted upon 'source' May 22, 2023

andychu changed the title ~~Parsing options like 'shopt -u expand_aliases' are restricted upon 'source'~~ Parsing options like 'shopt -u expand_aliases' shouldn't be restricted upon 'source' May 22, 2023

andychu added the pending-release label May 24, 2023

andychu closed this as completed Jun 29, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Parsing options like 'shopt -u expand_aliases' shouldn't be restricted upon 'source' #1628

Parsing options like 'shopt -u expand_aliases' shouldn't be restricted upon 'source' #1628

andychu commented May 22, 2023

andychu commented May 22, 2023

andychu commented Jun 29, 2023

Parsing options like 'shopt -u expand_aliases' shouldn't be restricted upon 'source' #1628

Parsing options like 'shopt -u expand_aliases' shouldn't be restricted upon 'source' #1628

Comments

andychu commented May 22, 2023

43. BUG: shopt -u expand_aliases fails in scripts sourced with arguments

Full tests running

Speed: 30-40x slower than Bash

Footprint: 10x larger than Bash

andychu commented May 22, 2023

andychu commented Jun 29, 2023

43. BUG: `shopt -u expand_aliases` fails in scripts sourced with arguments