Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[word_eval] Fix ${#BASH_SOURCE}, ${BASH_SOURCE:offset:length}, etc. #726

Merged
merged 2 commits into from
Apr 23, 2020

Conversation

akinomyoga
Copy link
Collaborator

This fixes the following issues:

  • The parameter expansions of the form ${#BASH_SOURCE} for array variables counts the number of elements instead of the number of characters of the first element.
$ osh -c 'f1() { echo ${#BASH_SOURCE}; }; f1'
1
  • Slices of the parameter expansions ${BASH_SOURCE:offset:length} are used to select array elements
$ osh -c 'f1() { echo ${BASH_SOURCE::1}; }; f1'
-c flag
$ osh -c 'f1() { echo ${BASH_SOURCE:1}; }; f1'
  • set -u doesn't detect undefined first elements with the scalar form $BASH_SOURCE.
# Error with the following form:
$ osh -uc 'echo ${BASH_SOURCE[0]}'
  echo ${BASH_SOURCE[0]}
  ^~~~
[ -c flag ]:1: fatal: Undefined variable

# No errors generated with the following form
$ osh -uc 'echo $BASH_SOURCE'

@akinomyoga akinomyoga changed the title [word_eval] Fix ${#BASH_SOURCE}, ${BASH_COMPAT:offset:length}, etc. [word_eval] Fix ${#BASH_SOURCE}, ${BASH_SOURCE:offset:length}, etc. Apr 22, 2020
@andychu
Copy link
Contributor

andychu commented Apr 22, 2020

This is a good fix, but I find bash's behavior very confusing...

I'm tempted to allow ${BASH_SOURCE} by default and disallow ${#BASH_SOURCE} by default, since it can always be rewritten as ${BASH_SOURCE[0]}.

I think the latter is always clearer, because of the confusion with ${#BASH_SOURCE[@]}.

Arrays are first class: https://github.com/oilshell/oil/wiki/Language-Design-Principles


In fact maybe it's cleaner just to have shopt -s bash_array for everything, and get rid of ALL special cases... however I think that still depends somewhat on #604

This bug is more evidence that #604 would help, because the control flow of this function is pretty confusing.


Let me think about it a little... Thank you for the fix!

@akinomyoga
Copy link
Collaborator Author

In fact maybe it's cleaner just to have shopt -s bash_array for everything,

I created such an option in my private version of Oil for the purpose of testing ble.sh (because it was the largest blocker). Actually, the problems of this PR were found in implementing the option. I also have another patch for $((BASH_LINENO)) (the scalar form doesn't work in arithmetic expressions of the current Oil). That patch is based on this PR, so I will make a PR after this PR is merged.

@andychu
Copy link
Contributor

andychu commented Apr 22, 2020

OK interesting... off the top of my head I would say we should consider these two behaviors:

  1. By default Oil disallows ${array}, ${#array}, ${array:1:1} etc., regardless of which variable it is.
    • But it allows them with shopt -s bash_array (or compat_array. Not sure if other shells with arrays do this.)
  2. Same as above, but Oil allows ${array} only for BASH_SOURCE, etc.

It depends on how much code the special case costs us ...

BTW neofetch patched itse usage of ${array} to ${array[0]}. Isn't that clearer?

I guess I would accept compat_array if it's a small amount of code (which it seems to be), but I am surprised you will want to use it for cases other than BASH_SOURCE (which seems to occur a lot in practice, which is why I added it.)

@andychu
Copy link
Contributor

andychu commented Apr 22, 2020

To clarify the intention: I don't think ${#BASH_SOURCE} or ${BASH_SOURCE:1:1} is common enough to allow by default, since they are confusing.

The user can opt into them with [0] or with compat_array, but by default they should still be a runtime error (or syntax error even)

I was reading bug-bash@ yesterday and people still report array == array[0] as a bug 10 years later, because no other language works like that.

@akinomyoga
Copy link
Collaborator Author

I guess I would accept compat_array if it's a small amount of code (which it seems to be), but I am surprised you will want to use it for cases other than BASH_SOURCE (which seems to occur a lot in practice, which is why I added it.)

Unfortunately, all the 44k+ lines of ble.sh codes have been written under the assumption that ${array} is equal to ${array[0]}. Initially, I thought I can rewrite them to ${array[0]}, but I realized that there are too many places that use the assumption, and also it is difficult to tell which ones are array variables and which ones are scalar variables by appearance. There are even cases that a variable becomes a scalar variable or an array variable depending on the situation.

To clarify the intention: I don't think ${#BASH_SOURCE} or ${BASH_SOURCE:1:1} is common enough to allow by default, since they are confusing.

The user can opt into them with [0] or with compat_array, but by default they should still be a runtime error (or syntax error even)

OK, I agree with that. And, when they are enabled by compat_array option, they should behave like ${#BASH_SOURCE[0]} but not like ${#BASH_SOURCE[@]} as the current Oil does. I think I would change my implementation of compat_array in such a way when I make a PR later.

I was reading bug-bash@ yesterday and people still report array == array[0] as a bug 10 years later, because no other language works like that.

Yeah. Even users who understand ${array} == ${array[0]} are often confused by the fact ${array+set} is actually ${array[0]+set} in the mailing list. Actually, array == array[0] seems to be introduced by ksh93 which implemented arrays in the shell for the first time, so I think it has been already 27 years at least. Bash just inherited the behavior of ksh.

@andychu
Copy link
Contributor

andychu commented Apr 23, 2020

OK sounds good.

Yes I actually "blamed" AT&T (as opposed to GNU) in a link off of the last blog post :)

http://www.oilshell.org/blog/2020/04/release-0.8.pre4.html#appendix-a-more-bad-parts-of-shell

bash is poorly implemented along many dimensions, but I would say arrays are one of the worse areas. But I learned that it got a lot of that bad behavior from ksh, i.e. “bash-isms” are really “ksh-isms”. So the blame for such a bad language to some extent goes back to AT&T and not GNU.

Another interesting thing is this Usenix '94 paper about ksh:

http://www.oilshell.org/archive/ksh-usenix.pdf

It was trying to be like Perl and Python -- e.g. ksh had GUI bindings and people would write applications in it. But I think they didn't do a good enough job on the language, so Perl and Python overtook it.

Python was first developed in 1989 and bash is from 1985 or so, so they aren't even that different in age. I think Perl was around 1985 too. But Python was designed to be a language and that mattered.


OK I will merge this and I think add some failing tests for compat_array. If you already started it I would appreciate that patch too.

@andychu andychu merged commit eebb9b1 into oils-for-unix:master Apr 23, 2020
@andychu
Copy link
Contributor

andychu commented Apr 23, 2020

OK I added a failing test so we don't forget about it:

ee80ef0

Feel free to change it / enhance it, etc. Thanks for the pull requests!

@akinomyoga akinomyoga deleted the fix-bash-array-compat branch April 23, 2020 10:53
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants