Internals
I have written an explanation of how ble.sh works in a wiki page of Oil: "How Interactive Shells Work · oilshell/oil Wiki".
I created this page because I think it is a good opportunity to summarize the internal implementation of ble.sh.
Please note that the details of the internal implementation may change in the future.
I originally wrote this section in "How Interactive Shells Work · oilshell/oil Wiki" to explain what kinds of APIs are needed to build an interactive interface on top of them.
ble.sh uses bind -x, which binds a user-provided command to a user input sequence.
ble.sh steals all the user inputs from GNU Readline by binding a shell function to all possible byte values 0-255.
The essential idea can be illustrated by the following code (although the actual ble.sh contains many workarounds for old Bash bugs; see lib/init-bind.sh):
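declare i
for i in {0..255}; do
  # untranslate-keyseq stands for a helper that turns the byte value into a keyseq string for bind
  declare keyseq=$(untranslate-keyseq "$i")
  # Bind the byte to a shell function that receives the byte value as its argument
  bind -x "\"$keyseq\": process-byte $i"
done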
There is no explicit main loop in ble.sh.
ble.sh processes received bytes asynchronously, one by one.
In other words, it borrows the main loop of GNU Readline, in which Readline calls the shell functions bound by bind -x.
The input byte stream is decoded into a character stream according to the specified input encoding (default: UTF-8).
The character stream is translated into a key stream by processing the special escape sequences that represent cursor keys, function keys, key modifiers, etc.
Finally, key sequences are constructed from the keys in the key stream based on the current keymaps and are dispatched to various operations.
All of this input processing is implemented in Bash (see src/decode.sh).
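As a rough illustration of the first stage of this pipeline, the following sketch decodes UTF-8 one byte at a time, which is the shape forced by receiving bytes individually through bind -x. The names decode_byte and on_char and the state variables are made up for this example and are not taken from src/decode.sh. It does not reject overlong encodings or surrogates, and it covers only UTF-8.

```bash
#!/usr/bin/env bash
# Hypothetical sketch: an incremental UTF-8 decoder fed one byte per call.

_code=0        # code point assembled so far
_remaining=0   # continuation bytes still expected
decoded=()     # decoded code points (stands in for the next stage)

# Receives one complete code point (decimal).
on_char() { decoded+=("$1"); }

# decode_byte BYTE: BYTE is a decimal value in 0..255.
decode_byte() {
  local -i byte=$1
  if (( _remaining > 0 && (byte & 0xC0) == 0x80 )); then
    # Continuation byte: append its low 6 bits.
    (( _code = (_code << 6) | (byte & 0x3F), _remaining-- ))
    (( _remaining == 0 )) && on_char "$_code"
    return 0
  fi
  # A new sequence starts here; an unfinished previous one is dropped.
  _remaining=0
  if   (( byte < 0x80 ));           then on_char "$byte"
  elif (( (byte & 0xE0) == 0xC0 )); then (( _code = byte & 0x1F, _remaining = 1 ))
  elif (( (byte & 0xF0) == 0xE0 )); then (( _code = byte & 0x0F, _remaining = 2 ))
  elif (( (byte & 0xF8) == 0xF0 )); then (( _code = byte & 0x07, _remaining = 3 ))
  else on_char 65533                # U+FFFD for an invalid lead byte
  fi
  return 0
}
```

Calling decode_byte with 227, 129, and 130 in turn (the bytes 0xE3 0x81 0x82) produces the single code point 12354 (U+3042). Keeping the state in global variables is what lets the decoder resume correctly when the next byte arrives in a separate bind -x invocation.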
Another important Bash feature that ble.sh utilizes is read -t 0, which tests whether the next byte on standard input is already available.
ble.sh uses read -t 0 for polling.
For example, ble.sh implements costly operations (e.g. history load, autosuggestions, filtering of menu items, history search) as a kind of coroutine/fiber and performs them in the background while there is no user input.
When ble.sh detects user input with read -t 0, it suspends the fiber and resumes it after it has finished processing the user input.
ble.sh also uses read -t 0 to detect pasting from the clipboard (assuming that many inputs arriving in a short time are a paste), etc.
(cf. the fiber system is implemented by the functions ble/util/idle.* in src/util.sh).
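The suspend-and-resume idea can be sketched as follows. This is only an illustration under simplifying assumptions: the function and variable names (input_pending, task_resume, task_next, and so on) are invented here and are not the ble/util/idle.* API, and the "costly operation" is just a running sum. The sketch needs Bash 4.0 or later for read -t 0.

```bash
#!/usr/bin/env bash
# Hypothetical sketch of a resumable task that yields to pending user input.

# Succeeds when input is already available on standard input.
input_pending() { read -t 0; }

# State of the task lives in globals so that the task can be resumed later.
task_next=1 task_limit=1000 task_sum=0

# task_resume POLL_COMMAND...
# Runs the task until it finishes (status 0) or until the poll command
# reports pending input (status 1, with the progress preserved).
task_resume() {
  while (( task_next <= task_limit )); do
    (( task_sum += task_next, task_next++ ))
    # Poll only every 100 steps to keep the polling overhead low.
    if (( task_next % 100 == 0 )) && "$@"; then
      return 1
    fi
  done
  return 0
}
```

In an interactive session the task would be driven by task_resume input_pending, called again each time the user input has been handled. Passing the poll command as an argument is a choice made for this sketch so that the scheduling logic can be exercised without a terminal.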
API Requirements: To summarize, ble.sh only requires primitive I/O operations, receive byte (bind -x) and poll (read -t 0), for its essential part.
In other words, Bash/Readline doesn't provide any satisfactory high-level APIs for user-input processing (Bash/Readline provides bind for key bindings, but it has tight limitations).
If a shell were to provide high-level support, a customizable key-binding system and a coroutine system would help users develop interactive interfaces.
ble.sh constructs the terminal control sequences (escape sequences) directly by itself.
First, it determines the graphic attributes (highlighting color, etc.) of each character in the command line (this is another long story, so I'll skip the details).
Next, it calculates the width of each Unicode character (it doesn't currently support combining characters) and determines the display position of each character.
Then it constructs the control sequences to update the changed part (the characters whose colors or positions differ from those in the previous rendering).
Finally, it outputs the constructed sequences to stderr (see src/canvas.sh for primitive layout/rendering functions, and ble/textarea#* in src/edit.sh for command line rendering).
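The "update only the changed part" step can be illustrated with a heavily simplified sketch. It assumes a single-line display, ASCII characters of width 1, and no graphic attributes, none of which holds for the real renderer. The function name render_update is invented for this example. It uses two standard ECMA-48 control functions: CHA (CSI n G, move to column n) and EL (CSI K, erase to the end of the line).

```bash
#!/usr/bin/env bash
# Hypothetical sketch: emit the control sequences that turn the displayed
# text OLD into NEW, rewriting only the part after their common prefix.

render_update() {
  local old=$1 new=$2 i=0
  # Find the length of the common prefix.
  while (( i < ${#old} && i < ${#new} )) && [[ ${old:i:1} == "${new:i:1}" ]]; do
    (( i++ ))
  done
  # Nothing to do when the two strings are identical.
  (( i == ${#old} && i == ${#new} )) && return 0
  # Move to the first changed column, write the new tail, erase leftovers.
  printf '\e[%dG%s\e[K' "$(( i + 1 ))" "${new:i}"
}
```

For example, render_update "echo hello" "echo help" moves the cursor to column 9, writes p, and erases the rest of the line. The function writes to standard output so that the result can be captured; a caller following the scheme described above would redirect it to stderr with >&2.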
When ble.sh calculates the layout, it uses the terminal size, which is available through the special Bash variables LINES and COLUMNS (of course, shopt -s checkwinsize is turned on by ble.sh).
ble.sh also traps SIGWINCH to update the layout and redraw the command line when the terminal size changes.
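A minimal sketch of this size tracking might look like the following. The handler name on_resize and the variables term_cols, term_lines, and redraw_count are hypothetical, and the 80x24 fallback is an assumption of this example for the case where LINES and COLUMNS are unset (for instance, when no terminal is attached).

```bash
#!/usr/bin/env bash
# Hypothetical sketch: track the terminal size and react to SIGWINCH.

shopt -s checkwinsize   # ask Bash to keep LINES and COLUMNS up to date

term_cols=${COLUMNS:-80} term_lines=${LINES:-24}
redraw_count=0

on_resize() {
  # Re-read the size; fall back to 80x24 when the variables are unset.
  term_cols=${COLUMNS:-80} term_lines=${LINES:-24}
  (( redraw_count++ ))
  # A real line editor would recompute the layout and redraw here.
  return 0
}

trap on_resize WINCH
```

The handler only records the new size and counts the events; the actual redraw logic is deliberately left out.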
It should also be noted that prompts are calculated by ble.sh itself by analyzing PS1, so that ble.sh knows the size and cursor movement of the prompt (see ble-edit/prompt/* in src/edit.sh).
When constructing the control sequences, ble.sh also consults terminfo/termcap through the tput command if available (see lib/init-term.sh).
Also, when ble.sh is activated, all output from Bash/Readline is suppressed.
To achieve this, ble.sh redirects the file descriptors of the Bash process by means of exec >... <... redirections.
API Requirements: ble.sh requires a primitive I/O operation, output string (printf).
In addition, a means to get the current terminal size (LINES and COLUMNS) is needed.
The same information can be obtained by external commands such as tput lines and tput cols (ncurses) or resize (xterm utility), yet it is useful to provide it as a builtin feature (as these commands might not be available on the system).
If a shell provided high-level support for this, layout and rendering could be performed by the shell rather than by shell scripts, so that the shell scripts would only have to specify the characters and their graphic attributes.
If the shell provides the prompt calculation, it should also provide the cursor position after the prompt is printed.
A means to suppress/control the I/O of the original shell is also needed.
ble.sh uses eval to execute commands. The commands must be executed in the top-level context (i.e., not in the function scope), so ble.sh uses a form of bind -x slightly modified from the one described in the above section (Processing user inputs):
bind -x "\"$keyseq\": process-byte $i; eval -- \"\$_toplevel_commands\""
Here the shell variable _toplevel_commands is usually empty and contains commands only when some commands should be executed in the top-level context.
ble.sh also needs to adjust the state of the terminal and the TTY handler using special terminal sequences and the external command stty before and after the command execution.
Those adjustments are also included in _toplevel_commands.
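The reason the top-level context matters can be seen without any key binding. In the sketch below, process_byte is a stand-in with a made-up body (the real function decodes input), and the variable names greeting and inside are invented for the example. A declare evaluated inside a function creates a local variable that disappears on return, whereas the same eval placed after the function call runs at the top level.

```bash
#!/usr/bin/env bash
# Hypothetical sketch: defer a command so that it runs at the top level.

_toplevel_commands=

# Stand-in for the function bound by bind -x.
process_byte() {
  # Only record the command; running it here would make `declare`
  # create a variable local to this function.
  _toplevel_commands='declare greeting=hello'
}

# For contrast: eval inside a function scope.
run_inside_function() { eval -- 'declare inside=1'; }

run_inside_function                          # `inside` is gone after return
process_byte; eval -- "$_toplevel_commands"  # same shape as the binding above
_toplevel_commands=

echo "greeting=${greeting-<unset>} inside=${inside-<unset>}"
```

After this runs, greeting is set in the global scope while inside is unset, which is the behavior a user expects from commands typed at the prompt.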
API Requirements: ble.sh requires a means to execute commands in the top-level context (direct eval in bind -x).
ble.sh also uses the external command stty to adjust the pty handler state, which might be better built into the shell.
ble.sh relies on Bash for primitive I/O operations such as read (bind -x), write (printf), select/poll (read -t 0), and file descriptor manipulation (exec redirections).
It also uses bind -x & eval to execute commands in the top-level context.
To properly lay out and render the command line contents, it needs a means to get the current terminal size ($LINES and $COLUMNS) and to detect terminal-size changes (SIGWINCH trap).
There are many heavy operations in the interactive interface of a shell, as described below.
In ble.sh, these operations are performed in the background using some mechanism of concurrency.
- Delayed Load: One example is the initialization of ble.sh. The entire codebase of ble.sh comprises more than 40k lines, and it takes some time to source all the scripts and perform even a minimal initialization. To reduce the start-up time of the Bash interactive session for a better user experience, the main code of ble.sh only contains the basic line editor and the command execution feature (though it is still about 21k lines). The other modules such as syntax analysis (lib/core-syntax.sh ~ 7k lines), the completion engine (lib/core-complete.sh ~ 6k lines), the vim editing mode (keymap/vi.sh ~ 8k lines), and other initialization scripts are loaded in the background after the ble.sh session has started.
- History Initialization: Another example of a heavy operation is the loading of history. ble.sh refers to the Bash command history in line editing to visit and search old commands. To implement this feature, the history needs to be loaded into arrays, which takes some time when there are many history entries. ble.sh initializes the history arrays in the background during idle time, when there is no user input.
- History Search: Another example is the history search, which also takes some time. To enable the user to cancel the search or to provide a progress bar for the search, ble.sh also wants to perform some concurrent operations.
- Completion: Also, complete can be another heavy operation when hundreds or thousands of possible completions are generated. In particular, as ble.sh processes all the possible completions in Bash script, it can take longer than the normal Bash interface.
In Bash, one may create a background subshell with command & for concurrency.
The problem with this method is that it is complicated to synchronize data between the main shell process and the subshell in real time.
Another complication is that the standard output needs to be synchronized between the main process and the background subshells, or the background subshells must not output anything to the standard output.
Also, launching a new process by fork has some computational cost.
To avoid these problems, ble.sh runs (mostly) in a single process/thread but uses concurrency mechanisms similar to coroutines or fibers.
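The synchronization problem with background subshells is easy to reproduce. In this small example (the variable name counter is arbitrary), the assignment happens in a forked copy of the shell and never reaches the main process:

```bash
#!/usr/bin/env bash
# A background subshell works on a copy of the shell state.

counter=0
( counter=42 ) &   # runs in a forked subshell
wait               # wait for the subshell to finish
echo "counter=$counter"
```

The main shell still sees counter=0. Getting a result back requires an explicit channel such as a pipe or a temporary file, which is the complication that a single-process design avoids.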
There are two major concurrency frameworks in ble.sh.
One is ble/util/idle
and the other is
3. Internal and external states
The shell settings and the terminal settings for the line editor and for
command execution are in general different. For example, the "echo" of the user
input is desired during command execution, while we don't want the "echo" of
the user input when the line editor is in the foreground. In ble.sh, the
setup for the line editor is called the internal state, and the setup for
command execution is called the external state. ble.sh switches many
settings when it transitions from the line editor mode to the command
execution mode and vice versa.
ble.sh adjusts the necessary part of the settings in its internal state,
while it tries to preserve the external state for the command execution as much
as possible. In principle, ble.sh saves the external settings when
it switches to the internal state, and restores them when it
switches to the external state. However, for various reasons, some settings
cannot be preserved or are intentionally changed.
ble.sh uses the POSIX utility stty to adjust the state of the TTY handler.
However, the available options depend on the system, and the detailed interface
of stty also depends on the system. For this reason, it is impossible to
fully specify the TTY options for the internal state. Assuming that the
external TTY state is not too strange, ble.sh copies the external TTY state
and changes the parts that typically need adjustment. If the external TTY
state is totally broken, ble.sh may not work as expected.
By default, ble.sh does not save and restore the external TTY state. There
are two reasons. One is that the external TTY state can be broken by a crash
of the executed command, where the necessary cleanup is missed. If the
external TTY state were fully restored, the broken state would affect the
behavior of the succeeding commands. By default, we assume that a broken
external state is not intended. The other reason is that saving and restoring
the TTY state requires the additional overhead of fork/exec, which is normally
negligibly small but can be noticeable on systems like Cygwin, WSL, and Termux,
or when the system has a high load average. To make ble.sh fully save and
restore the external TTY state, please use the option bleopt term_tty_restore.
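For illustration, a full save and restore of the TTY state with stty could be sketched as below. This is not how ble.sh implements term_tty_restore. The function names enter_internal and leave_internal and the variable saved_tty are invented, and turning off only echo is a simplification based on the "echo" example given earlier. The sketch relies on stty -g, which POSIX defines as printing the current settings in a form that can be passed back to stty.

```bash
#!/usr/bin/env bash
# Hypothetical sketch: save the TTY state, adjust it, and restore it later.

saved_tty=

enter_internal() {
  [[ -t 0 ]] || return 0     # stdin is not a terminal: nothing to adjust
  saved_tty=$(stty -g)       # one fork/exec to save the state
  stty -echo                 # another one to adjust it
}

leave_internal() {
  [[ -n $saved_tty ]] || return 0
  stty "$saved_tty"          # a third one to restore it
  saved_tty=
}
```

Each call to stty starts an external process, so a round trip through enter_internal and leave_internal costs three fork/exec operations per executed command. That is the overhead mentioned above.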
Since ble.sh itself is written in Bash script, some strange shell settings
can break the line editor.
For example, if there are any shell functions or aliases that change the
behavior of the builtin commands, those shell functions and aliases are
removed in the internal state. ble.sh still tries to restore the shell
functions and aliases for the external state, but some commands, such as
builtin, will not be restored.
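One way to remove and later restore user functions that shadow builtins is sketched below. The helper names hide_functions and restore_functions and the variable saved_function_defs are hypothetical, and the sketch handles only functions, not aliases. It relies on declare -f printing a function definition in a form that can be evaluated again.

```bash
#!/usr/bin/env bash
# Hypothetical sketch: temporarily remove functions that shadow builtins.

# A user-defined override of a builtin (example only).
pwd() { echo "custom pwd"; }

saved_function_defs=

# hide_functions NAME...: remove the functions, remembering their definitions.
hide_functions() {
  local name
  for name; do
    if declare -F "$name" >/dev/null; then
      saved_function_defs+="$(declare -f "$name")"$'\n'
      unset -f "$name"
    fi
  done
}

# Re-create every function removed by hide_functions.
restore_functions() {
  eval -- "$saved_function_defs"
  saved_function_defs=
}
```

Between hide_functions pwd and restore_functions, the name pwd resolves to the builtin again; afterwards the user's override is back in place.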
Some builtin commands (trap, readonly, bind, history, read, and exit) are
replaced by ble.sh's wrapper functions.
To distinguish ESC and the Meta modifier from each other, ble.sh sets the
Readline setting keymap-timeout.
Other shell settings that are changed in the internal state include shell
options (set -ekuvxBT), Bash options (extdebug, nocasematch,
expand_aliases), a Readline setting (convert-meta), the locale
variables (LANG, LC_ALL, LC_COLLATE, and others affected by LC_ALL),
and the variables IFS, POSIXLY_CORRECT, IGNOREEOF, FUNCNEST, PS1,
PROMPT_COMMAND, and TIMEFORMAT. It also saves and restores
BASH_REMATCH. Those are supposed to be restored in the external state.
ble.sh also adjusts the terminal colors specified by SGR. If no SGR is
specified in PS1 in plain Bash (without ble.sh), the graphic settings
specified by a command using the control function SGR can affect all the
rendering and the output of the subsequent commands. However, ble.sh also
needs to change the setting for its syntax highlighting and other UIs. We do
not restore the SGR state for the external state because there is no general
way to get the current state, so we cannot reliably save the setting in the
first place. Some terminals have a mechanism to request the current SGR state,
but this is not supported by many terminals, and it has a delay because it
requires a round trip to the terminal. Also, some terminals support
pushing/popping the SGR states, but only a small number of terminals support
this. Another reason not to restore it is that the state can again be broken
by a crash of an executed command, and thus perfectly restoring the external
state is not useful in general.
Similarly, ble.sh changes the advanced keyboard protocols such as
modifyOtherKeys and kitty's protocol, but it is difficult to obtain the
current setting, so it is impossible to perfectly restore it for the external
state. ble.sh also changes the bracketed paste mode and the cursor style.
They are all set to the typical (sane) state in the external state regardless
of the previous external state. If the user sets them to some unusual state
for the command execution, that state will not be preserved.