Koichi Murase edited this page Jan 7, 2025 · 6 revisions

Since ble.sh is implemented in Bash script, it can be slow depending on the environment, setup, programs running in background, and other factors. In some cases, performance might be improved by changing the setup.

Related discussions are also found in Issues - label:performance.

1. Considering setup

1.1 Bash version: Use release version >= 4.0

Bash 3: We recommend using ble.sh with Bash >= 4.0 although ble.sh supports Bash >= 3.0. Bash < 4.0 lacks the feature necessary to run additional processing in background, so ble.sh blocks the user's inputs until it completes all the processing for every byte in the input stream. For the same reason, ble.sh cannot provide the support for auto-complete and menu-filter with Bash < 4.0. In addition, Bash < 4.0 does not provide a way to reliably bind a shell function to C-d, so ble.sh uses a child process to detect C-d, which can increase the overhead and also can be fragile depending on the setup.

macOS Bash: Bash < 4.0 is outdated and now rare in modern distributions, but macOS still ships Bash 3.2 because of a licensing issue. If you are a macOS user and use the Bash 3.2 shipped with macOS, please install the latest version of Bash and switch the login shell of your user account to it.

Here, we illustrate the Bash installation and setup using Homebrew. If you have not yet installed Homebrew, please install it by following the instructions on the Homebrew homepage. After Homebrew is set up, you can install the latest version of Bash with the following command:

$ brew install bash

The latest version of Bash will typically be installed at /opt/homebrew/bin/bash. However, to use the installed version of Bash in terminals, you also need to change the login shell of your account to /opt/homebrew/bin/bash. To do so, first make sure that /opt/homebrew/bin/bash is listed in /etc/shells. If it is not, please add a new line /opt/homebrew/bin/bash to /etc/shells.

# edit /etc/shells     # <-- please replace "edit" with a text editor (such as vi, nano, etc)
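
Before editing, it can help to check whether the path is already listed. The following is a minimal sketch, not part of ble.sh: the helper name shell_is_registered is hypothetical, and the path /opt/homebrew/bin/bash is taken from the text above.

```shell
# Return success when SHELL_PATH appears as a whole line in SHELLS_FILE.
# (Hypothetical helper for illustration; SHELLS_FILE defaults to /etc/shells.)
shell_is_registered() {
  local shell_path=$1 shells_file=${2:-/etc/shells}
  grep -qxF -- "$shell_path" "$shells_file" 2>/dev/null
}

new_shell=/opt/homebrew/bin/bash   # path assumed from the text above
if shell_is_registered "$new_shell"; then
  echo "$new_shell is already listed in /etc/shells"
else
  echo "$new_shell is missing from /etc/shells; add it as a new line"
fi
```

The -x and -F options make grep match the whole line as a fixed string, so a partial path such as /opt/homebrew/bin/ba is not treated as a match.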

Then, run the following command to change the login shell of your account (which is typically recorded in /etc/passwd):

$ chsh -s /opt/homebrew/bin/bash

Some terminals directly reference the passwd entry to determine the user's login shell, but most terminals in the market reference the environment variable SHELL, which is initialized by the passwd entry on the user login to the GUI session (i.e., the window manager). For this reason, you will probably have to reboot the system to make sure the latest version of Bash is used by your terminal.

After rebooting, you can confirm the Bash version of the current session by pressing C-xC-v. Another way to check the Bash version is to use the following command:

$ "$BASH" --version


Note: The following is INCORRECT.  It prints the version of Bash found first in
the current environment variable PATH, which is unrelated to the version of
Bash of the current session.

$ bash --version    # WRONG WAY TO CHECK THE CURRENT SHELL

Note: The following is also INCORRECT.  It prints the path of the login shell
of the current user, which is not necessarily the Bash of the current session.
Your terminal probably starts the login shell stored in SHELL as the
interactive shell, but this is not guaranteed.  Even if it does, when a
different shell is started inside the terminal, the shell of the current
session can differ from the login shell.

$ echo "$SHELL"     # WRONG WAY TO CHECK THE CURRENT SHELL
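
If you want to branch on the version inside a bashrc, the array BASH_VERSINFO can be checked directly, because it describes the shell process that is actually running. The following is a minimal sketch; the function name bash_version_at_least is hypothetical and not part of ble.sh.

```shell
# Succeed when the shell executing this code is Bash >= MAJOR.MINOR.
# BASH_VERSINFO describes the running shell itself, independent of PATH and SHELL.
bash_version_at_least() {
  local major=$1 minor=${2:-0}
  (( BASH_VERSINFO[0] > major || (BASH_VERSINFO[0] == major && BASH_VERSINFO[1] >= minor) ))
}

if bash_version_at_least 4 0; then
  echo "Bash $BASH_VERSION: 4.0 or later"
else
  echo "Bash $BASH_VERSION: older than 4.0"
fi
```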

MSYS1 Bash: Until recently, MSYS1 provided by the MinGW project was shipped with Bash 3.1. MSYS1 Bash has other compatibility issues as well as the aforementioned performance issues. If you still use MSYS1, please switch to MSYS2.

Bash devel/alpha/beta: We recommend using a release version of Bash. The devel/test versions of Bash can be extremely slow with ble.sh because of the slow memory allocation for debugging used by the devel/test versions of Bash. When ble.sh detects a devel/test version in its initialization, it prints a warning to stderr. If you want to use ble.sh with a devel/test version of Bash, it is recommended to build Bash with the configure option --with-bash-malloc=no for practical performance:

~/bash-devel$ ./configure --with-bash-malloc=no
~/bash-devel$ make all
~/bash-devel$ make install

To suppress the warning message on startup, please specify one of the options --bash-debug-version=short, --bash-debug-version=once, or --bash-debug-version=ignore when sourcing ble.sh.

# bashrc

# Show a short version of the message
source /path/to/ble.sh --bash-debug-version=short

# Show the warning message only once for each debug version of Bash
source /path/to/ble.sh --bash-debug-version=once

# Do not print the warning message
source /path/to/ble.sh --bash-debug-version=ignore

4.2 <= Bash < 5.3: If you also care about the memory footprint, you should avoid the version range 4.2 <= Bash < 5.3, which has a bug that stores duplicate data for shell functions. The bug is fixed in Bash >= 5.3 (which has not yet been released as of 2024-11-07).

1.2 Slow systems

ble.sh is mainly tested in Linux (Fedora), where I do not feel any performance issue except for the case of massive completion candidates.

However, performance issues sometimes appear on specific systems. For example, Bash may consume considerable computational resources when it runs on top of an emulation layer, such as Cygwin/MSYS2/WSL1/Termux. The overhead of ble.sh can also be an issue on low-power devices, such as Raspberry Pi and smartphones (with Termux). In the past, macOS users also reported slow performance, but we have not heard similar reports recently.

The performance issue in such systems is due to the systems' inherent limitation on the computational power, so there is no simple solution. If you want to use ble.sh with such systems, you will need to adjust various ble.sh settings to see if the situation is improved. See the later sections.

1.3 Slow file systems

File systems can also be slow. For example, WSL2's /mnt contains bridges to the file systems in the Windows subsystem, which internally seem to cause a round-trip communication for every single syscall and are thus extremely slow when a directory contains many file entries. This affects the executable file search since PATH contains the bridges in WSL2 by default (GitHub#96). A similar problem may happen with network-based file systems such as NFS, FUSE-SSHFS, SMBFS, CIFS, etc.

For another example, macOS 10.15 (Catalina) seems to have introduced a security check for every attempt to open a file, which may cause a serious delay in every part of ble.sh.

Slow file systems cause performance problems when a TAB completion or auto-complete is attempted or when filenames are highlighted. In this case, one can either remove the file systems from the related search paths (such as PATH or the ones referenced in completion settings), or turn off the affected feature.

WSL2 /mnt

In the case of WSL2 (which is explained above), one may try removing /mnt/* from PATH to see if the situation changes.

# blerc

ble/path#remove-glob PATH '/mnt/*'

Or another way is to change the WSL setting in /etc/wsl.conf so that the initial value of PATH does not include /mnt/*. If /etc/wsl.conf does not exist, you can create a new text file. Then, you can add the following lines to /etc/wsl.conf:

[interop]
appendWindowsPath = false
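
If you need the same filtering before ble.sh is loaded (for example, early in a bashrc), the idea can be sketched in plain Bash. This is only an illustration under that assumption and not the implementation of ble/path#remove-glob; the function name path_remove_glob is hypothetical, and the sketch does not preserve a leading empty entry.

```shell
# Remove every entry matching a glob pattern from a colon-separated path list.
# Usage: path_remove_glob VARNAME PATTERN   (hypothetical helper)
path_remove_glob() {
  local varname=$1 pattern=$2 entry result=
  local -a entries
  IFS=: read -r -a entries <<< "${!varname}"
  for entry in "${entries[@]}"; do
    # An unquoted right-hand side of == inside [[ ]] is matched as a glob
    [[ $entry == $pattern ]] && continue
    result=${result:+$result:}$entry
  done
  printf -v "$varname" '%s' "$result"
}

# Demonstration on a sample variable instead of the real PATH
sample_path='/usr/bin:/mnt/c/Windows/System32:/bin:/mnt/c/Program Files/Git/cmd'
path_remove_glob sample_path '/mnt/*'
echo "$sample_path"   # /usr/bin:/bin
```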

macOS >= 10.15

According to the report by @dlyongemallo and references therein, macOS started to check the file contents on every attempt to open an (executable) file by sending and receiving some information to an Apple server. Although the communication with the Apple server seems to be cached in the system, it still seems to perform the check for every new file. The discussion also suggests that an attempt to open an existing file after changing the file contents would also cause the server access.

ble.sh uses Bash 5.3's ${ funsub; } or an equivalent feature in lower Bash versions, but ${ funsub; } involves a new temporary file for every evaluation. This causes a serious delay in ble.sh. According to the report, the shell startup may take up to 30 seconds with macOS's check enabled. Even after the session has started, ble.sh seems to be affected by the delay.

If you use Terminal.app, in the macOS system settings, you may turn off the security check for the processes started inside the terminal.

  1. Open System Settings of macOS.

  2. Input "security" in the top-left text box to find "Privacy & Security".

  3. Go to the category "Developer Tools". If the category is not found, you can enable it by running the following command in a terminal:

    $ sudo spctl developer-mode enable-terminal

  4. Press the button [+] and add "Terminal" to the list.

Warning

This setting will turn off the security check of all the processes started in the terminal, so the user needs to be careful not to run random scripts and programs obtained from the internet.

Note

You may add other terminals to the list, but it doesn't seem to be effective. This workaround can only be used with Terminal.app.

Filename highlighting

Highlighting of the words based on the filenames can be turned off by the following setting:

# blerc

bleopt highlight_filename=

Instead of completely turning off the highlighting of the words, one may set timeouts and limits for the highlighting. One can basically try to reduce the numbers:

# blerc

bleopt highlight_timeout_async=5000
bleopt highlight_timeout_sync=50
bleopt highlight_eval_word_limit=200

Highlighting of the completion candidates can be turned off by the following setting:

# blerc

# Note: This internally sets "bind 'set colored-stats off'".
bleopt complete_menu_color=off

# Note: This internally sets "bind 'set colored-completion-prefix off'".
bleopt complete_menu_color_match=off

1.4 Slow completions

Completions and related features can also cause a performance issue when the number of generated candidates is too large or when the third-party programmable completion setting takes time. This typically becomes a problem with auto-complete because auto-complete tries to perform the completion generation in background while the user inputs the text.

Setting limits and timeouts

To reduce the processing time for the completion generation, you can set related limits, timeouts, and frequency to check the user's input. These settings can be set to smaller numbers to reduce the blocking time.

# blerc

bleopt complete_limit_auto=2000
bleopt complete_limit_auto_menu=100
bleopt complete_timeout_auto=5000
bleopt complete_timeout_compvar=200
bleopt complete_polling_cycle=50

To reduce the time of constructing the menu, one might set the following settings:

# Limit the menu height
bleopt complete_menu_maxlines=10

# Use a simple layout for the menu
bleopt complete_menu_style=dense

Third-party programmable completions

To generate completions, ble.sh uses programmable completion settings, which are set up by users and third-party frameworks. ble.sh can be blocked when the corresponding programmable completion setting hangs or takes time. This is typically the case when the input delay is increased for the arguments to specific commands.

This is an issue with the programmable completion setting, and not the problem of ble.sh. In the original Readline, this problem might be worked around by canceling the completion by pressing C-c. However, this still breaks the terminal layout, so the programmable completion setting is anyway broken or badly designed. See the following discussions:

In this case, one should try to fix the programmable completion setting. To identify the provider of the programmable completion setting, one can first check the completion setting by running the following command:

$ complete -p <command_name>

where <command_name> is the affected command name (for which the input delay of arguments becomes significant). If you find -F <func> (where <func> is a string) in the output, <func> is the shell function name that generates the completion candidates for the specified command. If you find -C <cmd> (where <cmd> is a string) in the output, <cmd> is the command name that generates the completion candidates.

Identify the completion setting

The first thing to consider is to optimize the implementation of the programmable completion setting. If you are not sure, you can consider reporting the performance issue to the provider of the programmable completion setting. You can run the following command to check the filename where the function is defined:

$ (shopt -s extdebug; declare -F <func>)

where <func> is the identified function name. Or if the completion setting is specified by -C <cmd>, you can check the location of the command by running

$ type -p <cmd>

Based on the file location, you may try to identify the package that provides the file and to improve the implementation in the upstream.
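
The two lookup steps above can be combined in a small helper. The following is a minimal sketch for the -F <func> case only; the function name describe_completion is hypothetical and not part of ble.sh or Bash.

```shell
# Print the completion spec of a command and, for "-F <func>", where <func> is defined.
# (Hypothetical helper for illustration.)
describe_completion() {
  local cmd=$1 spec re='(^| )-F +([^ ]+)'
  if ! spec=$(complete -p "$cmd" 2>/dev/null); then
    echo "$cmd: no completion setting found"
    return 1
  fi
  echo "spec: $spec"
  if [[ $spec =~ $re ]]; then
    # With extdebug enabled, "declare -F" prints "<func> <line> <file>"
    (shopt -s extdebug; declare -F "${BASH_REMATCH[2]}")
  fi
}

# Demonstration with a dummy completion function registered for "democmd"
_demo_comp() { :; }
complete -F _demo_comp democmd
describe_completion democmd
```

The subshells used here are acceptable because this is a one-off diagnostic and not code that runs on every keystroke.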

Modify the completion setting dynamically

Another possibility is to adjust the behavior of the existing completion setting using ble/function#advice.

For example, one may write the setting to turn off the completion for a specific command in auto-complete with the following settings:

# blerc

function blerc/disable-progcomp-for-auto-complete.advice {
  if [[ $BLE_ATTACHED && :$comp_type: == *:auto:* ]]; then
    return 0
  fi
  ble/function#advice/do
}

_comp_load <command_name1> && ble/function#advice around <func1> blerc/disable-progcomp-for-auto-complete.advice
_comp_load <command_name2> && ble/function#advice around <cmd2>  blerc/disable-progcomp-for-auto-complete.advice
...

where <command_name1>, <command_name2>, ... are the command names for which the programmable completion settings are provided, <func1>, <cmd2>, etc. are the function and command names that generate the completion candidates, which are identified above. The original discussion is found in #522.

Possibility of a setup issue

If the programmable completion setting hangs, it might not be a performance issue but rather a problem with the setup of the programmable completion. Possible problems with auto-complete are described on the Reporting Issue page.

Completion in background

Completions performed in the background might be turned off entirely if you do not use them. The auto-complete, menu-filter, and auto-menu features can be turned off by the following settings:

# blerc

bleopt complete_auto_complete=
bleopt complete_menu_filter=
bleopt complete_auto_menu=     # This is default; auto-menu is off by default

Instead of completely turning off auto-complete and auto-menu, one can specify a delay before the processing starts with the following settings:

# blerc

bleopt complete_auto_delay=500
bleopt complete_auto_menu=500

1.5 Long command line

If a long command line is input, ble.sh's response becomes slow. ble.sh is implemented as a Bash script and cannot use proper and efficient data structures. The command line string is stored in a single scalar variable, and related metadata are stored in a flat array. This implementation limitation causes the processing time to scale badly for a long command line. ble.sh limits the command-line length by default, but the default limits are relatively large. If long command lines are a problem for you, you may lower the limits of the command length with the following options.

bleopt line_limit_length=10000
bleopt history_limit_length=10000

The behavior when the limit is reached can also be changed by the following option:

bleopt line_limit_type=editor

For behavioral consistency when navigating through the command history with up, down, etc., history_limit_length should be set equal to or less than line_limit_length.

1.6 History

History size

A large command history can also affect the initialization time and the memory footprint of the interactive Bash process. If you have an extremely large command history, you might consider reducing the size limit or removing irrelevant old commands.

History sharing

If you have the setting like history -a; history -c; history -r to synchronize the command history between sessions, please instead use the following setting:

bleopt history_share=1

The setting history -a; history -c; history -r loads the command history of Bash every time a command is executed, and then ble.sh needs to reload and process the entire command history every time, which is extremely inefficient. The overhead becomes particularly significant when the command history size becomes large. Even if the history size is small, it is a redundant process to reload the entire command history. The setting like history -a; history -c; history -r should not be used. The setting bleopt history_share=1 enables processing only the newly added entries in the command history, and thus it is recommended.

Erasedups limit

If ble.sh naively emulated the Bash feature HISTCONTROL=erasedups, it could be slow when the feature is enabled with a large command history because ble.sh implements all the processing on Bash arrays in Bash scripts. For this reason, by default, when removing duplicates in the command history, ble.sh limits the target range to the commands added after the current session started.

If you want to recover the original behavior where all the commands are checked for duplicates, you can use the following setting (which may increase the delay before a command starts):

bleopt history_erasedups_limit=

The performance issue should not normally happen with the default setting. However, if you run a large number of commands in a single session, the performance of the erasedups feature might become a problem. In this case, one can limit the target commands to the last N commands, where N can be specified in the following way for N = 100:

bleopt history_erasedups_limit=100
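
To illustrate why a limit bounds the cost, the following sketch removes duplicates only within the last N entries of a plain Bash array. It is a simplified model and not ble.sh's actual implementation; the array name hist and the function name hist_add_erasedups are hypothetical.

```shell
# Append CMD to the global array "hist", removing earlier duplicates of CMD.
# Only the last LIMIT entries are scanned; LIMIT=0 scans the whole array.
# (Hypothetical model for illustration, not ble.sh code.)
hist_add_erasedups() {
  local limit=$1 cmd=$2 n=${#hist[@]} start=0 i
  if (( limit > 0 && n > limit )); then start=$((n - limit)); fi
  local -a kept=("${hist[@]:0:start}")
  for (( i = start; i < n; i++ )); do
    [[ ${hist[i]} == "$cmd" ]] || kept+=("${hist[i]}")
  done
  hist=("${kept[@]}" "$cmd")
}

hist=(ls make ls pwd)
hist_add_erasedups 2 ls      # scans only the last two entries: "ls pwd"
echo "${hist[@]}"            # ls make pwd ls
```

With a limit, the loop visits at most N entries regardless of the total history size, at the cost of leaving older duplicates (the first "ls" above) in place.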

1.7 keyseq timeout

For the delay of ESC in the vi/vim editing mode, please check Vi (Vim) editing mode.

1.8 DEBUG trap

Other Bash configurations might set the DEBUG trap, which makes the entire shell execution slow. Although ble.sh temporarily removes the user's DEBUG trap for its internal processing, it may still impact the performance when ble.sh's utility is called outside the ble.sh internal state.

The DEBUG trap is typically used to emulate the preexec hook of Zsh.

  • For this purpose, please use blehook PREEXEC instead of the DEBUG trap.
  • An external framework, bash-preexec uses the DEBUG trap to implement its preexec hook. If you want to use the bash-preexec feature, please use ble-import integration/bash-preexec instead of bash-preexec. Even when bash-preexec is loaded in a ble.sh session, ble.sh tries to adjust the hook introduced by bash-preexec. However, this may not be robust, so please consider switching to ble-import integration/bash-preexec.
  • Starship uses the DEBUG trap when it does not detect other frameworks providing a preexec mechanism. Since Starship uses ble.sh's blehook PREEXEC when it detects ble.sh, please load ble.sh before initializing Starship by eval "$(starship init bash)".
  • Atuin also wants to use the preexec hook and relies on either bash-preexec or ble.sh to make itself work properly. When ble.sh is loaded, bash-preexec does not need to be loaded because Atuin internally calls ble.sh's ble-import integration/bash-preexec.

2. Profiling and optimization

2.1 Profiler

If you want to identify the cause of the bad performance you experience, you might use the ble.sh profiler. The profiler can be started and stopped by running ble/debug/profiler/start and ble/debug/profiler/stop.

$ ble/debug/profiler/start
$     # Do the operation that takes time
$ ble/debug/profiler/stop

The execution times of functions and lines are summarized in files ./prof.$$.*.txt (where . represents the working directory when ble/debug/profiler/start is called). If you want to change the prefix of the output files, you can specify it as ble/debug/profiler/start name so that the output is written to name.*.txt. When the output files already exist, the existing files will be updated by summing up the execution times of the old and new results.

Warning

Since the profiler internally saves a large execution log (growing by 10 megabytes per second or more while processing is ongoing), the profiler should not be left on for a long time.

Tip

The measurement results include the sleeping time. In particular, entries containing the string ble/util/idle, sleep, msleep, etc. are related to the waiting for the user's inputs, so their long execution time would probably be unrelated to the performance issue.

To get a different type of statistics or to export the result in a different format, you can adjust the option bleopt debug_profiler_opts. You can also check the option bleopt debug_profiler_tree_threshold.

2.2 Initialization time

If you want to get a hint on the initialization time, you can also enable the measurement of ble.sh's loading time using a modified version of ble.sh. You can first rewrite the macro variable measure_load_time at the beginning of ble.pp:

diff --git a/ble.pp b/ble.pp
index a8703e2b..82fb3362 100644
--- a/ble.pp
+++ b/ble.pp
@@ -1,7 +1,7 @@
 #!/bin/bash
 #%$> out/ble.sh
 #%[release = 0]
-#%[measure_load_time = 0]
+#%[measure_load_time = 1]
 #%[debug_keylogger = 1]
 #%[leakvar = ""]
 #%#----------------------------------------------------------------------------

Then, you can rebuild ble.sh by running make (and install it if necessary). The initialization times will be measured and printed when the modified ble.sh is loaded.

2.3 General hints for responsive shell implementation

If you want to add new features in Bash or modify existing parts of ble.sh, you will probably want to care about the performance because Bash is in general slower than C/C++ implementations. Here, we do not discuss general optimization that applies to any language (such as proper algorithms and data structures). Instead, we discuss things that need special care in Bash.

The shell functions mentioned here may not be documented in the manual, but you can check ble.sh's source code in that case. We usually have code documentation for important utilities. You can also check the actual implementation to understand the detailed behavior. The code documentation might occasionally be written in Japanese, but even in that case, you can use machine translation to understand the usage.

Avoid subshells and external commands

In Bash, subshells are implemented by forking processes, which is significantly slower than the other shell operations. Depending on the operating system, a single subshell may take on the order of 100 milliseconds. Even on faster systems like Linux, a fork would take a few milliseconds, which is 50x or 100x slower than the typical shell operations without forking. The subshells are used in many places of the shell, such as pipes [cmd1 | cmd2], command substitutions [$(cmd) and `cmd`], process substitutions [<(cmd) and >(cmd)], and subshell groups [(cmd)].

Calling external commands is also slow for the same reason. To launch an external command, Bash first forks itself and then performs exec. The exec syscall needs to load the binary image from the file system and perform all the process initialization, so exec is also a slow operation of the same order as fork.

If the same result can be easily obtained by built-in Bash features, one should use the built-in Bash features without using subshells. If you want to obtain stdout of a command as a string, you can use ble.sh's shell function ble/util/assign:

ble/util/assign varname 'command'

where varname is the name of the variable that stores the result, and 'command' is the command string, which is internally executed by the eval builtin. There are variants for arrays such as ble/util/assign-words and ble/util/assign-array. The former splits stdout by whitespace and stores the elements in an array. The latter does the same but splits stdout by newlines.
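
Independently of ble/util/assign, many small tasks that are often delegated to external commands can be done with built-in Bash features alone. The following is a minimal sketch in plain Bash; the variable names and the sample path are chosen only for illustration.

```shell
file=/usr/local/share/doc/readme.txt

# Forking equivalents, shown for comparison only:
#   base=$(basename "$file"); dir=$(dirname "$file")

# Fork-free versions using parameter expansion
base=${file##*/}   # strip the longest prefix ending in "/"   -> readme.txt
dir=${file%/*}     # strip the shortest suffix starting at "/" -> /usr/local/share/doc
ext=${base##*.}    # strip everything up to the last "."      -> txt

# printf -v stores the formatted text in a variable without a command substitution
printf -v padded '%05d' 42   # -> 00042

echo "$dir | $base | $ext | $padded"
```

Note that the parameter expansions are not exact replacements for basename and dirname in edge cases such as paths with trailing slashes or without any slash, so those cases need separate handling.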

However, if you need to use an external command to process some part of the operation, you should try to minimize the number of calls of external commands. In particular, even when only a part cannot be implemented in the built-in features, one should consider implementing the whole processing in the external commands. The awk command is useful for that purpose because it can easily do most of the processing that the other tools like sed, seq, tr, cut, etc. offer, and because it can also do many operations in a single call.
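
As an illustration of reducing the number of external commands, the following sketch replaces a pipeline of three tools with a single awk call. The sample data and variable names are made up for this example.

```shell
# Sample input: "name:count" records (made up for illustration)
data=$'apple:3\nbanana:5\ncherry:7'

# Pipeline of several external commands (three extra processes), for comparison:
#   printf '%s\n' "$data" | tail -n +2 | cut -d: -f1 | tr a-z A-Z

# One awk process doing the line skip, field extraction, and case conversion
result=$(awk -F: 'NR >= 2 { print toupper($1) }' <<< "$data")
echo "$result"
```

The command substitution here still costs one fork. Inside ble.sh, it could be replaced by ble/util/assign as described above.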

Avoid unbuffered read

Reading data from streams using the read builtin can also be extremely slow because the read builtin can read at most one byte at a time from a stream. This is related to the expectation that the read builtin leaves the unprocessed data in the stream for subsequent operations (possibly performed by a different process). By design, the read builtin reads only a part of the stream, and its length is determined by looking at newline characters (or specified delimiters) in the stream. For this reason, the read builtin cannot predict where the data to be read ends, and it can only read one byte at a time from unseekable streams. If the read builtin read too much data, the excess could not be seen by the other commands run after the read builtin. However, this means that the read builtin issues as many system calls as there are bytes in the processed data, which is extremely slow (e.g., sometimes 2000x slower). Therefore, one should usually avoid using the read builtin for reading large data supplied through a pipe, etc.
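
One possible workaround, sketched below, is to store the command output in a regular file first and let read consume that file. This assumes that Bash can read in larger chunks from a seekable file because it can seek back over unconsumed data. The function generate and the variable names are hypothetical.

```shell
# Hypothetical data producer standing in for a slow or large command
generate() { printf 'line %d\n' 1 2 3; }

# Slow for large data: "read" takes its input from an unseekable pipe
#   generate | while IFS= read -r line; do ...; done

# Sketch of an alternative: write the output to a regular (seekable) file first
tmpfile=$(mktemp)
generate > "$tmpfile"

count=0 last=
while IFS= read -r line; do
  count=$((count + 1))
  last=$line
done < "$tmpfile"
rm -f "$tmpfile"

echo "read $count lines, last: $last"
```

Unlike the pipeline form, the loop here also runs in the current shell, so the variables count and last remain available after the loop.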

Cancel an operation

When you implement an operation that may take time, you should also consider providing a way to cancel the operation in the middle. In particular, the slow operation should monitor the user's inputs in stdin, and if the user starts inputting anything, the operation should be canceled at an appropriate point. To check if there is a user input, you can use the shell function ble/util/is-stdin-ready:

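while some_slow_loop_condition; do   # placeholder for the loop condition
  if ble/util/is-stdin-ready; then
    # The user has started typing: clean up and cancel the processing
    break
  fi
  process_something                  # placeholder for one step of the operation
done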

Inside completion settings, one should use ble/complete/check-cancel instead of ble/util/is-stdin-ready. When ble/complete/check-cancel succeeds, one should cancel the current processing of completion and return exit status 148.

When you want to call a slow external shell function or command and cancel it on a certain condition such as ble/complete/check-cancel without re-implementing or modifying it, you may consider calling it through ble/util/conditional-sync. This utility runs the provided command in a background subshell and monitors the condition. When the condition becomes unsatisfied before the command completes, ble/util/conditional-sync kills the background command and immediately returns. One can also specify a timeout for the command in the option of ble/util/conditional-sync.
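
The general idea of running a command in the background while polling for a stop condition can be sketched in plain Bash as follows. This is not the implementation of ble/util/conditional-sync. The function name run_with_timeout is hypothetical, only a timeout is used as the stop condition, and the status 124 merely follows the convention of timeout(1).

```shell
# Run COMMAND in the background and kill it when it exceeds SECONDS.
# Usage: run_with_timeout SECONDS COMMAND [ARGS...]   (hypothetical helper)
run_with_timeout() {
  local timeout=$1; shift
  "$@" &
  local pid=$! waited=0
  # Poll every 0.1 seconds while the background process is still alive
  while kill -0 "$pid" 2>/dev/null; do
    if (( waited >= timeout * 10 )); then
      kill "$pid" 2>/dev/null
      wait "$pid" 2>/dev/null
      return 124   # "timed out", following the convention of timeout(1)
    fi
    sleep 0.1
    waited=$((waited + 1))
  done
  wait "$pid"      # propagate the exit status of the command
}

if run_with_timeout 1 sleep 3; then
  echo "completed"
else
  echo "stopped with status $?"
fi
```

The polling loop itself forks a sleep process on every cycle, so a sketch like this is suitable only for wrapping operations that are much slower than the polling interval.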
