Automatically install and use tree-sitter major modes in Emacs 29+. If the tree-sitter version can’t be used, fall back to the original major mode.
Each of these behaviors are configurable and documented under the
“Configuration” section. By activating global-treesit-auto-mode
, Emacs will:
- Automatically switch to
<name>-ts-mode
when the grammar for<name>
is installed - Stick with
<name>-mode
if the grammar isn’t installed - Automatically install a grammar before opening a compatible file
- Modify
auto-mode-alist
for tree-sitter modes
There is also a convenience function M-x treesit-auto-install-all
, which will
install all (or a selected subset) of the maintained and compatible
grammars. You can add these grammars to your auto-mode-alist
automatically by
invoking the treesit-auto-add-to-auto-mode-alist
function in your
configuration.
treesit-auto
is available from MELPA. After following their setup, you can
use your preferred package manager. If that’s the default package.el
, simply
M-x package-refresh-contents
and then
M-x package-install RET treesit-auto
If you want a local clone of the repository, rather than just a copy of the
source, you might instead use package-vc-install
M-x package-vc-install RET https://github.com/renzmann/treesit-auto.git
Then, in your Emacs configuration file (~/.emacs.d/init.el
),
(use-package treesit-auto
:config
(global-treesit-auto-mode))
For most users, this will be enough. There are some nifty things you might want to enable, though, which are covered in the “Configuration” section below.
This is how I configure treesit-auto
for my own personal use.
(use-package treesit-auto
:custom
(treesit-auto-install 'prompt)
:config
(treesit-auto-add-to-auto-mode-alist 'all)
(global-treesit-auto-mode))
Emacs 29, while featuring treesit.el
and a convenient
treesit-install-language-grammar
, will not feature an intelligent way to choose
between a default mode, such as python-mode
, and its tree-sitter enhanced
version, python-ts-mode
. This package attempts to remedy that by adjusting the
major-mode-remap-alist
and treesit-language-source-alist
variables in order to
get the following behavior:
1. If the grammar is installed, then switch to the appropriate tree-sitter mode:
In this case, assuming we open a Python buffer, and the Python tree-sitter
grammar is installed, then Emacs will use python-ts-mode
instead of
python-mode
.
2. The grammar is NOT installed and treesit-auto-install is non-nil:
When the grammar is not installed and treesit-auto-install
is t, then upon
activating any major mode that has a corresponding tree-sitter mode, the grammar
will be downloaded and compiled using treesit-install-language-grammar
. Emacs
will then activate the tree-sitter major mode for that buffer.
prompt
is like t, except a message will be displayed in the echo area asking
for a yes/no response before attempting the installation.
As an example for both cases: if I visit a Python file and didn’t already have
the grammar installed, I wind up with an installed grammar and a buffer using
python-ts-mode
.
Otherwise, when treesit-auto-install
is nil, it will try to fall back to
another major mode as described in the following two rules.
3. If the grammar is NOT installed, and a fallback is specified
Most languages will have a fallback mode specified, such as python-ts-mode
falling back to python-mode
, if the grammar is not installed. If you ever need
to double-check what that fallback will be, you can double check what’s in the
recipe for that language like this:
(treesit-auto-recipe-remap (alist-get 'python treesit-auto-lang-recipe-alist)) ⇒ python-mode
See “Configuration/Configuring behavior for a specific language” in case you would like to specify different fallback modes than the default.
4. The grammar is installed, but there is no fallback mode
You can optionally use the treesit-auto-add-to-auto-mode-alist
function to
ensure that your installed tree-sitter languages have their corresponding
...-ts-mode
added to auto-mode-alist
, so that Emacs opens the buffer in that
...-ts-mode
, rather than the default Fundamental
mode.
Supposing for instance we don’t have typescript-mode
installed, then even if
we do have typescript-ts-mode
installed along with the typescript grammar
compiled in ~/.emacs.d/tree-sitter/
, Emacs still won’t use
typescript-ts-mode
unless you also added '("\\.ts\\'" . typescript-ts-mode)
to auto-mode-alist
. By calling (treesit-auto-add-to-auto-mode-alist)
in
your configuration, this modification to auto-mode-alist
is done automatically
for you.
5. All other cases…
This is the most general case, where the grammar is not installed,
treesit-auto-install
is nil, and no fallback mode is specified in the language
recipe present on treesit-auto-recipe-list
. In this case, we still gain the
benefit of quickly installing grammars through treesit-install-language-grammar
without having the build the recipe interactively, but treesit-auto
will make no
attempt to switch away from the tree-sitter mode.
If you have modified treesit-language-source-alist
through setq
, then it is
recommended to put any configuration of this package AFTER that setq
.
You can globally alter the behavior of treesit-auto
to only consider a
specific set of languages by setting the treesit-auto-langs
list to a set of
language symbols. By default, this list includes every possible language that
treesit-auto
supports, so you can use M-x describe-variable RET
treesit-auto-langs
to see what the options are.
One way to use this variable is to just set it manually:
(setq treesit-auto-langs '(python rust go))
Now, treesit-auto
features will only ever affect Python, Rust, and Go files.
Running treesit-auto-install-all
will only install these three grammars, and
no automatic prompting/installation will occur when visiting a buffer that is
not one of these three, either.
Another method is to disable specific languages by just removing them from this list:
(delete 'awk treesit-auto-langs)
Here, treesit-auto
behaves as it normally would for all languages except AWK.
The treesit-auto-install
variable controls whether a grammar should be installed
automatically when activating a major mode compatible with tree-sitter.
nil
, the default, meanstreesit-auto
won’t try to install anything, and will rely on the fallback logic outlined abovet
meanstreesit-auto
should always try to clone and install a grammar when missingprompt
will cause a yes/no prompt to appear in the minibuffer before attempting installation
(setq treesit-auto-install 'prompt)
Then, supposing I don’t have libtree-sitter-python.so
(or its mac/Windows
equivalent) under ~/.emacs.d/tree-sitter
(or anywhere else in
treesit-extra-load-path
), visiting a Python file or calling M-x python-ts-mode
will generate this prompt:
Tree-sitter grammar for python is missing. Would you like to install it from https://github.com/tree-sitter/tree-sitter-python? (y or n)
Responding with “yes” will use treesit-install-language-grammar
to go fetch and
compile the missing grammar.
The other function that respects this variable is treesit-auto-install-all
.
When treesit-auto-install
is t, using M-x treesit-auto-install-all
will skip all
prompts. Otherwise, it will ask before attempting the installation.
The variable treesit-auto-recipe-list
keeps track of all the language “recipes.”
These control how treesit-auto
decides which modes to upgrade/downgrade to/from,
where the source code of the language grammar is hosted, and which C/C++
compiler to use. Each recipe can take these arguments:
:lang :ts-mode :remap :url :revision :requires :source-dir :cc :c++
To create a recipe, use make-treesit-auto-recipe
:
(setq my-js-tsauto-config
(make-treesit-auto-recipe
:lang 'javascript
:ts-mode 'js-ts-mode
:remap '(js2-mode js-mode javascript-mode)
:url "https://github.com/tree-sitter/tree-sitter-javascript"
:revision "master"
:source-dir "src"
:ext "\\.js\\'"))
(add-to-list 'treesit-auto-recipe-list my-js-tsauto-config)
Here, we’ve specified that the tree-sitter compiler will be creating a file
named libtree-sitter-javascript.so
(or .dylib
or .dll
), based on the :lang
field. The corresponding tree-sitter mode in Emacs is called js-ts-mode
, and
all of js2-mode
, js-mode
, and javascript-mode
should attempt switching to the
js-ts-mode
, if possible.
Moreover, since js-2-mode
is first under the :remap
section, that is the
“primary fallback.” Meaning that if the tree-sitter grammar is not available,
it will be the first mode tried. If that doesn’t work, it will try js-mode
, and
javascript-mode
, in that order, until one does work. If only one fallback needs
to be specified, a single quoted symbol is also acceptable. For instance,
python-ts-mode
just uses :remap 'python-mode
in this argument position.
If a grammar mandates any other grammars be installed as a dependency, the
:requires
keyword can specify a language symbol or list of symbols that should
be installed. One example of this is found in the TypeScript recipe, which
specifies :requires 'tsx
, since activating typescript-ts-mode
on some Emacs
builds will attempt to load the TSX grammar.
The :url
, :revision
, :source-dir
, :cc
, and :c++
arguments are all documented
under treesit-language-source-alist
, which is part of base Emacs, not this
package.
This package does not modify any of your major mode hooks. That is, if you have
functions in python-mode-hook
, but not in python-ts-mode-hook
, then your hook
from python-mode
will not be applied, assuming python-ts-mode
is what gets
loaded. For major modes in which this is a concern, the current recommendation
is to address this as part of your configuration.
(setq rust-ts-mode-hook rust-mode-hook)
Some modes have a shared base, such as python-ts-mode
and python-mode
both
deriving from python-base-mode
. For these languages, you can opt to hook into
python-base-mode-hook
instead of explicitly setting the tree-sitter mode’s hook.
You can register tree-sitter modes to auto-mode-alist
by calling
treesit-auto-add-to-auto-mode-alist
. Depending on the optional argument
langs
, this function can behave in three different ways:
nil
, the default - Only add tree-sitter modes toauto-mode-alist
if the tree-sitter mode is available to Emacs and the grammar is installed.'all
- For every tree-sitter mode available to Emacs and intreesit-auto-langs
, add it toauto-mode-alist
regardless of whether the grammar is installed. This has the beneficial side effect of installing grammars for you when opening files that have a tree-sitter mode that comes with Emacs, but have no fallback mode. Examples of this arerust-ts-mode
,go-ts-mode
, and a few others in Emacs 29+.- A list of language symbols - behaves like
'all
, but only for the listed languages
For instance, you might run this function as:
(treesit-auto-add-to-auto-mode-alist '(rust go toml))
This registers your tree-sitter modes according to the common file extension for
Rust, Go, and TOML, but no other modes. Most users will probably want to use
(treesit-auto-add-to-auto-mode-alist 'all)
for the easiest general behavior;
always prompting/installing grammars when we can.
I find it a good practice to be skeptical of adding any new dependency to my
Emacs configuration. So, what would you have to do to get similar behavior in
your Emacs configuration without the treesit-auto
package?
We’ll take two examples: TypeScript and Python.
Emacs 29 ships with typescript-ts-mode
and tsx-ts-mode
, but no equivalent
typescript-mode
or tsx-mode
. python-ts-mode
and python-mode
, on the
other hand, are both available out of the box. If you wanted these grammars to
automatically install on launch, and then use the tree-sitter modes instead of
the base modes for every file, you’d need the following code in your init file:
(setq treesit-language-source-alist
'((typescript . ("https://github.com/tree-sitter/tree-sitter-typescript" "master" "typescript/src"))
(tsx . ("https://github.com/tree-sitter/tree-sitter-typescript" "master" "tsx/src"))
(python . ("https://github.com/tree-sitter/tree-sitter-python"))))
(dolist (source treesit-language-source-alist)
(unless (treesit-ready-p (car source))
(treesit-install-language-grammar (car source))))
(add-to-list 'auto-mode-alist '("\\.ts\\'" . typescript-ts-mode))
(add-to-list 'auto-mode-alist '("\\.tsx\\'" . tsx-ts-mode))
(add-to-list 'major-mode-remap-alist '(python-mode . python-ts-mode))
There are plenty of reasons why some users would prefer to take this type of hand-tuned approach to their tree-sitter modes. For most Emacs people, though, you can see the natural progression of where a config like the above would go:
- You need to maintain a library of language symbols (which must match the
language’s
grammar.js
), URLs, revisions, and sub-directories fortreesit-language-source-alist
. Additionally, you’d need to know thattypescript-ts-mode
relies on the tree-sitter grammar fortsx
, which must be installed alongside the grammar for TypeScript. This is probably the main value proposition of this package, since all of this information is stored under the community-contributedtreesit-auto-recipe-list
- You need to be aware of when to use
auto-mode-alist
andmajor-mode-remap-alist
, depending on what modes are installed into Emacs - If you want lazy installation of grammars, rather than installing them all
up-front in your
init.el
, you’d need to write some hooks or derived modes to checktreesit-ready-p
for the current buffer, and install the language before loading the main tree-sitter mode - After a certain number of langauges, it becomes unweildy to do these
add-to-list
calls for every single one, and it’s better to programmatically operate on a list of selected languages
If you were to follow this chain yourself, you’d probably wind up with something
in your init.el
that looks very similar to the code in this package.
All in all, this is a small package. Roughly half of it is just maintaining a
library of information for treesit-language-source-alist
, and the other half
works through the logic of handling edge cases related to the remaining bullets
above.
This package is, admittedly, a hack. treesit.el
provides an excellent
foundation to incremental source code parsing for Emacs 29, and over time that
foundation will expand into an improved core editing experience. With that in
mind, I fully expect this package to eventually be obsolesced by the default
options in Emacs 30 and beyond. That does not preclude us from adding a few
quality of life improvements to Emacs 29, though, and so it still seems prudent
to have this plugin available in the meantime.
treesit-auto
doesn’t play super well with Org-babel, since Org has its own
methods of hooking into and using languages. In particular, you may need to set
org-src-lang-modes
yourself to get tree-sitter modes working correctly.
Another side behavior you may encounter is when opening an Org document with
shell scripts inside and treesit-auto-install
is non-nil, then treesit-auto
will prompt to install the Bash grammar, but won’t display the prompt until you
interact with Emacs in some way, such as using next-line
(C-n
by default) or
hovering over Emacs with your mouse.
Bug reports, feature requests, and contributions are most welcome. Even though this is a small project, there is always room for improvement. I also appreciate “nitpicky” contributions, such as formatting, conventions, variable naming, code simplification, and improvements to language in documentation.
Issues are tracked on GitHub, which is also where patches and pull requests should be submitted.
If you would like to submit a new language recipe to be distributed as part of this package, see CONTRIBUTING.org for a quick guide on how to write and submit the new recipe.