Add treesitter textobjects #728

Merged
merged 7 commits into from
Oct 23, 2021
2 changes: 2 additions & 0 deletions book/src/SUMMARY.md
@@ -8,3 +8,5 @@
- [Keymap](./keymap.md)
- [Key Remapping](./remapping.md)
- [Hooks](./hooks.md)
- [Guides](./guides/README.md)
- [Adding Textobject Queries](./guides/textobject.md)
4 changes: 4 additions & 0 deletions book/src/guides/README.md
@@ -0,0 +1,4 @@
# Guides

This section contains guides for adding new language server configurations,
tree-sitter grammars, textobject queries, etc.
30 changes: 30 additions & 0 deletions book/src/guides/textobject.md
@@ -0,0 +1,30 @@
# Adding Textobject Queries

Textobjects that are language specific ([like functions, classes, etc][textobjects])
require an accompanying tree-sitter grammar and a `textobjects.scm` query file
to work properly. Tree-sitter allows us to query the source code syntax tree
and capture specific parts of it. The queries are written in a lisp dialect.
More information on how to write queries can be found in the [official tree-sitter
documentation][tree-sitter-queries].

Query files should be placed in `runtime/queries/{language}/textobjects.scm`
when contributing. Note that to test the query files locally you should put
them under your local runtime directory (`~/.config/helix/runtime` on Linux
for example).

The following [captures][tree-sitter-captures] are recognized:

| Capture Name |
| --- |
| `function.inside` |
| `function.around` |
| `class.inside` |
| `class.around` |
| `parameter.inside` |

[Example query files][textobject-examples] can be found in the helix GitHub repository.

[textobjects]: ../usage.md#textobjects
[tree-sitter-queries]: https://tree-sitter.github.io/tree-sitter/using-parsers#query-syntax
[tree-sitter-captures]: https://tree-sitter.github.io/tree-sitter/using-parsers#capturing-nodes
[textobject-examples]: https://github.com/search?q=repo%3Ahelix-editor%2Fhelix+filename%3Atextobjects.scm&type=Code&ref=advsearch&l=&l=
13 changes: 10 additions & 3 deletions book/src/usage.md
@@ -49,9 +49,10 @@ Multiple characters are currently not supported, but planned.

## Textobjects

Currently supported: `word`, `surround`.
Currently supported: `word`, `surround`, `function`, `class`, `parameter`.

![textobject-demo](https://user-images.githubusercontent.com/23398472/124231131-81a4bb00-db2d-11eb-9d10-8e577ca7b177.gif)
![textobject-treesitter-demo](https://user-images.githubusercontent.com/23398472/132537398-2a2e0a54-582b-44ab-a77f-eb818942203d.gif)

- `ma` - Select around the object (`va` in vim, `<alt-a>` in kakoune)
- `mi` - Select inside the object (`vi` in vim, `<alt-i>` in kakoune)
@@ -60,5 +61,11 @@ Currently supported: `word`, `surround`.
| --- | --- |
| `w` | Word |
| `(`, `[`, `'`, etc | Specified surround pairs |

Textobjects based on treesitter, like `function`, `class`, etc are planned.
| `f` | Function |
| `c` | Class |
| `p` | Parameter |

Note: `f`, `c`, etc need a tree-sitter grammar active for the current
document and a special tree-sitter query file to work properly. [Only
some grammars](https://github.com/search?q=repo%3Ahelix-editor%2Fhelix+filename%3Atextobjects.scm&type=Code&ref=advsearch&l=&l=)
currently have the query file implemented. Contributions are welcome!
1 change: 1 addition & 0 deletions helix-core/src/indent.rs
@@ -463,6 +463,7 @@ where
unit: String::from(" "),
}),
indent_query: OnceCell::new(),
textobject_query: OnceCell::new(),
}],
});

43 changes: 41 additions & 2 deletions helix-core/src/syntax.rs
@@ -31,7 +31,7 @@ pub struct Configuration {
#[serde(rename_all = "kebab-case")]
pub struct LanguageConfiguration {
#[serde(rename = "name")]
pub(crate) language_id: String,
pub language_id: String,
pub scope: String, // source.rust
pub file_types: Vec<String>, // filename ends_with? <Gemfile, rb, etc>
pub roots: Vec<String>, // these indicate project roots <.git, Cargo.toml>
@@ -55,6 +55,8 @@ pub struct LanguageConfiguration {

#[serde(skip)]
pub(crate) indent_query: OnceCell<Option<IndentQuery>>,
#[serde(skip)]
pub(crate) textobject_query: OnceCell<Option<TextObjectQuery>>,
}

#[derive(Debug, Serialize, Deserialize)]
@@ -84,6 +86,32 @@ pub struct IndentQuery {
pub outdent: HashSet<String>,
}

#[derive(Debug)]
pub struct TextObjectQuery {
    pub query: Query,
}

impl TextObjectQuery {
    /// Run the query on the given node and return sub nodes which match given
    /// capture ("function.inside", "class.around", etc).
    pub fn capture_nodes<'a>(
        &'a self,
        capture_name: &str,
        node: Node<'a>,
        slice: RopeSlice<'a>,
        cursor: &'a mut QueryCursor,
    ) -> Option<impl Iterator<Item = Node<'a>>> {
        let capture_idx = self.query.capture_index_for_name(capture_name)?;
        let captures = cursor.captures(&self.query, node, RopeProvider(slice));

        captures
            .filter_map(move |(mat, idx)| {
                (mat.captures[idx].index == capture_idx).then(|| mat.captures[idx].node)
            })
            .into()
    }
}
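The filtering step in `capture_nodes` can be tried without the tree-sitter crate. The sketch below is only a model under stated assumptions: `Capture`, `Match` and `nodes_for_capture` are hypothetical stand-ins invented for illustration, and nodes are reduced to string labels. It mirrors the `(mat, idx)` destructuring shown in the diff, where `idx` is the position of the capture being visited within the match.

```rust
/// Hypothetical stand-in for one tree-sitter capture: the index of the
/// capture name in the query, plus the captured node (a label here).
struct Capture {
    index: u32,
    node: &'static str,
}

/// Hypothetical stand-in for a query match holding all of its captures.
struct Match {
    captures: Vec<Capture>,
}

/// Yield the nodes whose capture index equals `wanted`.
/// Each input item pairs a match with the position of the capture being
/// visited, like the `(mat, idx)` tuples in `capture_nodes`.
fn nodes_for_capture<'a>(
    items: &'a [(Match, usize)],
    wanted: u32,
) -> impl Iterator<Item = &'static str> + 'a {
    items.iter().filter_map(move |(mat, idx)| {
        let cap = &mat.captures[*idx];
        // `bool::then` turns the comparison into Some(node) or None.
        (cap.index == wanted).then(|| cap.node)
    })
}
```

Passing the index that stands for `function.inside` keeps only the body nodes, and an index that never occurs yields an empty iterator rather than an error.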

fn load_runtime_file(language: &str, filename: &str) -> Result<String, std::io::Error> {
let path = crate::RUNTIME_DIR
.join("queries")
@@ -132,7 +160,6 @@ impl LanguageConfiguration {
// highlights_query += "\n(ERROR) @error";

let injections_query = read_query(&language, "injections.scm");

let locals_query = read_query(&language, "locals.scm");

if highlights_query.is_empty() {
@@ -182,6 +209,18 @@ impl LanguageConfiguration {
            .as_ref()
    }

    pub fn textobject_query(&self) -> Option<&TextObjectQuery> {
        self.textobject_query
            .get_or_init(|| -> Option<TextObjectQuery> {
                let lang_name = self.language_id.to_ascii_lowercase();
                let query_text = read_query(&lang_name, "textobjects.scm");
                let lang = self.highlight_config.get()?.as_ref()?.language;
                let query = Query::new(lang, &query_text).ok()?;
                Some(TextObjectQuery { query })
            })
            .as_ref()
    }

    pub fn scope(&self) -> &str {
        &self.scope
    }
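The caching pattern in `textobject_query` (an `OnceCell<Option<T>>` filled by `get_or_init`) can be sketched with the standard library alone. This is an illustration under assumptions, not the PR's code: it uses `std::cell::OnceCell` in place of whichever `OnceCell` the crate imports, and `CompiledQuery`, `LangConfig` and the "empty source fails" rule are hypothetical.

```rust
use std::cell::{Cell, OnceCell};

/// Hypothetical stand-in for a compiled tree-sitter query.
#[derive(Debug, PartialEq)]
struct CompiledQuery(String);

/// Hypothetical stand-in for `LanguageConfiguration`.
struct LangConfig {
    source: String,
    /// Counts how many times the initializer actually ran.
    loads: Cell<u32>,
    query: OnceCell<Option<CompiledQuery>>,
}

impl LangConfig {
    fn new(source: &str) -> Self {
        LangConfig {
            source: source.to_string(),
            loads: Cell::new(0),
            query: OnceCell::new(),
        }
    }

    /// Build the query on first access and reuse the stored result afterwards.
    fn query(&self) -> Option<&CompiledQuery> {
        self.query
            .get_or_init(|| {
                self.loads.set(self.loads.get() + 1);
                // Assumption for this sketch: an empty source fails to compile.
                if self.source.is_empty() {
                    None
                } else {
                    Some(CompiledQuery(self.source.clone()))
                }
            })
            .as_ref()
    }
}
```

In this model the initializer runs once per configuration, and a failed attempt is stored as `None` just like a success, so later calls do not retry.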
51 changes: 51 additions & 0 deletions helix-core/src/textobject.rs
@@ -1,9 +1,13 @@
use std::fmt::Display;

use ropey::RopeSlice;
use tree_sitter::{Node, QueryCursor};

use crate::chars::{categorize_char, char_is_whitespace, CharCategory};
use crate::graphemes::next_grapheme_boundary;
use crate::movement::Direction;
use crate::surround;
use crate::syntax::LanguageConfiguration;
use crate::Range;

fn find_word_boundary(slice: RopeSlice, mut pos: usize, direction: Direction) -> usize {
@@ -51,6 +55,15 @@ pub enum TextObject {
Inside,
}

impl Display for TextObject {
    fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
        f.write_str(match self {
            Self::Around => "around",
            Self::Inside => "inside",
        })
    }
}
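The `Display` impl exists so that the variant can be spliced into a capture name with `format!`. A minimal standalone sketch follows; the enum is redeclared so the snippet compiles on its own, and `capture_name` is a hypothetical helper that does not appear in the PR.

```rust
use std::fmt;

/// Redeclared here so the sketch is self-contained.
#[derive(Debug, Clone, Copy)]
enum TextObject {
    Around,
    Inside,
}

impl fmt::Display for TextObject {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        f.write_str(match self {
            Self::Around => "around",
            Self::Inside => "inside",
        })
    }
}

/// Hypothetical helper: join the object base name and the variant into the
/// capture name that a `textobjects.scm` file is expected to define.
fn capture_name(object_name: &str, textobject: TextObject) -> String {
    format!("{}.{}", object_name, textobject)
}
```

With this, `capture_name("function", TextObject::Inside)` produces `function.inside`, which matches the capture names listed in the guide.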

// count doesn't do anything yet
pub fn textobject_word(
slice: RopeSlice,
@@ -108,6 +121,44 @@ pub fn textobject_surround(
.unwrap_or(range)
}

/// Transform the given range to select text objects based on tree-sitter.
/// `object_name` is a query capture base name like "function", "class", etc.
/// `slice_tree` is the tree-sitter node corresponding to given text slice.
pub fn textobject_treesitter(
    slice: RopeSlice,
    range: Range,
    textobject: TextObject,
    object_name: &str,
    slice_tree: Node,
    lang_config: &LanguageConfiguration,
    _count: usize,
) -> Range {
    let get_range = move || -> Option<Range> {
        let byte_pos = slice.char_to_byte(range.cursor(slice));

        let capture_name = format!("{}.{}", object_name, textobject); // e.g. function.inside
        let mut cursor = QueryCursor::new();
        let node = lang_config
            .textobject_query()?
            .capture_nodes(&capture_name, slice_tree, slice, &mut cursor)?
            .filter(|node| node.byte_range().contains(&byte_pos))
            .min_by_key(|node| node.byte_range().len())?;

        let len = slice.len_bytes();
        let start_byte = node.start_byte();
        let end_byte = node.end_byte();
        if start_byte >= len || end_byte >= len {
            return None;
        }

        let start_char = slice.byte_to_char(start_byte);
        let end_char = slice.byte_to_char(end_byte);

        Some(Range::new(start_char, end_char))
    };
    get_range().unwrap_or(range)
}
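The node selection in `textobject_treesitter` (keep the captured nodes that contain the cursor, then take the one with the shortest byte range) can be modeled with plain byte ranges. The sketch below is a simplification under assumptions: `innermost_containing` is a hypothetical name, and nodes are reduced to `std::ops::Range<usize>` values.

```rust
use std::ops::Range;

/// Return the shortest range that contains `pos`, or `None` if no range does.
/// Ranges are half-open, so a position equal to `end` is not contained.
fn innermost_containing(ranges: &[Range<usize>], pos: usize) -> Option<Range<usize>> {
    ranges
        .iter()
        .filter(|r| r.contains(&pos))
        .min_by_key(|r| r.len())
        .cloned()
}
```

With nested functions this picks the innermost one under the cursor. When nothing contains the cursor the result is `None`, which corresponds to the `unwrap_or(range)` fallback that leaves the selection unchanged.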

#[cfg(test)]
mod test {
use super::TextObject::*;
19 changes: 19 additions & 0 deletions helix-term/src/commands.rs
@@ -4160,9 +4160,28 @@ fn select_textobject(cx: &mut Context, objtype: textobject::TextObject) {
let (view, doc) = current!(cx.editor);
let text = doc.text().slice(..);

let textobject_treesitter = |obj_name: &str, range: Range| -> Range {
let (lang_config, syntax) = match doc.language_config().zip(doc.syntax()) {
Some(t) => t,
None => return range,
};
textobject::textobject_treesitter(
text,
range,
objtype,
obj_name,
syntax.tree().root_node(),
lang_config,
count,
)
};

let selection = doc.selection(view.id).clone().transform(|range| {
match ch {
'w' => textobject::textobject_word(text, range, objtype, count),
'c' => textobject_treesitter("class", range),
Member

I still think we want a better name than class here, maybe type?

Member Author

I left a comment up here about the naming, and I think we should aim for an intuitive name. I think "type" might be a bit misleading ? Though I guess the same can be said for "class" :|

'f' => textobject_treesitter("function", range),
'p' => textobject_treesitter("parameter", range),
// TODO: cancel new ranges if inconsistent surround matches across lines
ch if !ch.is_ascii_alphanumeric() => {
textobject::textobject_surround(text, range, objtype, ch, count)
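The closure in `select_textobject` uses `Option::zip` to require both a language configuration and a syntax tree before doing any work. A small sketch of that guard, with hypothetical names (`select_or_keep`, string placeholders for the two optional values) and a placeholder transformation standing in for the call to `textobject::textobject_treesitter`:

```rust
/// Return a transformed range only when both inputs are present;
/// otherwise hand the input range back unchanged.
fn select_or_keep(
    lang_config: Option<&str>,
    syntax: Option<&str>,
    range: (usize, usize),
) -> (usize, usize) {
    // `zip` yields Some((a, b)) only if both options are Some.
    let (_lang_config, _syntax) = match lang_config.zip(syntax) {
        Some(pair) => pair,
        None => return range,
    };
    // Placeholder transformation, purely for illustration.
    (range.0, range.1 + 1)
}
```

The early return means documents without a grammar, or without a parsed tree, keep their current selection instead of failing.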
21 changes: 21 additions & 0 deletions runtime/queries/go/textobjects.scm
@@ -0,0 +1,21 @@
(function_declaration
  body: (block)? @function.inside) @function.around

(func_literal
  (_)? @function.inside) @function.around

(method_declaration
  body: (block)? @function.inside) @function.around

;; struct and interface declaration as class textobject?
(type_declaration
  (type_spec (type_identifier) (struct_type (field_declaration_list (_)?) @class.inside))) @class.around

(type_declaration
  (type_spec (type_identifier) (interface_type (method_spec_list (_)?) @class.inside))) @class.around

(parameter_list
  (_) @parameter.inside)

(argument_list
  (_) @parameter.inside)
14 changes: 14 additions & 0 deletions runtime/queries/python/textobjects.scm
@@ -0,0 +1,14 @@
(function_definition
  body: (block)? @function.inside) @function.around

(class_definition
  body: (block)? @class.inside) @class.around

(parameters
  (_) @parameter.inside)

(lambda_parameters
  (_) @parameter.inside)

(argument_list
  (_) @parameter.inside)
26 changes: 26 additions & 0 deletions runtime/queries/rust/textobjects.scm
@@ -0,0 +1,26 @@
(function_item
  body: (_) @function.inside) @function.around

(struct_item
  body: (_) @class.inside) @class.around

(enum_item
  body: (_) @class.inside) @class.around

(union_item
  body: (_) @class.inside) @class.around

(trait_item
  body: (_) @class.inside) @class.around

(impl_item
  body: (_) @class.inside) @class.around

(parameters
  (_) @parameter.inside)

(closure_parameters
  (_) @parameter.inside)

(arguments
  (_) @parameter.inside)