Language Injection? #13

Closed
colinxs opened this issue May 20, 2021 · 14 comments · Fixed by #31

Comments

@colinxs

colinxs commented May 20, 2021

What do you think about adding language injection for string and indented_string to get syntax highlighting for embedded scripts? Here's what it looks like for a bash string:
image

That's with [ (string) (indented_string) ] @bash in injections.scm. I'm not sure how to auto-detect the language however. Looking at the docs that's what @language is for, but unlike e.g. Markdown there's no node with the language name. Is it possible to match based off of something like a shebang in the string?
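
As a rough illustration of the shebang idea (independent of any query language, and not something the injection queries discussed here can express on their own), the following Python sketch maps the first line of a string to a language name. The function name `language_from_shebang` and the `INTERPRETERS` table are assumptions made up for this example.

```python
import posixpath
from typing import Optional

# Hypothetical interpreter-to-language table; a real editor would ship its own.
INTERPRETERS = {
    "sh": "bash",
    "bash": "bash",
    "python": "python",
    "python3": "python",
    "perl": "perl",
}


def language_from_shebang(text: str) -> Optional[str]:
    """Return a language name if the string starts with a known shebang."""
    # Indented strings usually begin with a newline and indentation,
    # so leading whitespace is skipped before looking for "#!".
    stripped = text.lstrip()
    first_line = stripped.splitlines()[0] if stripped else ""
    if not first_line.startswith("#!"):
        return None
    words = first_line[2:].split()
    if not words:
        return None
    # "#!/usr/bin/env python3" names the interpreter in the second word.
    if posixpath.basename(words[0]) == "env" and len(words) > 1:
        words = words[1:]
    return INTERPRETERS.get(posixpath.basename(words[0]))
```

A string with no shebang, or with an interpreter missing from the table, yields `None`, in which case the caller would keep plain string highlighting.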

@rummik

rummik commented Jun 27, 2021

A couple years ago I started using a comment prefix on indented/multiline strings, which borrows an idea from Markdown where you tag the language being used. I've been meaning to try and start a discussion about it to hopefully standardize on something.

This is what it looks like:
image

You can take a look at the PR at LnL7/vim-nix#28

@oxalica
Contributor

oxalica commented Oct 15, 2021

For anyone interested, I implemented bash injections for mkDerivation's phase scripts and hooks in my fork for nvim-treesitter, which would automatically just work in most packages.
Since I also updated many grammar rules, if you want to try it, you need to recompile the parser from my fork and copy queries/nvim-* into queries/nix/* on the nvim runtime path to override nvim-treesitter's packaged queries.

@cstrahan
Collaborator

@oxalica Could I pull in your injections under this project's license (MIT)? That looks pretty slick.

@oxalica
Contributor

oxalica commented Mar 18, 2022

> @oxalica Could I pull in your injections under this project's license (MIT)? That looks pretty slick.

Sure, thanks. But I made some changes to the grammar rules in my fork and I'm not sure the injection scm is fully compatible. Maybe some manual tests should be done.

@nrdxp

nrdxp commented May 18, 2022

FWIW, I tried to use @oxalica's injections without recompiling from his branch and got a runtime error in Helix, so it'll most likely be necessary to pull his grammar changes as well.

@nrdxp

nrdxp commented Aug 28, 2022

Is it possible in tree-sitter to simply try to parse a string as a given language and fall back if it fails? We could then just add a segment to parse '' '' style strings as a series of languages until one succeeds (say bash, python, perl, etc.), falling back to just highlighting as a string if none do.
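
The "try each language until one succeeds" idea can be sketched on the consumer side roughly as follows. This is only an illustration under stated assumptions: the standard-library parsers `json.loads` and `ast.parse` stand in for real grammars, and `guess_language` and `CANDIDATES` are hypothetical names. Tree-sitter parsers are error-tolerant and still return a tree for invalid input, so a real implementation would need its own definition of "failure" (for example, the presence of error nodes).

```python
import ast
import json
from typing import Callable, Optional, Sequence, Tuple

Parser = Callable[[str], object]

# Stand-in parsers from the standard library. Order matters: the first
# parser that accepts the text wins, so stricter formats go first.
CANDIDATES: Sequence[Tuple[str, Parser]] = (
    ("json", json.loads),
    ("python", ast.parse),
)


def guess_language(
    text: str, candidates: Sequence[Tuple[str, Parser]] = CANDIDATES
) -> Optional[str]:
    """Return the first language whose parser accepts `text`, else None."""
    for name, parse in candidates:
        try:
            parse(text)
        except (SyntaxError, ValueError):
            # This parser rejected the text; try the next candidate.
            continue
        return name
    # Nothing matched: the caller falls back to plain string highlighting.
    return None
```

One weakness of the approach shows up even in this toy version: permissive parsers accept a lot (an empty string is valid Python here), so the candidate order strongly affects the result.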

@the-mikedavis
Contributor

Tree-sitter itself doesn't have a built-in way to do that. It would be up to the consumer (editor/linter/etc) to implement it.

@nrdxp

nrdxp commented Aug 30, 2022

Gave a go of this in helix-editor/helix#3594

nrdxp added a commit to nrdxp/tree-sitter-nix that referenced this issue Sep 1, 2022
nrdxp added a commit to nrdxp/tree-sitter-nix that referenced this issue Sep 3, 2022
nrdxp added a commit to nrdxp/tree-sitter-nix that referenced this issue Sep 7, 2022
@rnhmjoj

rnhmjoj commented Dec 5, 2022

Can you tell me how you use this feature? I have tree-sitter-nix at 1b69cf1 and nvim +checkhealth shows nix supports injections, but I don't see any difference.

@nrdxp

nrdxp commented Dec 6, 2022

@rnhmjoj, I haven't used neovim for a number of years, but the queries themselves would actually have to be used by it to be useful. Not sure how that is handled there, but in Helix (what I use now), they maintain their own in-tree queries and don't use upstream, so perhaps something like that is going on?

For example, I have a PR open to inject based on shebang which isn't even possible with tree-sitter proper, but because of their model of maintaining their own implementation and queries it is possible in Helix specifically.

@rnhmjoj

rnhmjoj commented Dec 9, 2022

> they maintain their own in-tree queries and don't use upstream, so perhaps something like that is going on?

Indeed, they maintain their own queries at nvim-treesitter. The injections for Nix have only been recently added and are not in NixOS 22.11, yet. See nvim-treesitter/nvim-treesitter#3842

Anyway, I looked into this and I noticed two problems:

  1. I think the injection for runCommand is wrong, or at least it's not matching anything. I fixed it with this change:

      (apply_expression
       (apply_expression
         function: (apply_expression function: (_) @_func))
       argument: [
         (string_expression (string_fragment) @bash)
         (indented_string_expression (string_fragment) @bash)
       ]
       (#match? @_func "(^|\\.)runCommand(((No)?CC)?(Local)?)?$"))
  2. Neovim seems to use different capture names for setting the language based on a match:
    @language instead of @injection.language, and @content instead of @injection.content. Also, injection.combined is not supported.
    I tried the query below, but it doesn't work either, too bad.

      (((comment) @language)
       (indented_string_expression
         (string_fragment) @content))
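
To see which function names the `#match?` pattern in the first query is meant to accept, the regular expression can be tried in isolation. The Python sketch below assumes that the predicate uses Perl-style regex semantics and that the query string's `\\.` unescapes to `\.`; editors differ in the regex engine they use, so treat this as an approximation. The helper name `is_run_command` is made up for the example.

```python
import re

# The pattern from the query, after unescaping the query string literal.
RUN_COMMAND = re.compile(r"(^|\.)runCommand(((No)?CC)?(Local)?)?$")


def is_run_command(func_text: str) -> bool:
    """True if the function text names runCommand or one of its variants."""
    return RUN_COMMAND.search(func_text) is not None


accepted = ["runCommand", "pkgs.runCommand", "runCommandCC",
            "runCommandNoCC", "runCommandLocal", "runCommandNoCCLocal"]
rejected = ["myrunCommand", "runCommandFoo", "runCommandNo"]

for name in accepted:
    assert is_run_command(name)
for name in rejected:
    assert not is_run_command(name)
```

The `(^|\.)` prefix is what allows both the bare name and an attribute path such as `pkgs.runCommand`, while rejecting names that merely end in `runCommand`.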
    

@oxalica
Contributor

oxalica commented Dec 9, 2022

@rnhmjoj

> Anyway, I looked into this and I noticed two problems:

I suggest you create a PR directly to nvim-treesitter if you find anything worthwhile, since the query is maintained by them.

Up to now, every editor has its own query dialect, like the @injection.language (helix) vs. @language (nvim-treesitter) difference you've noticed. Tree-sitter has one official dialect, nvim-treesitter has its own, helix has another, and I guess emacs treesitter created one as well. Different query dialects even have different priorities (official tree-sitter follows former-override-latter, while nvim-treesitter follows latter-override-former), so you sometimes need to reverse the whole query file to make it work.
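
The priority difference described above can be modelled with a toy example. This Python sketch is not how any editor implements queries: the "query file" is reduced to an ordered list of (substring, language) rules, and `resolve` and the rule contents are hypothetical. It only shows why a file ordered for one precedence rule has to be reversed for the other.

```python
from typing import List, Optional, Tuple

Rule = Tuple[str, str]  # (substring to look for, language to inject)


def resolve(rules: List[Rule], text: str, last_wins: bool) -> Optional[str]:
    """Pick a language among all matching rules using the given precedence."""
    matches = [lang for needle, lang in rules if needle in text]
    if not matches:
        return None
    return matches[-1] if last_wins else matches[0]


# Written for "former overrides latter": the specific rule comes first.
rules = [("writePython", "python"), ("write", "text")]

first_wins = resolve(rules, "writePython3", last_wins=False)
last_wins = resolve(rules, "writePython3", last_wins=True)
reversed_last_wins = resolve(list(reversed(rules)), "writePython3", last_wins=True)

assert first_wins == "python"          # intended result
assert last_wins == "text"             # same file, other precedence: generic rule wins
assert reversed_last_wins == "python"  # reversing the file restores the intent
```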

I think it's too hard to maintain every query dialect in this repo. I'm also tired of the fragmentation.

@tomberek

Can/should this also have something supporting a single-line comment:

# lua
''
... some lua code
''

rather than only /* lua */ block-style comment?
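
For illustration, accepting both comment styles as a language tag could look like the Python sketch below. It is an assumption-laden model rather than a query: the name `language_from_comment` is hypothetical, and the rule "the comment must consist of exactly one word" is a choice made for this example.

```python
import re
from typing import Optional

# Accepts "# lua" and "/* lua */"; the comment body must be a single word.
TAG = re.compile(
    r"^(?:#\s*(?P<line>[A-Za-z0-9_+-]+)"
    r"|/\*\s*(?P<block>[A-Za-z0-9_+-]+)\s*\*/)$"
)


def language_from_comment(comment: str) -> Optional[str]:
    """Return the language tag carried by a comment, or None."""
    match = TAG.match(comment.strip())
    if match is None:
        return None
    return (match.group("line") or match.group("block")).lower()
```

A single-word comment such as `# TODO` would also be read as a tag under this rule, so a real implementation would probably check the result against the set of languages it actually knows.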

@nrdxp

nrdxp commented Apr 16, 2024

> Can/should this also have something supporting a single-line comment:

It works that way in helix (you might be able to steal some of the queries from downstream). FWIW, I also wrote as many intelligent queries as I could to catch as many cases as possible without having to manually comment anything. For example, in:

pkgs.writeText "foo.py" ''
  # some python
  ''

This would detect the filetype based on the filename extension in the first argument and correctly highlight the string as python. I even extended the tree-sitter protocol implemented in helix somewhat to allow highlighting strings based on shebangs, which is not an upstream feature.
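
The filename-based detection can be pictured with a small Python sketch. It does not reproduce the Helix queries; the `EXTENSIONS` table and the function name `language_from_filename` are assumptions for the example, and it only shows the mapping from the first argument (such as "foo.py") to a language.

```python
import posixpath
from typing import Optional

# Hypothetical extension table; an editor would derive this from its
# own language configuration.
EXTENSIONS = {
    ".py": "python",
    ".sh": "bash",
    ".lua": "lua",
    ".pl": "perl",
}


def language_from_filename(name: str) -> Optional[str]:
    """Map a file name such as "foo.py" to a language via its extension."""
    _, ext = posixpath.splitext(name)
    return EXTENSIONS.get(ext.lower())
```

Names without an extension, or with an unknown one, return `None`, leaving room for other heuristics such as a shebang or a comment tag.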

I'm not sure about other editors though, probably not as thorough.
