Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Generics with React TSX Language Leave plain-text Token Open #2594

Open
karlhorky opened this issue Oct 16, 2020 · 9 comments
Open

Generics with React TSX Language Leave plain-text Token Open #2594

karlhorky opened this issue Oct 16, 2020 · 9 comments

Comments

@karlhorky
Copy link
Contributor

Information:

  • Prism version: Test Drive version
  • Plugins: Test Drive plugins
  • Environment: Test Drive page

Description

The generic Foo<string> in the following code breaks syntax highlighting of the second function Add2 with the React TSX language:

function Add1(a, b) { return <div>a + b</div> }

type Bar = Foo<string>;

function Add2(a, b) { return <div>a + b</div> }

Example

Screenshots from Test Drive:

Screen Shot 2020-10-16 at 18 26 59

Screen Shot 2020-10-16 at 18 28 12

https://prismjs.com/test.html#language=tsx

@karlhorky
Copy link
Contributor Author

karlhorky commented Oct 16, 2020

Another example, this time with more common, real-world code (highlighting on handleChange function is broken):

import React, { MouseEvent, ChangeEvent, FormEvent } from 'react';

function handleSubmit(event: FormEvent<HTMLFormElement>) {
  event.preventDefault();
}

function handleChange(event: ChangeEvent<HTMLInputElement>) {
  console.log(event.target.value);
}

function handleClick(event: MouseEvent) {
  console.log(event.button);
}

export default function Form() {
  return (
    <form onSubmit={handleSubmit}>
      <input onChange={handleChange} placeholder="Name" />
      <button onClick={handleClick}></button>
    </form>
  );
}

Screen Shot 2020-10-16 at 18 34 31

@RunDevelopment
Copy link
Member

I looked into it and I don't see an easy fix for this issue.

I'm sorry to say this but this might be open for a while.

@karlhorky
Copy link
Contributor Author

karlhorky commented Oct 16, 2020

Yeah, I was thinking that it may not be an easy one to fix!

Wonder if this fact could help: there is no matching closing tag for generics but JSX requires a matching closing tag.

Was thinking in terms of regex lookahead, but maybe that would be expensive...?

@karlhorky
Copy link
Contributor Author

karlhorky commented Oct 16, 2020

Or, another idea: require a non-word character \W in front of a JSX tag - otherwise assume it's a generic.

It's rare to have text up against the left side of a starting tag in JSX - and it can and most often is formatted differently than this.

@RunDevelopment
Copy link
Member

Wonder if this fact could help: there is no matching closing tag for generics but JSX requires a matching closing tag.

Unfortunately, that won't work for two reasons.

Firstly, there are some tags that are self-closing as per HTML spec without the self-closing syntax (e.g. <br>). I don't know if JSX/TSX have some restrictions here and actually demand valid XML but I doubt it.

Secondly, nested tags. Let's look at <div><div></div></div>. Obviously, we can't stop after the first </div> but that means that we have to count the number of opening and closing divs which is impossible for regexes.

Or, another idea: require a non-letter character [^a-zA-Z] in front of a JSX tag - otherwise assume it's a generic.

We can actually go a step further here. JSX/TSX sections are compiled down to JS expressions, so know what the character before it has to look like. JS's regex pattern uses the same idea, so we could just copy the lookbehind.

However, this means that Prism wouldn't be able highlight beginner-level JSX code like <p>foo <strong>bar</strong></p> which is just sad.


My main idea on how to implement a fix is to not use regexes.

Instead of matching individual HTML tags, we match the whole Markup section. We would wrap each Markup section with its own token and then highlight tags within it. This would make a number of JSX things easier to implement, so I quite like the idea. The problem is that this just can't be done with regexes - no way. Hence #2595.

@karlhorky
Copy link
Contributor Author

there are some tags that are self-closing as per HTML spec without the self-closing syntax

I don't know if JSX/TSX have some restrictions here and actually demand valid XML but I doubt it.

Yes, JSX demands self closing tags to have the slash.

Secondly, nested tags

Yeah, this is why I was thinking it may be prohibitively expensive.

@karlhorky
Copy link
Contributor Author

karlhorky commented Oct 17, 2020

However, this means that Prism wouldn't be able highlight beginner-level JSX code like ...................... <p>foo <strong>bar</strong></p>

Actually, the example you posted above would work. There's a space before the opening strong tag, so Prism can be certain it's not a generic.

A broken example would look like:

<p>foo<strong>bar</strong></p>

In my experience of looking at code examples since years across many platforms (also beginner platforms like Stack Overflow), I have rarely if ever seen this pattern. I think it seems like a good trade-off if the user is using TSX - JSX and TypeScript together (I'm proposing to NOT change it for the JSX or TypeScript or JavaScript languages on their own) that they format this very uncommon pattern to this code to this equivalent code for proper highlighting (JSX strips the whitespace after new lines). Probably if a user is using all of these technologies, they are not a beginner anyway:

<p>
  foo
  <strong>bar</strong>
</p>

@karlhorky
Copy link
Contributor Author

Oh and super interesting about the context-free grammar issue you opened.

Regardless of which way this one request goes, I'll be watching #2595 :)

@RunDevelopment
Copy link
Member

Yes, JSX demands self closing tags to have the slash.

That's really good. It makes things a lot easier for us.

In my experience of looking at code examples since years across many platforms (also beginner platforms like Stack Overflow), I have rarely if ever seen this pattern. I think it seems like a good trade-off if the user is using TSX - JSX and TypeScript together (...) that they format this very uncommon pattern to this code to this equivalent code for proper highlighting (...).

I can only second that. Formatters like Prettier will automatically insert the line breaks you added, so that we will ever come across the one-liner in the wild is very unlikely indeed.
The main reason I'm somewhat against it is that it means that Prism's highlighting will be very whitespace-sensitive. The whitespace-sensitivity goes both ways: Foo <Bar> should be a simple generic type but it won't be highlighted as such.

Nevertheless, I'll make a quick PR for TSX with the lookbehind you proposed. The method isn't perfect but I think the tradeoffs are worth it. (Let's keep this issue open even after the PR is merged.)

PS. I also want to point out that inferred type parameters of function expressions (e.g. type Foo = <T>(x: T) => T) will most likely still be highlighted incorrectly even with a CF grammar.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants