Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Is it ok to reuse Prism object when loading multiple languages? #1393

Closed
CoenraadS opened this issue Apr 15, 2018 · 14 comments
Closed

Is it ok to reuse Prism object when loading multiple languages? #1393

CoenraadS opened this issue Apr 15, 2018 · 14 comments
Assignees
Labels

Comments

@CoenraadS
Copy link

Hey

I noticed that if I do:

loadLanguages(["javascript"]);

and then later loadLanguages(["markup", "javascript"]);

Then the second one won't parse the javascript. However if ["markup", "javascript"] is loaded first it works correctly.

@Golmote
Copy link
Contributor

Golmote commented Apr 15, 2018

Hi, can you provide a full runnable snippet to illustrate the issue please? I'm not sure I get it.

@CoenraadS
Copy link
Author

var Prism = require("prismjs/components/prism-core.js");
var loadLanguages = require("prismjs/components/index.js");

loadLanguages("javascript"); // This line causes script parsing to break in the following html file
loadLanguages(["markup", "javascript"]);

var someFile = 
`
<!DOCTYPE html>
<html lang="en"> 
  <head>
    <script>
     export default 
     {
       v
     }
    </script>
</html>
`

var tokenized = Prism.tokenize(someFile, Prism.languages["markup"]);
console.log(tokenized);

@Golmote
Copy link
Contributor

Golmote commented Apr 15, 2018

Oh, OK I see now. The JS components adds support for JS in Markup only if the Markup component is loaded before the JS component is. Hence, when you first load the JS component, the Markup component is not loaded and the support is not added. When you then load Markup + JS, it too late because of require()'s cache... and the JS component won't be executed again.
We should find a way to invalidate the cache here.

@mAAdhaTTah
Copy link
Member

Basic idea is this:

delete require.cache[require.resolve(pathToLanguage)]

Would make a Good First Issue™.

@Golmote
Copy link
Contributor

Golmote commented Apr 16, 2018

Should we do this on every call? Or should we provide a second parameter to force the cache invalidation?
First one sounds quite inefficient. But the second one requires some knowledge of the inner behaviour of the components...

@CoenraadS
Copy link
Author

If its done that way, will previously loaded languages still work?

@Golmote
Copy link
Contributor

Golmote commented Apr 17, 2018

Yes, they should. I don't think reloading a component can break any previously loaded component.

@mAAdhaTTah
Copy link
Member

I don't think it's that inefficient. We're not loading that many files. Users also probably shouldn't be loading the same language multiple times anyway. Maybe document all of this and just invalidate every time?

Adding a second parameter seems like the type of thing people will just throw in when things aren't working.

@Golmote
Copy link
Contributor

Golmote commented Apr 17, 2018

Alright, agreed.

@Golmote
Copy link
Contributor

Golmote commented Apr 22, 2018

I don't think reloading a component can break any previously loaded component.

Actually, I am wrong about this.

Consider we do implement the cache clearing, and a user wants to highlight JS in HTML:

This will work, obviously:

loadLanguages(['markup', 'javascript']);

This will work too, thanks to the cache being cleared:

loadLanguages('javascript'); loadLanguages(['markup', 'javascript']);

But this will break, because of the cache being cleared...

loadLanguages(['markup', 'javascript']); loadLanguages('markup');

Slightly more subtle, this won't work either, because markdown requires markup and thus refreshes it:

loadLanguages(['markup', 'javascript']); loadLanguages('markdown');

I'm really reconsidering the idea of the second parameter...

@mAAdhaTTah
Copy link
Member

Wow, nice catch.

If we wanted to still avoid said second parameter, we'd have to resolve them in the loadLanguages function, e.g. when we do this:

loadLanguages(['markup', 'javascript']); loadLanguages('markup');

We'd have to reload javascript after we've loaded markup the second time.

Is that "optional dependency" made explicit anywhere? We could add optional dependencies to components.js, and reload them conditionally based on whether they've been loaded already or not.

It's not ideal, from an implementation perspective, but I'd much rather pull that complexity into Prism rather than expecting users to understand whether or not they should be invalidating the cache. Understanding whether to do so requires users to understand Prism's internals, which I'd like to avoid even at the expense of making loadLanguages more complex.

@Golmote Golmote self-assigned this Apr 28, 2018
@Golmote
Copy link
Contributor

Golmote commented May 1, 2018

Ok, I dug into this.

Here is the list of the "optional dependencies" I've come up with:

Language Optional dependencies
CSS Markup
JavaScript Markup
ActionScript Markup
Django CSS, JavaScript
Haml CSS, CoffeeScript, ERB, JavaScript, Less, Markdown, Ruby, SCSS, Textile
HTTP JavaScript, Markup
OpenCL C, C++ (***)
Pug CoffeeScript, EJS, Handlebars, (Hogan), Less, LiveScript, Markdown, (Mustache), (Plates), SCSS, Stylus, (Swig)
Pure C, C++, Fortran, (ATS), (DSP)
Textile CSS

(*** This is a weird one: OpenCL extends C++ which extends C, but it also add keywords to both languages. Apparently this was done so that C/C++ code blocks could use the OpenCL host API and it would be properly highlighted. The thing that worries me the most here is the name optionalDependencies... since they are not optional at all in that specific case.)

As you can see, this is not such a long list. Yet, it may have huge impacts on the loading order.

One of the worst case scenario I've found is:

loadLanguages(['actionscript', 'http', 'textile', 'django', 'haml', 'css', 'javascript', 'markup', 'xeora']);

which will lead to the following files being actually required:

Click to expand the "graph".
  • c-like
    • javascript
      • actionscript
  • http
  • markup
    • -> reload dependents
      • javascript
      • actionscript
      • http
    • textile
  • markup
    • -> reload dependents
      • javascript
      • actionscript
      • http
    • django
  • c-like
    • ruby
      • haml
  • css
    • -> reload dependents
      • django
      • haml
      • textile
        • -> reload dependents
          • haml
  • c-like
    • javascript
      • -> reload dependents
        • django
        • haml
        • http
  • markup
    • -> reload dependents
      • css
        • -> reload dependents
          • django
          • haml
          • textile
            • -> reload dependents
              • haml
      • javascript
        • -> reload dependents
          • django
          • haml
      • actionscript
      • http
  • markup
    • -> reload dependents
      • css
        • -> reload dependents
          • django
          • haml
          • textile
            • -> reload dependents
              • haml
      • javascript
        • -> reload dependents
          • django
          • haml
      • actionscript
      • http
    • xeora

(That's 50 files require'd, for 9 languages)

Obviously, calling loadLanguages() without paremeters also leads to a huge amounts of dependencies. (413 files required, for our 148 languages)

Am I overthinking an edge case?
Do you have alternative names in mind, for the OpenCL case?

@mAAdhaTTah
Copy link
Member

mAAdhaTTah commented May 1, 2018

I don't know if that's a big deal for node applications. A standard node_modules is pretty huge and has a ton of files, you're often installing a couple hundred dependencies and require'ing a boatload of files on startup.

Are people going to be re-require'ing language definitions often enough for this to be a problem? I'm inclined to think the surprise of it not working as expected and needing to dig into why would be worse than tracking down why so many files are loaded. That's also an easier problem to solve: load up all the required languages up front, then you don't get slowdown while your app is running.

Edit: That said, I'm not writing a lot of node applications, so I don't have a lot of experience here to lean on.

@CoenraadS
Copy link
Author

CoenraadS commented May 1, 2018

Loading all languages is ~300ms for me (on demand loading was ~4ms). I can live with this since its a one time cost if everything else works correctly. If I'm the only one encountering this issue then you guys don't need to follow this up any further.

Previously I was calling loadLanguages('language') for every document the user opened, so it would be worse for my current situation if that got slower. One time slow or multiple times fast, but not multiple times slow 👍

Maybe when the node global issue is fixed, you can just keep a dictionary of languages and prism objects, and not worry about them conflicting since each object is independent.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

3 participants