Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix all uses of String.prototype.substr #408

Open
hildjj opened this issue Apr 6, 2023 · 5 comments
Open

Fix all uses of String.prototype.substr #408

hildjj opened this issue Apr 6, 2023 · 5 comments

Comments

@hildjj
Copy link
Contributor

hildjj commented Apr 6, 2023

See the deprecation warning on MDN.

It looks like we can substitute String.prototype.slice, including in the generated code. Think about edge cases of NaN, negative, and swapped start and end in all cases.

@hildjj hildjj added the blocks-release Blocks an imminent release. High Priorty. label Apr 6, 2023
@reverofevil
Copy link

Why? Half the web depends on it, and it's often more convenient for parsers than slice or substring. At the very least it shouldn't block any releases.

@hildjj
Copy link
Contributor Author

hildjj commented Apr 7, 2023

I'm not sure why it's more convenient than slice or substring? Can you say more?

@reverofevil
Copy link

reverofevil commented Apr 7, 2023

To check that a contains b starting with a certain position (with correct handling of potential end of string) is usually done as a.substr(peg$pos, b.length) === b. The other two methods take the end position instead of length, so with no substr it should be computed as peg$pos + b.length for no good reason.

As I understand, substr will be there forever in JS implementations along with ordering of fields in objects, <marquee> tag, with operator and quirky behavior of sort, because

  • otherwise early websites would stop operating;
  • there is no graceful degradation in case it ceases to exist;
  • removing it from an implementation of a JS engine produces barely any profit;
  • prototypes of globals can't really change in strict mode.

TLDR I don't see how following this deprecation might improve anything.

@markw65
Copy link

markw65 commented Sep 2, 2023

Not sure if this was just abandoned on the grounds that substr will never be removed, but I did some investigation.

  • the only place substr is used now is in generated code (and one reference from docs/vendor/codemirror)
  • if I add a helper function, peg$substr at the same level as peg$currPos that just takes a length parameter, and returns input.slice(peg$currPos, peg$currPos + length), and use that instead of input.substr everywhere, the generated parser is smaller (peg$substr(5) vs input.substr(peg$currPos, 5)), and I can't measure a difference in performance in any of the parsers that I've generated (and I've tried with parser/input combos that take hundreds of ms).
  • I wrote a small test case that sets up something comparable to the parser, and just calls my new substr, or input.substr in a loop, and there I see a very small slowdown. Maybe as much as 10%, but there's so much noise it's hard to tell (on some runs my helper is actually faster, but its slower more often than not).

In summary, it seems like switching over is easy, there's a code size win (not a big deal), and maybe a very small performance penalty, but not measurable in real cases.

Here's the test case, in case I'm making a systematic error of some sort:

const fs = require("node:fs/promises");

function wrapper(input) {
  let currPos = 0;
  function substr(n) {
    return input.slice(currPos, currPos + n);
  }

  function parseOld() {
    currPos = 0;
    while (input.substr(currPos, 5) !== "") {
      currPos += 5;
    }
  }
  function parseNew() {
    currPos = 0;
    while (substr(5) !== "") {
      currPos += 5;
    }
  }
  return [parseOld, parseNew];
}

fs.readFile("<a 20k text file>").then(
  (data) => {
    const [parseOld, parseNew] = wrapper(data.toString());
    function test(testFn, which) {
      const start = Date.now();
      for (let i = 0; i < 10000; i++) {
        testFn();
      }
      const end = Date.now();
      console.log(`${which} Took ${end - start}ms`);
    }
    test(parseOld, "parseOld");
    test(parseNew, "parseNew");
  }
);

@mikeaustin
Copy link

I agree that substr() will live forever, but there are ways of writing "modern" JavaScript. .slice() is mirrored by Array, which makes it more consistent. You can still use "var" and "==" all you want, but I wouldn't allow it in a recent PR. I think modernizing a code-base is a good goal to have. It's just confusing to have 3 functions that basically do the same thing.

https://stackoverflow.com/questions/2243824/what-is-the-difference-between-string-slice-and-string-substring

@hildjj hildjj removed the blocks-release Blocks an imminent release. High Priorty. label Feb 13, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants