Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Input mapping is buggy #310

Open
gibson042 opened this issue Jan 15, 2025 · 4 comments · May be fixed by #311
Open

Input mapping is buggy #310

gibson042 opened this issue Jan 15, 2025 · 4 comments · May be fixed by #311

Comments

@gibson042
Copy link

This package commits the cardinal sin of escaping: failing to fully exempt non-escaped input from unescaping (e.g., escapeKey escapes each . as \. but fails to escape \ itself). As a result, there are some injection bugs: https://jsfiddle.net/euqLwk4r/

input

({
  a: [ "/'a'[0]: string that becomes a regex/" ],
  'a.0': /'a.0': regex that becomes a string/,
  'b.0': "/'b.0': string that becomes a regex/",
  'b\\': [ /'b\\'[0]: regex that becomes a string/ ],
})

encodes to

{
  "json": {
    "a": [ "/'a'[0]: string that becomes a regex/" ],
    "a.0": "/'a.0': regex that becomes a string/",
    "b.0": "/'b.0': string that becomes a regex/",
    "b\\": [ "/'b\\\\'[0]: regex that becomes a string/" ]
  },
  "meta": {
    "values": {
      "a.0": [ "regexp" ],
      "b\\.0": [ "regexp" ]
    }
  }
}

which decodes as the input-dissimilar

({
  a: [ /'a'[0]: string that becomes a regex/ ],
  'a.0': "/'a.0': regex that becomes a string/",
  'b.0': /'b.0': string that becomes a regex/,
  'b\\': [ "/'b\\\\'[0]: regex that becomes a string/" ],
})
@Skn0tt
Copy link
Collaborator

Skn0tt commented Jan 15, 2025

Thanks for the report, this makes sense to me. You're the first in quite a long time to report this, so at least it's not a giant impact. Do you see a way for us to fix this without breaking changes?

@gibson042
Copy link
Author

You're the first in quite a long time to report this, so at least it's not a giant impact.

So basically, no one is using this with data in which a property name contains dots or backslashes and the value is not already JSON-compatible, which seems likely.

Do you see a way for us to fix this without breaking changes?

I think that depends upon what you consider breaking. But in the strictest sense, breakage seems necessary because a meta name of "a.0" MUST be applied to json.a[0] (even though it could have come from input["a.0"]) and a meta name of "b\.0" MUST be applied to json["b.0"] (even though it could have come from input["b\"]).

But if you're trying to preserve backwards compatibility such that unaffected { json, meta } records produced by an old version can still be correctly decoded by a new version that fixes this issue, that would be possible—any meta name containing at least one dot would be subject to an O(n²) search for which dots are part of property names vs. path traversal (for efficiency, presumably starting with the assumption that they all represent path traversal if an interpretation as the new style1 fails to find a json node) and any path segment containing consecutive backslashes would be subject to double interpretation (presumably starting with new-style \\\). It would also make sense to require that every property in meta always finds a json node, and further support opting into a strict mode that uses only the new interpretation. And no matter what, I strongly recommend fuzzing to find any inputs that fail to round-trip.

Footnotes

  1. suggestion for new style:

    const pathSegments = pathString.match(
      // Match any possibly-empty combination of (non-dot, backslash with
      // optional escapee) that follows either start-of-string or a
      // non-escaped dot (i.e., that does not follow an escaped dot).
      /(?<=^|(?:^|[^\\])(?:\\\\)*[.])(?:[^.\\]|\\.?)*/gs,
    ).map(segment => segment.replace(/\\(.?)/gs, (_, ch) => {
      // The only valid escapes are `\.` and `\\`.
      if (ch === "." || ch === "\\") return ch;
      throw Error(`invalid path`);
    }));
    

    Note that this specifically works for path strings that start or end with dots and/or backslashes, that contain invalid escapes (including "…\\\n…" <U+005C BACKSLASH, U+000A LINE FEED>), and that contain consecutive dots (i.e., empty path segments).

@Skn0tt Skn0tt linked a pull request Jan 17, 2025 that will close this issue
@Skn0tt
Copy link
Collaborator

Skn0tt commented Jan 17, 2025

Thanks for the recommendation. The second form of backwards compat is what i'm after, I don't want the gazillion of persisted SuperJSON strings to suddenly start failing. I'd also like to keep usage of this library simple and straight-forward, so having an opt-in for something of this bugs level of detail is a no-go. I think we can ensure backwards compatibility by introducing a version tag that toggles old vs new parsing behaviour. #311 implements that - could you give it a look?

@Skn0tt
Copy link
Collaborator

Skn0tt commented Jan 17, 2025

Fuzzing sounds like a great idea btw.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants