Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor the parser somewhat #9776

Merged
merged 10 commits into from
Jan 27, 2024
Merged

Refactor the parser somewhat #9776

merged 10 commits into from
Jan 27, 2024

Conversation

pennae
Copy link
Contributor

@pennae pennae commented Jan 15, 2024

Motivation

the parser is currently not in great shape. it's tightly coupled to EvalState, it does more than just parsing, and it contains heaps of not entirely related code in the core parser.y file. let's refactor it a bit to make it easier to work with.

Context

Priorities and Process

Add 👍 to pull requests you find important.

The Nix maintainer team uses a GitHub project board to schedule and track reviews.

this is a proper subset of Formals anyway, so let's just use those and
avoid the extra allocations and moves.
since nix doesn't use the bison `error` terminal anywhere any invocation
of yyerror will immediately cause a failure. since we're *already*
leaking tons of memory whatever little bit bison allocates internally
doesn't much matter any more, and we'll be replacing the parser soon anyway.

coincidentally this now also matches the error behavior of URIs when
they are disabled or ~/ paths in pure eval mode, duplicate attr
detection etc.
all of them need access to parser state in some way. make them members
to allow this without fussing so much.
ParserState better describes what this struct really is. the parser
really does modify its state (most notably position and symbol tables),
so calling it that rather than obliquely "data" (which implies being
input only) makes sense.
most instances of this being used do not refer to the "current"
position, sometimes not even to one reasonably close by. it could also
be called `makePos` instead, but `at` seems clear in context.
most EvalState and Expr members defined here could be elsewhere, where
they'd be easier to maintain (not being embedded in a file with arcane
syntax) and *somewhat* more faithfully placed according to the path of
the file they're defined in.
there's no reason the parser itself should be doing semantic analysis
like bindVars. split this bit apart (retaining the previous name in
EvalState) and have the parser really do *only* parsing, decoupled from
EvalState.
these symbols are used a *lot*, so it makes sense to cache them. this
mostly increases clarity of the code (however clear one may wish to call
the parser desugaring here), but it also provides a small performance
benefit.
@pennae pennae requested a review from edolstra as a code owner January 15, 2024 15:53
src/libexpr/eval.cc Outdated Show resolved Hide resolved
Co-authored-by: John Ericson <[email protected]>
@Ericson2314 Ericson2314 changed the title refactor the parser somewhat Refactor the parser somewhat Jan 22, 2024
}

else if (hasPrefix(value, "flake:")) {
experimentalFeatureSettings.require(Xp::Flakes);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was not aware of this feature. (Ref: #7026)

@@ -20,6 +20,9 @@
#include "gc-small-vector.hh"
#include "url.hh"
#include "fetch-to-store.hh"
#include "tarball.hh"
#include "flake/flakeref.hh"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This oddity was already in the parser. Moving it here is directionally correct, further future work might refactor this more. It is for the NIX_PATH handling of "flake:something" URLs.

Copy link
Contributor

@tomberek tomberek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks to be exactly as advertised, refactoring the parser and separting concerns a bit better than before.

@thufschmitt
Copy link
Member

Discussed during the Nix maintainers meeting on 2024-01-22.
Idea approved, assigned to @Ericson2314 and @tomberek for further review

- Talked about @pennae's larger plans for a parser rewrite
  • Exciting stuff: correctness, big performance improvements too.

    • Often 20% faster (just parsing, to be clear)
    • Not more memory usage
  • @edolstra says in some ways this is similar to previous iterations of the parser implementation

  • @roberth: We'll hold of changes to the parser if needed and sync with them and pennae. (also parser PRs are rare)

  • @tomberek, @roberth: Let's make a sort of announcement issue.

    • Explain the intent
    • Make sure other Nix implementations (tvix, rnix) are informed.
    • We don't get too many parser PRs, but this could be hint for anyone thinking of making one to wait while such big changes are in flight.
  • Refactor the parser somewhat #9776 assigned to @tomberek & @Ericson2314 for joint review.

  • @pennae: notes more prep PRs are on the way; looking forward to seeing them!

  • @tomberek: are we OK dependencies wise?

    • Loose 2 gain 1 sounds like a good trade!
  • @roberth: How is compile time performance?

    • @pennae: Haven't tested, but feels fine
    • Splitting the very large file (from this prep PR) also helps a lot.
    • @roberth: sounds good. If it feels reasonable, that's good enough :).
  • @Ericson2314: Also excited for the test coverage improvements, independent of parser rewriting.

Goals/requirements

  • Do not allow users to write new syntax that's currently invalid
    • at least not without any warning
  • Unblock the new parser

Decision Options (decision is #2):

  1. Merge 9722 first
    • easiest
  2. Close 9722 (picked!!!)
    • next-easiest, but creates larger parser rules
  3. re-write 9722 to disallow arbitrary numbers of "/"s
    • most work and changes behavior

Decision: the new PEGTL parser is "Idea Approved"

@nixos-discourse
Copy link

This pull request has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/2024-01-22-nix-team-meeting-minutes-117/38838/1

Copy link
Member

@Ericson2314 Ericson2314 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried confirming things were just moved around with git diff --color-moved. Looks good!

@Ericson2314 Ericson2314 merged commit b83a2fb into NixOS:master Jan 27, 2024
8 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Archived in project
Development

Successfully merging this pull request may close these issues.

5 participants