Allow passing `x-opam-monorepo-*` solver config fields via the command line. #307

NathanReb · 2022-05-25T11:02:39Z

This PR adds command line arguments to either overwrite or complement x-opam-monorepo-* in local opam files.

Each existing field gets a --<field-name> option. When this option is passed, its value is used, ignoring local opam files fields.
It also gets a --add-<field-name> option to add extra on top of the local opam files fields. It is combined in the same way two fields from different opam files are combined.

It can be useful for experimenting or in certain scenarios where those configuration bits have no natural opam file to be written in.

Implementation

To be able to do this without writing too much repetitive code I introduced Serial_shape which allows to derive both opam and cmdliner parsers and printers from a single value. It simplifies the code of Source_opam_config quite a lot.

This required moving the URL rewriting a bit further away from the parsing code but overall it makes things a bit clearer and easier to test.

You'll note the PR comes with an extensive set of tests.

For now handles opam values and Cmdliner arg converters. The Cmdliner part is a bit harder to deal with due to the lack of structure of the cmdliner arguments which are just parsed from raw strings. We might have to later enforce a more structured format to work around the ambiguities in the cmdliner parsers. Signed-off-by: Nathan Rebours <[email protected]>

This is required for `opam-provided` convenience format Signed-off-by: Nathan Rebours <[email protected]>

Signed-off-by: Nathan Rebours <[email protected]>

Each opam extension now has a pair of corresponding CLI options that can be used to either complement or overwrite the values defined in your local opam files. Signed-off-by: Nathan Rebours <[email protected]>

Signed-off-by: Nathan Rebours <[email protected]>

This was not the case to allow lighter syntax but having a stricter syntax simplifies things quite a lot. Signed-off-by: Nathan Rebours <[email protected]>

NathanReb · 2022-05-25T11:03:24Z

I'm opening as draft as I need to write the documentation first, especially since the command line arguments format needs to be properly documented.

Signed-off-by: Nathan Rebours <[email protected]>

CHANGES.md

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

NathanReb · 2022-05-25T14:09:36Z

It should be good now!

Leonidas-from-XIV

My opinion on this is two-fold: I think technically it is a nice solution, with some nice abstractions to keep duplication to a minimum. There's some bits that I would hope to improve, e.g. I'm thinking on how to structure the list parser in a more readable way, but I believe this can be improved over time. The large amount of tests is great, especially that as they test the less readable parts of the code, where I have a much lower confidence of having a complete understanding of it (and my attempts to improve the code can be tested on examples directly). On the other hand, this creates a non-trivial amount of abstraction, introducing another kind of converter.

The reason why I am sceptical is because I am not sure this is a good feature. One of the nice things about opam-monorepo in my opinion is precisely the fact that its configuration is (for the most part) in the configuration file (and we plan to move towards this and away from inheriting the switch/global state). As such making changes to the configuration in the CLI seems to me to be the antithesis to that because users now have the monorepo setup not contained in their opam file but also need to remember to pass the same options as were used last time when the lock file was created to create a reproducible solution. So they either have to check their history, ask a colleague how they ran opam-monorepo or have the call to opam-monorepo in some kind of Makefile or documentation. This spreads the configuration over potentially multiple places, which will make it worse from a user perspective.

I think one of the big downsides of opam at the moment is specifically that it is overly configurable and a project has to document what complex set of flags is required to lead to the desired state of the project.

CHANGES.md

doc/lock.mld

test/lib/test_serial_shape.ml

test/lib/test_source_opam_config.ml

Co-authored-by: Marek Kubica <[email protected]>

Leonidas-from-XIV · 2022-05-30T09:44:23Z

I've been pondering how to make the list parsing easier to follow and make sure it works, I ended up with https://github.com/Leonidas-from-XIV/opam-monorepo/tree/config-via-cli-lexer (specifically d92ff9e at time of writing) which uses ocamllex to generate a list of tokens and then a fold to construct the split list. This two-stage approach has the advantage that the joining of characters is a bit more factored out (happening mostly in the lexer) while the verification of the validity in presence of [, ] and , is now in the parser.

After fixing some fairly obvious errors (forgetting to reverse, forgetting to add to buffer) it passed the extensive suite of tests which I would take as a good sign - both of the code and the test suite.

Signed-off-by: Nathan Rebours <[email protected]>

…ation Signed-off-by: Nathan Rebours <[email protected]>

Signed-off-by: Nathan Rebours <[email protected]>

NathanReb · 2022-06-07T08:07:01Z

I took a look at your alternative parser. I find the parser code to be quite similar although slightly harder to read in my opinion. It adds more code and that code spreads over two files.

I think it's mostly a matter of personal preference here.

I don't see any clear benefit switching to from one to the other so unless you have very strong opininons on this, I think we should stick to the original code.

I commented the split_list function and renamed a couple things hoping it would make it a bit clearer.

NathanReb · 2022-06-07T08:24:23Z

The reason why I am sceptical is because I am not sure this is a good feature. One of the nice things about opam-monorepo in my opinion is precisely the fact that its configuration is (for the most part) in the configuration file (and we plan to move towards this and away from inheriting the switch/global state). As such making changes to the configuration in the CLI seems to me to be the antithesis to that because users now have the monorepo setup not contained in their opam file but also need to remember to pass the same options as were used last time when the lock file was created to create a reproducible solution. So they either have to check their history, ask a colleague how they ran opam-monorepo or have the call to opam-monorepo in some kind of Makefile or documentation. This spreads the configuration over potentially multiple places, which will make it worse from a user perspective.

I understand your concerns but I think those CLI option serves multiple purposes, as I described. They are useful for experimentation and debugging and can be used in Makefiles in contexts where there is no obivous opam file where to encode them. Some people do use Makefiles and I think they will appreciate having this ability. Tezos is one of those users.

It's also worth noting that these values are included in the generated lockfile so changing them by mistake will show in review!

Since we're also moving toward offering an update functionality to the tool, I think it's nice that it will read them from the existing lockfile so that as long as you simply want to update things, you're sure to re-use the initial configuration.

Leonidas-from-XIV · 2022-06-07T08:54:00Z

It's also worth noting that these values are included in the generated lockfile so changing them by mistake will show in review!

The problem is, that the changed file will make it very hard to determine what CLI invocation produced it, so you'll know your new file is somehow wrong but aren't able to fix this state without asking whoever created it or your past self (or shell history).

While thinking about it, I had a random idea that could be a good compromise: we could write the invocation of opam-monorepo as a comment into the lockfile. From what I see this has several advantages:

It allows people who want to reproduce the lock file to reproduce it. At least provided they use the same version of opam-monorepo but we don't really provide cross-version reproducibility for plenty of good reasons.
It serves as documentation how the lockfile came to be, so newcomers to a project using opam-monorepo are aware of it and due to the invocation written there, they have a pointer to see how to use opam-monorepo

What do you think?

NathanReb · 2022-06-07T09:02:50Z

Sure, that sounds reasonable! Do you mind if we do this in a separate PR as it will likely dump the content of argv and is therefore not strictly related to this?

Leonidas-from-XIV · 2022-06-07T09:03:47Z

I find the parser code to be quite similar although slightly harder to read in my opinion.

I think the big advantage for me to read is the distinction because lexing and parsing (although calling what my split_lines does "parsing" is slightly ambitious), but being able to see the rules how the tokens are converted into a data structure without the distraction of managing the input buffer at the same time helps me to understand how this is constructed.

However, I totally agree that the separate file is quite distracting - I had hoped ocamllex would be more helpful in this situation, being able to parse longer sequences but alas, it is very similar to your implementation and also just reads a single character at a time. Do you think it would be more readable for you if I replaced the ocamllex tokenizer with a hand-written one? It might actually end up simpler and within one file.

NathanReb · 2022-06-07T09:44:55Z

Ok yeah I think I'm starting to see what you'd like to simplify there. If I understand correctly, what you like from the lexer based implementation is the tokens abstraction, which is free from String.sub invocations.

The problem I see with this, which to be fair is not that much of a problem, is that you have to chose between the complexity in sub string extraction or the complexity of recomposing those strings correctly from the tokens. What I like about the first is that we don't go through temporary allocations as much, which is why it seemed more straightforward to me.

What would you think about merging this with the current parser and then open a PR with the proposed improved implementation with the hand-rolled tokenizer to see if we can bring it to something better?

I think the current state of this parser is good enough for merging and I'd like it if improving it wouldn't delay the merge of this PR which is otherwise quite large.

Overall I think we could quite significantly improve the cmdliner parsing here but I wanted to go for something that was good enough and worked. Ideally I did not want to impose the use of [ and ] delimiters when not necessary but ended up being limited by the implementation so that's something we could also tackle when improving it!

Leonidas-from-XIV

Ok yeah I think I'm starting to see what you'd like to simplify there. If I understand correctly, what you like from the lexer based implementation is the tokens abstraction, which is free from String.sub invocations.

Exactly. While it goes over the stream once it the tokenizer (be it ocamllex or hand-written) has one responsibility: constructing the stream of relevant tokens out of the input stream. Then the parser uses this token list to construct the parsed data structure or fail, without having to deal with control of the input stream. As you can see in my branch the parser is quite trivial, it is essentially the same logic as your code but without the input stream logic intertwined.

What would you think about merging this with the current parser and then open a PR with the proposed improved implementation with the hand-rolled tokenizer to see if we can bring it to something better?

Yes, that's ok with me, so approving the PR. We deferred the other observation to (an much better) proposal in #308 and besides these two concerns I had the code is good.

@NathanReb

CHANGES: ### Added - Add a `--minimal-update` flag to `lock` to generate a lockfile with minimum dependency changes from a previous lockfile. (tarides/opam-monorepo#305, @NathanReb) - Add command line options to complement or overwrite `x-opam-monorepo-*` fields. (tarides/opam-monorepo#307, @NathanReb) - Save the `lock` CLI arguments in `x-opam-monorepo-cli-args` when generating a lock file. (tarides/opam-monorepo#309, @NathanReb)

NathanReb added 7 commits May 23, 2022 16:31

Add choice2 to Serial_shape

b924854

This is required for `opam-provided` convenience format Signed-off-by: Nathan Rebours <[email protected]>

Report conv errors with opam file locations in Serial_shape

673bebd

Signed-off-by: Nathan Rebours <[email protected]>

Use Serial_shape to parse source config from opam extensions

b843b25

Signed-off-by: Nathan Rebours <[email protected]>

Derive cmdliner arguments for the source config

6dbf507

Each opam extension now has a pair of corresponding CLI options that can be used to either complement or overwrite the values defined in your local opam files. Signed-off-by: Nathan Rebours <[email protected]>

Extract cram tests and fix a bug with CLI configs not being rewritten

8266704

Signed-off-by: Nathan Rebours <[email protected]>

Enforce serial_shape derived cmdliner parsers to delimit lists

10b2a49

This was not the case to allow lighter syntax but having a stricter syntax simplifies things quite a lot. Signed-off-by: Nathan Rebours <[email protected]>

NathanReb requested review from Leonidas-from-XIV and tmattio May 25, 2022 11:02

NathanReb added 2 commits May 25, 2022 14:36

Add changelog entry for config options

57ec653

Signed-off-by: Nathan Rebours <[email protected]>

Document source config CLI and fields

a95c830

Signed-off-by: Nathan Rebours <[email protected]>

github-actions bot reviewed May 25, 2022

View reviewed changes

CHANGES.md Outdated Show resolved Hide resolved

Update CHANGES.md

28f7806

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

NathanReb marked this pull request as ready for review May 25, 2022 14:09

Leonidas-from-XIV reviewed May 27, 2022

View reviewed changes

NathanReb and others added 2 commits May 30, 2022 10:48

Update doc/lock.mld

6f1db5b

Co-authored-by: Marek Kubica <[email protected]>

Update test/lib/test_serial_shape.ml

ab52067

Co-authored-by: Marek Kubica <[email protected]>

NathanReb added 10 commits June 1, 2022 16:44

Improve changelog entry for solver config options

ae240c5

Signed-off-by: Nathan Rebours <[email protected]>

Fix example invocation of --add-* options in solver config document…

eab7959

…ation Signed-off-by: Nathan Rebours <[email protected]>

Simplify let* binding in Serial_shape.from_opam_val

a98ae7e

Signed-off-by: Nathan Rebours <[email protected]>

Wrap overwrite and add configs together

b27403d

Signed-off-by: Nathan Rebours <[email protected]>

Improve doc for config adjustment CLI options

db23a6b

Signed-off-by: Nathan Rebours <[email protected]>

Clarify URL rewriting code

3cada3c

Signed-off-by: Nathan Rebours <[email protected]>

Remove unnecessary parens in Test_source_opam_config

4b4f2d0

Signed-off-by: Nathan Rebours <[email protected]>

Use existing opam package name set pretty printer for testable

f46d17f

Signed-off-by: Nathan Rebours <[email protected]>

Use OpamUrl.Set.map instead of using a Seq

0051e0b

Signed-off-by: Nathan Rebours <[email protected]>

Attempt to clarify Serial_shape.split_list

b0acb7f

Signed-off-by: Nathan Rebours <[email protected]>

NathanReb requested a review from Leonidas-from-XIV June 7, 2022 08:07

Leonidas-from-XIV mentioned this pull request Jun 7, 2022

Write opam-monorepo invocation into lock file #308

Closed

Leonidas-from-XIV approved these changes Jun 7, 2022

View reviewed changes

NathanReb merged commit a76c324 into tarides:main Jun 7, 2022

Leonidas-from-XIV mentioned this pull request Jun 8, 2022

Use a tokenization approach when parsing the list #311

Merged

NathanReb mentioned this pull request Jun 9, 2022

[new release] opam-monorepo (0.3.2) ocaml/opam-repository#21498

Merged

NathanReb deleted the config-via-cli branch June 10, 2022 15:08

NathanReb mentioned this pull request Jun 13, 2022

Add CLI options as an alternative to opam extensions to configure lock #278

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Allow passing `x-opam-monorepo-*` solver config fields via the command line. #307

Allow passing `x-opam-monorepo-*` solver config fields via the command line. #307

NathanReb commented May 25, 2022

NathanReb commented May 25, 2022

NathanReb commented May 25, 2022

Leonidas-from-XIV left a comment

Leonidas-from-XIV commented May 30, 2022

NathanReb commented Jun 7, 2022

NathanReb commented Jun 7, 2022

Leonidas-from-XIV commented Jun 7, 2022

NathanReb commented Jun 7, 2022

Leonidas-from-XIV commented Jun 7, 2022

NathanReb commented Jun 7, 2022

Leonidas-from-XIV left a comment

Allow passing x-opam-monorepo-* solver config fields via the command line. #307

Allow passing x-opam-monorepo-* solver config fields via the command line. #307

Conversation

NathanReb commented May 25, 2022

Implementation

NathanReb commented May 25, 2022

NathanReb commented May 25, 2022

Leonidas-from-XIV left a comment

Choose a reason for hiding this comment

Leonidas-from-XIV commented May 30, 2022

NathanReb commented Jun 7, 2022

NathanReb commented Jun 7, 2022

Leonidas-from-XIV commented Jun 7, 2022

NathanReb commented Jun 7, 2022

Leonidas-from-XIV commented Jun 7, 2022

NathanReb commented Jun 7, 2022

Leonidas-from-XIV left a comment

Choose a reason for hiding this comment

Allow passing `x-opam-monorepo-*` solver config fields via the command line. #307

Allow passing `x-opam-monorepo-*` solver config fields via the command line. #307