Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add pants-plugins/schemas to streamline regenerating contrib/schemas #5847

Merged
merged 13 commits into from
Jan 10, 2023

Conversation

cognifloyd
Copy link
Member

@cognifloyd cognifloyd commented Dec 12, 2022

Background

This is another part of introducing pants, as discussed in various TSC meetings.

Related PRs can be found in:

Overview of this PR

This PR improves the DX (developer experience) by wiring up the fmt and lint goals to generate contrib/schemas/*.json if needed (or for lint, warn that it needs to be done).

This includes creating a schemas plugin for pants that adds a schemas target.

Developer Experience

Today, if someone changes one of the files used to generate the schemas in contrib/schemas, they will probably forget to run the Makefile target the regenerates them: make schemasgen.

I've missed running that several times. If I'm lucky, CI complains with an error telling me to run that. I've seen CI logs print an error without actually failing the build. I believe that particular Makefile bug is fixed, but that anecdote highlights a usability issue: friction around generated files.

With this PR, we add schema generation into the fmt goal. This also adds it to the lint goal which will tell people if fmt needs to run to update any code (or in this case, regenerate the schema). Then, we only need simple instructions that say something like:

Please run ./pants fmt lint :: before submitting your PR. This will reformat code, if needed, and warn you about any other errors.

Fine grained results cache

After someone runs ./pants fmt :: once, they will benefit from the pants' results cache. For the schemas plugin, here's what that means:

First, the schemas target in contrib/schemas depends on st2common/bin/st2-generate-schemas:

schemas(
dependencies=[
"st2common/bin/st2-generate-schemas",
],
)

If the st2-generate-schemas script changes, then pants now knows that the schemas need to be regenerated. Or, in other words, the st2-generate-schemas file is part of the cache key for contrib/schemas/*.json. If that file changes, then pants will re-run the generator.

st2-generate-schemas is a python script, and we use pants' python dependency inference. So pants (effectively) also adds the python files imported by st2-generate-schemas to the cache key. Stepping through the code, we can see what pants would infer:

from st2common.cmd import generate_schemas

from st2common.models.api import action as action_models
from st2common.models.api import pack as pack_models
from st2common.models.api import policy as policy_models
from st2common.models.api import rule as rule_models
from st2common.models.api import sensor as sensor_models

Here we reach the primary definition for the schemas that end up in contrib/schemas: the definition comes from the st2common.model.api.* modules. Now, if anyone changes any of these files, or any of the python bits that get imported to help define things in them, then the next time someone runs the lint goal (or in CI) they will find out that they need to run the fmt goal.

This PR actually only wires up the fmt goal and took advantage of the fact that pants also runs formatters whenever it runs linters, which are side-effect free. And because pants runs the formatters in a sandbox, it doesn't have to materialize any changes during lint.

Continuing the example, since pants cached the results when running the lint goal, running fmt will be very fast. Pants will just materialize the already-generated schemas from its cache.

Developing pants plugins

Pants has extensive documentation on the plugin API, including how to create targets, how to write rules, and there's even a mini tutorial about how to add a formatter (adding formatters is a very common thing for plugins to do).

Why not codegen?

Pants has something called codegen. But it is for code that gets generated without getting checked into the repo. You can export those changes, but they are always exported in a directory under dist/, so we would have to add some custom logic somewhere that copies from dist/ to the contrib/schemas directory.

These schemas are only available in this repo; if someone wants to use them, they have to clone the repo or grab the raw file from github. So, they have to be checked into the repo. We might be able to do something different in the future, but I'm trying not to change more than I have to to introduce pants.

@cognifloyd cognifloyd added this to the pants milestone Dec 12, 2022
@cognifloyd cognifloyd self-assigned this Dec 12, 2022
@pull-request-size pull-request-size bot added the size/L PR that changes 100-499 lines. Requires some effort to review. label Dec 12, 2022
],
output_filename=f"{CMD}.pex",
internal_only=True,
main=EntryPoint.parse("st2common.cmd.{CMD}:main"),
Copy link
Member Author

@cognifloyd cognifloyd Dec 12, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We are using st2common.cmd.generate_schemas:main directly instead of using the st2-generate-schemas script because pants can't run python with a - in the name (because pants relies on importing from the target script, and python cannot import from files with a - in the name).

Eventually I'll get around to plumbing a change in pants to make that possible (it already works with the underlying pex, but pants hasn't learned about that arg yet). In the meantime, we just use the entrypoint itself instead of the script.

@cognifloyd cognifloyd changed the title Add pants-plugins/schemas to improve simplify regenerating contrib/schemas Add pants-plugins/schemas to simplify regenerating contrib/schemas Dec 12, 2022
@pull-request-size pull-request-size bot added size/XL PR that changes 500-999 lines. Consider splitting work into several ones that easier to review. and removed size/L PR that changes 100-499 lines. Requires some effort to review. labels Dec 17, 2022
@cognifloyd cognifloyd force-pushed the pants-plugins-schemas branch 5 times, most recently from 1b0ce6f to 3a9764a Compare December 18, 2022 21:29
@cognifloyd cognifloyd force-pushed the pants-plugins-schemas branch 2 times, most recently from 3d90bcc to 3fc8418 Compare December 30, 2022 18:22
@pull-request-size pull-request-size bot added size/L PR that changes 100-499 lines. Requires some effort to review. and removed size/XL PR that changes 500-999 lines. Consider splitting work into several ones that easier to review. labels Jan 3, 2023
@cognifloyd cognifloyd marked this pull request as ready for review January 3, 2023 19:44
@cognifloyd cognifloyd changed the title Add pants-plugins/schemas to simplify regenerating contrib/schemas Add pants-plugins/schemas to streamline regenerating contrib/schemas Jan 3, 2023
@cognifloyd cognifloyd enabled auto-merge (squash) January 3, 2023 21:04
@cognifloyd cognifloyd requested a review from a team January 5, 2023 17:55
Copy link
Member

@rush-skills rush-skills left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@cognifloyd cognifloyd merged commit cd4ab0b into master Jan 10, 2023
@cognifloyd cognifloyd deleted the pants-plugins-schemas branch January 10, 2023 22:01
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
infrastructure: ci/cd pantsbuild size/L PR that changes 100-499 lines. Requires some effort to review.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants