[rush] Design proposal: "phased" custom commands #2300
Comments
Addendum: The above design inspires a number of additional features that we also want to implement, but which do NOT need to be part of the "MVP" (minimum viable product) initial implementation.
Replacing "rush rebuild" with a generalized syntax
In our meeting, it was pointed out that two closely related features often get confused:
Today there is no
CLI syntax for skipping phases
We could provide a standard way to define a custom parameter Because of phase dependencies, not EVERY phase should be automatically skippable. Rather, the person who is designing custom-commands.json would define specific CLI switches that make sense for users to invoke. This fits with Rush's goal of providing a CLI that is intuitive for newcomers, tailored around meaningful everyday operations, well documented, and validated. Our CLI is not trying to be an elaborate compositional algebra that allows users to creatively string random tasks together.
Support for rigs
The above package.json files have a lot of boilerplate that would need to be copy+pasted into each project:
"scripts": {
"build": "heft build",
"test": "heft test",
"docs": "doc-tool",
"my-bulk-command": "./scripts/my-command.js",
"clean": "heft clean",
"_phase:compile": "heft build --lite",
"_phase:lint": "heft lint",
"_phase:test": "heft test --no-build",
"_phase:docs": "doc-tool"
}, The |
Makes sense to me. For what it's worth, two things immediately come to mind:
|
@D4N14L But wouldn't it be confusing for the command to be called As an analogy, when an ESLint configuration refers to a plugin package, they allow you to write |
There's an interesting tradeoff between making sure developers didn't forget to specify a command, versus requiring needless boilerplate in many places (which motivated ignoreMissingScript in command-line.json.) If we implement support for rigs, it might change this tradeoff. |
A couple of observations a) The _phase:* scripts feel really out of place in the package.json. "scripts": {
"build": "heft build",
"test": "heft test",
"docs": "doc-tool",
"my-bulk-command": "./scripts/my-command.js",
"clean": "heft clean",
"_phase:compile": "heft build --lite",
"_phase:lint": "heft lint",
"_phase:test": "heft test --no-build",
"_phase:docs": "doc-tool"
}, Consider declaring the phase commands in a per project (riggable) rush file instead - where you can have clear, rich, declarative semantics. Consider having b) You seem to be slowly and adhoc working towards a constrained makefile. One that rush can reason over and orchestrate. What the world could look like by implementing the two observation above... One would introduce a new {
"commands": {
"clean": {
"script": "heft clean",
"phase": "clean"
},
"compile": {
"script": "heft build",
"phase": "compile",
"dependencies": "compile_dependent_projects"
},
"lint": {
"script": "heft lint",
"phase": "???",
"dependencies": "compile_this_project"
},
"test": {
"script": "heft test",
"phase": "test",
"dependencies": "compile_this_project"
},
"doc": {
"script": "doc-tool",
"phase": "???",
"dependencies": "compile_this_project"
},
"build": {
"script": "doc-tool",
"dependencies": "clean_compile_lint_test_doc_this_project"
}
}
}
"scripts": {
"build": "rushx lint",
"buildall": "rushx build",
"test": "rushx test",
"docs": "rushx doc",
"clean": "rushx clean"
} Together both changes map far more closely the behavior I've come to expect from something like hitting F5 in visual studio Or
"scripts": {
"build": "heft build",
"test": "heft test",
"docs": "doc-tool",
"my-bulk-command": "./scripts/my-command.js",
"clean": "heft clean",
} In the absence of a rush config, P.S It's not clear to me that phases are needed for what is described above. |
Update: I missed where that bit of code for ordering was defined, since I was only looking in TaskRunner. Only change would be if we want to add a second sort criteria after the first. |
This is a good idea. Could you open a separate GitHub issue proposing a specific improvement? The same task scheduler will likely be used for both phased and non-phased operations, so this improvement probably isn't specific to phased commands. FYI today Rush already does some basic optimization: rushstack/apps/rush-lib/src/logic/taskRunner/Task.ts Lines 43 to 72 in cb83477
...but it could be improved. |
We'll want to investigate supporting |
Really excited by this feature and am looking for any ways I could help chip in. A couple thoughts:
Opt-in for projects
While enabling the build cache feature recently in our monorepo, something that made life easier was that you could turn it on and configure it without touching all the projects. Since most of our projects had no I'd like it if this feature worked the same way. For example -- you turn on phased commands, you define your phases, but projects that haven't defined their phases yet continue to operate as a single block. (To other projects with phases enabled, this affects nothing but the timing -- it just means that whether Project A is depending on You'd need to decide what "opting in" means here... Maybe it's "has defined at least one _phase task in package.json". (Our monorepo is a good use case here... we have a big majority of projects that are classic
Ease understanding of package.json
If What if instead of
{
"scripts": {
"build": "heft build",
"test": "heft test",
"docs": "doc-tool",
"my-bulk-command": "./scripts/my-command.js",
"_build:compile": "heft build --lite",
"_build:lint": "heft lint",
"_build:docs": "doc-tool",
"_test:compile": "heft build --lite",
"_test:lint": "heft lint",
"_test:test": "heft test --no-build",
"_test:docs": "doc-tool",
}
} Now without even knowing about phases (much less where to find the configuration for them), the developer will immediately intuit that in some context, there's a series of steps for building and a series of steps for testing. Note that in |
Expanding on @octogonz comment around "CLI syntax for skipping phases". We've recently found ourselves requiring the ability to selectively skip the Webpack Bundle Phase as the bundled output isn't required for us to run tests. I've added a suggestion for enhancement to how Heft manages this currently #2778 |
In this thread @josh-biddick asked for This suggests that the phase dependencies may sometimes depend on the project (or rig profile). |
I would argue that if unit tests have a custom webpack bundle, said bundle should be generated as part of the test phase rather than the bundle phase. |
I was referring to a multi-project setup, like this:
Thus a Reading the original thread more closely, it sounds like @josh-biddick's case was more like he wants to run tests for |
We did a "Rush Hour breakout session" today to discuss the status of PR #2299, as well as its interaction with the Rush build cache feature. The build cache brings some new requirements that weren't considered in the original design for #2300. The consensus was to handle it as follows:
This discussion also highlighted some minor related fixes for the build cache:
The next steps are as follows:
|
I forgot to mention during the meeting, but I think there's a worthy (7):
For the user output, we could continue to collate at the project level, perhaps with a "sub" Another option is to adjust the collation down to the phase level, but this is probably only interesting to people really into working on rush and not to people just building their projects. Last, I would love to see some machine output at the end of a build -- maybe a JSON file -- that lists every phase that was run, the number of seconds each took, and the number of seconds into the build it was started. This would be enough data to reconstruct the timeline of a build in an external tool or program, to compare different changes to the task scheduler down the road, etc. (The format of this data, some day, could also be used as an input to the task scheduler... I imagine that being able to weight each node in the DAG with estimated run times could make for a much better prioritization than just number of child nodes.) |
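One way to picture the timing file and the weighting idea from the comment above is the following sketch. The record shape, the `project:phase` naming, and the `criticalPathSeconds` helper are all assumptions made for illustration; Rush does not define any of them in this thread. The helper weights each operation by its own duration plus the longest chain of consumers that follows it, which is one possible alternative to counting child nodes.

```typescript
// Hypothetical record shape for the suggested end-of-build timing file.
interface OperationTiming {
  name: string; // assumed naming, e.g. "A:compile"
  startSeconds: number; // seconds into the build when the phase started
  durationSeconds: number; // seconds the phase took
}

// For each operation: its own duration plus the longest chain of consumers after it.
// Assumes the consumer graph is acyclic.
function criticalPathSeconds(
  timings: OperationTiming[],
  consumers: Map<string, string[]>
): Map<string, number> {
  const duration = new Map(
    timings.map((t): [string, number] => [t.name, t.durationSeconds])
  );
  const result = new Map<string, number>();
  const visit = (name: string): number => {
    const known = result.get(name);
    if (known !== undefined) return known;
    let longestTail = 0;
    for (const consumer of consumers.get(name) ?? []) {
      longestTail = Math.max(longestTail, visit(consumer));
    }
    const total = (duration.get(name) ?? 0) + longestTail;
    result.set(name, total);
    return total;
  };
  for (const t of timings) visit(t.name);
  return result;
}

// Made-up sample data standing in for a parsed timing file.
const sampleTimings: OperationTiming[] = [
  { name: "A:compile", startSeconds: 0, durationSeconds: 4 },
  { name: "A:lint", startSeconds: 0, durationSeconds: 2 },
  { name: "A:test", startSeconds: 4, durationSeconds: 5 },
  { name: "B:compile", startSeconds: 4, durationSeconds: 4 },
  { name: "B:test", startSeconds: 8, durationSeconds: 5 },
];
const sampleConsumers = new Map<string, string[]>([
  ["A:compile", ["A:test", "B:compile"]],
  ["B:compile", ["B:test"]],
]);

const weights = criticalPathSeconds(sampleTimings, sampleConsumers);
// Operations with the heaviest remaining chain would be scheduled first.
const priorityOrder = Array.from(weights.entries())
  .sort((a, b) => b[1] - a[1])
  .map((entry) => entry[0]);
console.log(priorityOrder);
```

In this sample, `A:compile` gets the highest weight because everything in project B waits on it, even though `A:test` takes longer on its own.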
Design update: In PR #3103, we've moved the build logs to a new folder:
For a phased command, multiple files will be created, for example:
This redesign applies to all commands (whether phased or not). For now, the new folder location is only used if the |
@octogonz what are the reasons to put logs into a new |
@deflock - I thought about dropping logs in the |
@octogonz, @elliot-nelson - point of clarification. For For example, in @elliot-nelson's example repo, there's a |
Proposing a solution for the flag question here: #3124 |
@octogonz , @iclanton , @elliot-nelson , I'm working on updating the internals of the task scheduler right now and ran across a bit of a gap in the design with respect to the action of parameters that add or remove phases, particularly in the face of the statement in the comments that phases for commands are implicitly expanded (ala the "--to" CLI operator): My current thoughts are something like:
|
I think at Rush Hour the consensus was to postpone this feature until after initial release, since the design discussion will be easier after people have some firsthand experience using the feature. |
Requirements
Regarding the design questions in #3144 (comment), @elliot-nelson and I had a big discussion about this today. We focused on the following requirements:
Problems with current design
The config file currently looks like this:
<your project>/config/rush-project.json (riggable)
{
"buildCacheOptions": {
// Selectively disables the build cache for this project. The project will
// never be restored from cache. This is a useful workaround if that project's
// build scripts violate the assumptions of the cache, for example by writing
// files outside the project folder. Where possible, a better solution is
// to improve the build scripts to be compatible with caching.
"disableBuildCache": true,
// Allows for fine-grained control of cache for individual Rush commands.
"optionsForCommands": [
{
// The Rush command name, as defined in custom-commands.json
"name": "your-command-name",
// Selectively disables the build cache for this command. The project will never
// be restored from cache. This is a useful workaround if that project's
// build scripts violate the assumptions of the cache, for example by
// writing files outside the project folder. Where possible, a better solution
// is to improve the build scripts to be compatible with caching.
"disableBuildCache": true
}
]
},
// Specify the folders where your toolchain writes its output files.
// If enabled, the Rush build cache will restore these folders from the cache.
// The strings are folder names under the project root folder. These folders
// should not be tracked by Git. They must not contain symlinks.
"projectOutputFolderNames": [ "lib", "dist" ],
// The incremental analyzer can skip Rush commands for projects whose
// input files have not changed since the last build. Normally, every Git-tracked
// file under the project folder is assumed to be an input.
// Set incrementalBuildIgnoredGlobs to ignore specific files, specified as globs
// relative to the project folder. The list of file globs will be interpreted the
// same way your .gitignore file is.
"incrementalBuildIgnoredGlobs": [],
// Options for individual phases.
"phaseOptions": [
{
// The name of the phase. This is the name that appears in command-line.json.
"phaseName": "_phase:build",
// Specify the folders where this phase writes its output files. If enabled,
// the Rush build cache will restore these folders from the cache. The strings
// are folder names under the project root folder. These folders should not be
// tracked by Git. They must not contain symlinks.
"projectOutputFolderNames": [ "lib", "dist" ]
}
]
}
Issues encountered with this design:
Proposed new design
Explanation is in the file comments:
<your project>/config/rush-project.json (riggable)
{
"incrementalBuildIgnoredGlobs": [],
// Let's eliminate the "buildCacheOptions" section, since now we seem to
// only need one setting in that category
"disableBuildCacheForProject": true,
// INSIGHT: We have a bunch of settings that apply to the shell scripts
// defined in package.json, irrespective of whether they are used by phases
// or classic commands. So instead of inventing an abstract jargon that
// means "phase or command", let's just call them "projectScripts".
"projectScripts": [
{
// This is the key from the package.json "scripts" section.
// To support rigs, it is OKAY to provide configuration for scripts that
// do not actually exist in package.json or are not actually mapped to
// a Rush command.
"scriptName": "_phase:build",
// These are the folders to be cached. Their cache keys must not overlap,
// HOWEVER that validation can safely ignore: (1) scripts that aren't mapped
// to a Rush command in a given repo, (2) scripts that have opted out of
// caching, e.g. via disableBuildCacheForProject or disableBuildCacheForScript
"outputFolderNames": ["lib", "dist"],
// Allows you to selectively disable the build cache for just one script
"disableBuildCacheForScript": true,
// FUTURE FEATURE: If your shell command doesn't support a custom parameter
// such as "--lite" or "--production", you can filter it here. This avoids
// having to replace "rm -Rf lib && tsc" with "node build.js" simply to
// discard a parameter.
"unsupportedParameters": [ "--lite" ],
// FUTURE FEATURE: We could optionally allow rigs to define shell scripts in
// rush-project.json, eliminating the need for projects to copy+paste
// these commands into package.json.
"shellScript": "heft build --clean"
},
{
// Addressing the question from PR #3144, the Rush Stack rig could provide
// both "build" and "_phase:build" definitions, allowing the rig to be used
// in any monorepo regardless of whether they are using phased commands.
"scriptName": "build",
"outputFolderNames": ["lib", "dist"]
}
]
} This redesign would be a breaking change, but both the build cache and phased commands features are still "experimental." |
Love the new design, although a little worried about the transition of the fields in rush-project.json when people install a new Rush version. (maybe we could include a customized error or honor the projectBuildOutputFolders etc as a stopgap) |
A helpful error message seems sufficient. As long as you know how to update rush-project.json, the actual work of updating it seems pretty easy, and can be procrastinated until you are ready to upgrade Rush. |
@octogonz , @elliot-nelson , I've been thinking over this a lot in the process of writing #3043, fiddling with the scheduler and the generation of the Rush task graph, as well as how it interacts with Heft, etc., and my current thoughts for North star design have changed a lot. I very much want rush-project.json {
// The set of build tasks (essentially subprojects) that exist for this project.
// "npm run build" becomes an alias for "run these in the correct order"
"tasks": {
// Convert SCSS to CSS and .scss.d.ts
"sass": {
// Filter to apply to source files when determining if the inputs have changed.
// Defaults to including all files in the project if not specified
"inputGlobs": ["src/**/*.scss"],
// What files this task produces. If specified, will be used for cleaning and writing the build cache
// Ideally these are just folder paths for ease of cleaning
"outputGlobs": ["lib/**/*.css", "temp/sass-ts"],
// What local (or external) tasks this task depends on.
// If not specified, has no dependencies.
// Not quite sure on structure here yet, but want to be able to specify in any combination:
// - Depends on one or more tasks in specific dependency projects (which must be listed in package.json)
// - Depends on one or more tasks in this project
// - Depends on one or more tasks in all dependencies in package.json
"dependencies": {
"sass-export": ["#upstream"]
},
// What Rush task runner to invoke to perform this task. Options:
// - "heft": Rush will send an IPC message to a Heft worker process to invoke the task with this name for this project.
// Said process will load a task-specific Heft configuration and perform a build with the relevant plugin(s).
// - "shell": Rush will start a subprocess using the specified command-line
"taskRunner": "heft"
},
// Copy .scss include files to dist/sass
"sass-export": {
"inputGlobs": ["src/sass-includes/**/*.scss"],
"outputGlobs": ["dist/sass"],
"taskRunner": "heft"
},
// Emit JS files from source
"compile": {
"inputGlobs": ["src/**/*.tsx?"],
"outputGlobs": ["lib/**/*.js", "lib/**/*.js.map"],
"dependencies": {
// This task will invoke this plugin, so make sure it has been built
"bundle": ["@rushstack/heft-transpile-only-plugin"]
},
"taskRunner": "heft"
},
// Type-Check, Lint
"analyze": {
"inputGlobs": ["src/**/*.tsx?"],
// Specified as having no outputs, meaning cacheable but doesn't write anything
"outputGlobs": [],
"dependencies": {
// Depends on .d.ts files in npm dependencies
"analyze": ["#upstream"],
// Depends on .scss.d.ts files to type-check
"sass": ["#self"]
},
"taskRunner": "heft"
},
// Webpack
"bundle": {
// Omit inputGlobs to assume everything is relevant
"outputGlobs": ["release"],
"dependencies": {
// Need the JS files from this project and all of its dependencies
"compile": ["#self", "#upstream"]
},
"taskRunner": "heft"
}
}
} Essentially my main premise is that the concept of a "Build Task" should be the entity Rush is primarily concerned with (since that's what the scheduler operates on). The current hybrid of phases + projects results in a lot of unnecessary edges in the dependency graph and limits the ability for projects to take advantage of more or fewer build tasks as appropriate to the project. |
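To make the `"dependencies"` idea in the config above more concrete, here is a rough sketch of how the `#self` and `#upstream` selectors could be expanded into edges of a task graph. The comment itself says the structure is not settled, so everything here is an assumption: the `resolveTaskDependencies` helper, the `project:task` string form, and the choice to skip upstream projects that do not define the named task are illustrative only.

```typescript
// Hypothetical shapes mirroring the "tasks" section sketched above.
interface TaskConfig {
  // Key: task name. Value: "#self", "#upstream", or an explicit project name.
  dependencies?: Record<string, string[]>;
}
type ProjectTasks = Record<string, TaskConfig>;

// Expands one task's dependency selectors into "project:task" identifiers.
function resolveTaskDependencies(
  projectName: string,
  taskName: string,
  tasksByProject: Map<string, ProjectTasks>,
  upstreamByProject: Map<string, string[]>
): string[] {
  const task = tasksByProject.get(projectName)?.[taskName];
  const resolved: string[] = [];
  if (!task || !task.dependencies) return resolved;
  for (const dependencyTask of Object.keys(task.dependencies)) {
    for (const selector of task.dependencies[dependencyTask]) {
      if (selector === "#self") {
        resolved.push(`${projectName}:${dependencyTask}`);
      } else if (selector === "#upstream") {
        for (const upstream of upstreamByProject.get(projectName) ?? []) {
          // Assumption: upstream projects that do not define the task are skipped.
          if (tasksByProject.get(upstream)?.[dependencyTask]) {
            resolved.push(`${upstream}:${dependencyTask}`);
          }
        }
      } else {
        // Anything else is treated as an explicit project name.
        resolved.push(`${selector}:${dependencyTask}`);
      }
    }
  }
  return resolved;
}

// Made-up workspace: "app" depends on "lib-a" and "lib-b".
const tasksByProject = new Map<string, ProjectTasks>([
  [
    "app",
    {
      sass: {},
      analyze: { dependencies: { analyze: ["#upstream"], sass: ["#self"] } },
      compile: { dependencies: { bundle: ["@rushstack/heft-transpile-only-plugin"] } },
    },
  ],
  ["lib-a", { analyze: {} }],
  ["lib-b", {}],
]);
const upstreamByProject = new Map<string, string[]>([["app", ["lib-a", "lib-b"]]]);

const analyzeEdges = resolveTaskDependencies("app", "analyze", tasksByProject, upstreamByProject);
const compileEdges = resolveTaskDependencies("app", "compile", tasksByProject, upstreamByProject);
console.log(analyzeEdges, compileEdges);
```

Because `lib-b` defines no `analyze` task, it contributes no edge, which is the kind of pruning that avoids the unnecessary dependency-graph edges mentioned in the comment.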
Interesting - and very configurable! @dmichon-msft In your comment you say that "npm run build" would be an alias for running these tasks in order, but where is that controlled? (that is, what tells "npm run build" -- or "rush build" for that matter -- which of these phases are relevant to the build command, and which might be relevant to a totally different custom command?) |
Hmm... As things stand today,
It's annoying to have two approaches to the same problem, but they are optimized for different needs. Your multi-project watch improvements are aligning the two worlds more closely, at least for watching. However it seems unclear whether really great watching will have the same topology as really great phased/cached building. Maybe for now the IPC A couple specific points of feedback:
|
The Regarding string keys, are you suggesting that the reason various rushstack schemas use a semantically incorrect data type (an array) to represent a dictionary (we literally convert it into one immediately after parsing and have to compensate for the language not enforcing uniqueness of keys, which it would if it were written as a dictionary to begin with) is to work around a text editor autocomplete bug? If so, The main issue I've been encountering is how I specify that a project phase depends on some tooling projects, but doesn't require any of the other npm dependencies to have been built. This is a situation that will occur as soon as we start trying to leverage isolatedModules, since the only relevant dependencies will be on Heft and the TypeScript plugin, not on any of the imported product code. The best I can do in the current system is to inject a bogus extra "tooling" phase at the beginning of the compilation that all other phases depend on, and have that be the only phase defined in the tooling project. |
@dmichon-msft and I had a long discussion today, and he persuaded me that watch mode can be defined around phases. So we're amending the proposal to introduce an abstraction term after all: An
We avoided words like "task", "action", "stage", etc because those words already have special meanings for Heft.
Proposed new design
Explanation is in the file comments:
<your project>/config/rush-project.json (riggable)
{
"incrementalBuildIgnoredGlobs": [],
// Let's eliminate the "buildCacheOptions" section, since now we seem to
// only need one setting in that category
"disableBuildCacheForProject": true,
// Note: these "settings" have no effect unless command-line.json defines the operation
"operationSettings": [ // 👈👈👈 revised
{
// This is the key from the package.json "scripts" section.
// To support rigs, it is OKAY to provide configuration for scripts that
// do not actually exist in package.json or are not actually mapped to
// a Rush command.
"operationName": "_phase:build", // 👈👈👈 revised
// These are the folders to be cached. Their cache keys must not overlap,
// HOWEVER that validation can safely ignore: (1) scripts that aren't mapped
// to a Rush command in a given repo, (2) scripts that have opted out of
// caching, e.g. via disableBuildCacheForProject or disableBuildCacheForOperation
"outputFolderNames": ["lib", "dist"],
// Allows you to selectively disable the build cache for just one script
"disableBuildCacheForOperation": true, // 👈👈👈 revised
// FUTURE FEATURE: If your shell command doesn't support a custom parameter
// such as "--lite" or "--production", you can filter it here. This avoids
// having to replace "rm -Rf lib && tsc" with "node build.js" simply to
// discard a parameter.
// (We'll have a separate design discussion for this idea.)
"unsupportedParameters": [ "--lite" ],
// FUTURE FEATURE: We could optionally allow rigs to define shell scripts in
// rush-project.json, eliminating the need for projects to copy+paste
// these commands into package.json.
// (We'll have a separate design discussion for this idea.)
"shellScript": "heft build --clean"
},
{
// Addressing the question from PR #3144, the Rush Stack rig could provide
// both "build" and "_phase:build" definitions, allowing the rig to be used
// in any monorepo regardless of whether they are using phased commands.
"operationName": "build",
"outputFolderNames": ["lib", "dist"]
}
]
} This redesign would be a breaking change, but both the build cache and phased commands features are still "experimental." |
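The file comments above say that cached output folders must not overlap, except for operations that are not mapped to a Rush command or that have opted out of caching. A minimal sketch of that validation might look like the following. It assumes "overlap" simply means two operations listing the same folder name, and the `findOutputFolderConflicts` helper and its `mappedOperationNames` argument are hypothetical names, not part of Rush.

```typescript
// Field names follow the proposed rush-project.json; the types are a sketch.
interface OperationSettings {
  operationName: string;
  outputFolderNames?: string[];
  disableBuildCacheForOperation?: boolean;
}
interface RushProjectConfig {
  disableBuildCacheForProject?: boolean;
  operationSettings?: OperationSettings[];
}

// Returns one message per folder claimed by two cache-enabled, mapped operations.
function findOutputFolderConflicts(
  config: RushProjectConfig,
  mappedOperationNames: Set<string>
): string[] {
  // Nothing is cached for this project, so nothing can collide.
  if (config.disableBuildCacheForProject) return [];
  const ownerByFolder = new Map<string, string>();
  const conflicts: string[] = [];
  for (const settings of config.operationSettings ?? []) {
    // (1) Ignore operations that no Rush command in this repo refers to.
    if (!mappedOperationNames.has(settings.operationName)) continue;
    // (2) Ignore operations that opted out of caching.
    if (settings.disableBuildCacheForOperation) continue;
    for (const folder of settings.outputFolderNames ?? []) {
      const owner = ownerByFolder.get(folder);
      if (owner !== undefined) {
        conflicts.push(
          `"${folder}" is written by both "${owner}" and "${settings.operationName}"`
        );
      } else {
        ownerByFolder.set(folder, settings.operationName);
      }
    }
  }
  return conflicts;
}

// A rig that provides both "build" and "_phase:build", as discussed above.
const rigConfig: RushProjectConfig = {
  operationSettings: [
    { operationName: "_phase:build", outputFolderNames: ["lib", "dist"] },
    { operationName: "build", outputFolderNames: ["lib", "dist"] },
  ],
};

// A repo without phased commands only maps "build", so there is no conflict.
const classicRepo = findOutputFolderConflicts(rigConfig, new Set(["build"]));
// A repo that somehow mapped both would be flagged.
const bothMapped = findOutputFolderConflicts(rigConfig, new Set(["build", "_phase:build"]));
console.log(classicRepo, bothMapped);
```

This is why a rig can ship settings for both names: in any one monorepo, only the operations that command-line.json actually defines take part in the check.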
I'm kind of curious about the difference between bulkScript and phaseCommand. I really like the idea of avoiding xxx words since they have special meanings. So the question is: could we avoid exposing the phase concept to Rush users too? The idea is that each phase is internally just a part of a script. For example, the build script is defined as heft build, so the phases might be lint, build, test… which Heft already knows about. And if I define a script named lint as 'eslint --fix', I can also define the lint script itself as a phase command. My idea is to avoid the phase concept entirely and treat a 'phase' as a script in Rush and as a stage (task? or some other word with a special meaning) in Heft. That is, phased commands would be a feature of bulkScript, or even of other xxxScript kinds. If we need to make a script that uses Heft phase-able, that can be implemented on the Heft side.
Things seem to be easier that way. Correct me if my understanding is wrong. |
If I remember right, the main idea was that "commands" are something a user invokes from the CLI, whereas "phases" are implementation details of a command. Technically a bulk command is equivalent to a phased command with only one phase. We considered deprecating bulk commands entirely, however because phases are more complicated to set up, it seemed better to provide a simple "bulk command" to make it easier for developers.
We cannot assume that developers are using Heft. This is why a phase has:
Heft can be optimized with additional behaviors (for example the proposed IPC protocol that enables a single Heft process to be reused for multiple operations, and special interactions for watch mode). But these will always be modeled as contracts that other toolchains could easily implement. We recommend Heft, but developers should not be forced to use Heft in order to take advantage of these features. |
Thanks for your reply, Pete.
It makes sense that this is a new concept in Rush.js. |
Closing this issue out, as the feature has been out for several years at this point. |
This is a proposal for "phased" Rush commands. It's based on two design meetings from March 2020 and later in October 2020.
Goal 1: Pipelining
Today `rush build` builds multiple projects essentially by running `npm run build` separately in each project folder. Where possible (according to the dependency graph), projects can build in parallel. The operation performed on a given project is atomic from Rush's perspective.
For example, suppose `npm run build` invokes the following tasks:
If project `B` depends on project `A`, it's not necessary for `B` to wait for all of `npm run build` to complete. For example:
- `B` can start compiling as soon as `A` finishes compiling
- `B` can start linting as soon as `A` finishes compiling
- `B` and `A` can start testing as soon as their own compiling completes
- `B` can start docs as soon as `B` finishes compiling AND `A` finishes docs
This sort of pipelining will speed up the build, particularly in bottlenecks of the dependency graph, such as a "core library" that many other projects depend upon.
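The effect of the four rules above can be illustrated with a small sketch that compares earliest finish times for atomic and pipelined builds. The phase durations are made-up numbers, unlimited parallelism is assumed, and the names (`rules`, `phasedFinish`, `atomicFinish`) are hypothetical; this is not how Rush's scheduler is implemented.

```typescript
// Hypothetical sketch: none of these names come from Rush itself.
type Phase = "compile" | "lint" | "test" | "docs";

interface PhaseRule {
  self: Phase[]; // phases of the same project that must finish first
  upstream: Phase[]; // phases of every dependency project that must finish first
}

// Encodes the four bullets above.
const rules: Record<Phase, PhaseRule> = {
  compile: { self: [], upstream: ["compile"] },
  lint: { self: [], upstream: ["compile"] },
  test: { self: ["compile"], upstream: [] },
  docs: { self: ["compile"], upstream: ["docs"] },
};

// Assumed durations in seconds, chosen only for illustration.
const durations: Record<Phase, number> = { compile: 4, lint: 2, test: 5, docs: 3 };
const allPhases: Phase[] = ["compile", "lint", "test", "docs"];

// Project B depends on project A.
const projectDeps: Record<string, string[]> = { A: [], B: ["A"] };

// Earliest finish time of one phase, assuming unlimited parallelism.
function phasedFinish(
  project: string,
  phase: Phase,
  memo: Map<string, number> = new Map()
): number {
  const key = `${project}:${phase}`;
  const cached = memo.get(key);
  if (cached !== undefined) return cached;
  let start = 0;
  for (const p of rules[phase].self) {
    start = Math.max(start, phasedFinish(project, p, memo));
  }
  for (const dep of projectDeps[project]) {
    for (const p of rules[phase].upstream) {
      start = Math.max(start, phasedFinish(dep, p, memo));
    }
  }
  const finish = start + durations[phase];
  memo.set(key, finish);
  return finish;
}

// Earliest finish time when a project's whole build is one atomic step.
function atomicFinish(project: string): number {
  const start = Math.max(0, ...projectDeps[project].map((dep) => atomicFinish(dep)));
  const total = allPhases.reduce((sum, p) => sum + durations[p], 0);
  return start + total;
}

function makespan(finish: (project: string) => number): number {
  return Math.max(...Object.keys(projectDeps).map((name) => finish(name)));
}

const phasedMakespan = makespan((project) =>
  Math.max(...allPhases.map((p) => phasedFinish(project, p)))
);
const atomicMakespan = makespan(atomicFinish);
console.log({ phasedMakespan, atomicMakespan });
```

With these assumed durations, the pipelined build finishes at 13 seconds versus 28 seconds for the atomic one, because `B` starts compiling at second 4 instead of waiting for all of `A`.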
Goal 2: Fine-grained incremental builds
Today `rush build` supports an incremental feature that will skip building projects if their source files have not changed (based on the package-deps-hash state file). Recently this feature was extended to work with Rush custom commands, with each command maintaining its own state file. This works correctly for purely orthogonal operations, but the incremental analysis does not consider relationships between commands.
For example, suppose we define two commands:
- `rush build` does compiling and linting and docs
- `rush test` does compiling and linting and testing and docs
Rush would maintain two state files `package-deps_build.json` and `package-deps_test.json`, but the analysis would not understand that `rush build` has already performed 50% of the work. Even worse, it will not understand when `rush build` has invalidated the incremental state for `rush test`.
Goal 3: An intuitive CLI syntax for skipping tasks
Today you can define a parameter such as `rush build --no-docs` that would skip the docs task, but the meaning of this flag would be determined by code inside the build scripts. And Rush's incremental build logic has no way to understand that compiling and linting are up-to-date but docs is outdated.
Design spec
The multiphase-rush repo sketches out an example of the proposed design for phased builds.
The phase scripts are defined in package.json like this:
project1/package.json
The `_phase:` scripts implement phases that will be globally defined in command-line.json for the monorepo. Unlike Makefile dependencies, our proposed phase graph is a standardized model that every project will conform to. When a new project is moved into the monorepo, its toolchain may need to be adapted to match the phase model. Doing so ensures that the multi-project `rush` CLI syntax will perform meaningful operations for each project. Example:
common/config/rush/command-line.json
Example command line
The above configuration defines these multi-project commands:
- `rush build` invokes shell commands:
  - `heft build --lite` (_phase:compile)
  - `heft lint` (_phase:lint)
  - `doc-tool` (_phase:docs)
- `rush build --production` invokes shell commands:
  - `heft build --lite --production` (_phase:compile)
  - `heft lint` (_phase:lint)
  - `heft test --no-build` (_phase:test)
- `rush test` invokes shell commands:
  - `heft build --lite` (_phase:compile)
  - `heft lint` (_phase:lint)
  - `heft test --no-build` (_phase:test)
  - `doc-tool` (_phase:docs)
- `rush test --production` invokes shell commands:
  - `heft build --lite --production` (_phase:compile)
  - `heft lint` (_phase:lint)
  - `heft test --no-build` (_phase:test)
  - `doc-tool --production` (_phase:docs)
- `rush my-bulk-command` invokes shell commands:
  - `./scripts/my-command.js` (my-bulk-command)
- `rush my-bulk-command --production` invokes shell commands:
  - `./scripts/my-command.js --production` (my-bulk-command)
It also defines these commands:
- `rushx build` invokes: `heft build`
- `rushx build --some --arbitrary=stuff` invokes: `heft build --some --arbitrary=stuff` (`rushx` currently passes CLI parameters unfiltered to the underlying script)
- `rushx test` invokes: `heft test`
- `rushx docs` invokes: `doc-tool`
- `my-bulk-command` invokes: `./scripts/my-command.js`
- `clean` invokes: `heft clean`
Note that `rush test` and `rushx test` conceptually perform similar actions, but their implementation is totally different. It's up to the implementation to ensure consistent behavior. We considered conflating their implementation, but that runs into a couple problems: First, Heft's "watch mode" implementation is not generalizable to multiple projects; the planned multi-project watch mode will have a different architecture (that coordinates multiple Heft watch modes). And secondly, `rushx` supports a lot of very useful CLI parameters (via Heft) that cannot meaningfully be generalized to `rush`. Thus for now at least, it seems like the right design to keep `rush` and `rushx` similar in spirit only, not implementation.