-
Notifications
You must be signed in to change notification settings - Fork 4.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Allow repositories that they depend on the contents of files/directories on the local file system #20952
Comments
This is a blocker for #18285 |
Linking the design doc here again: https://docs.google.com/document/d/17RZKMuMjAIgNfFdrsLhqNsMTQDSS-BBA-S-fCSBPV7M/edit# I'll start impl work now. |
@ismell brought up an interesting point today: if your level of abstraction is system calls ( If your abstraction was |
Marker files today store all predeclared inputs as one hash (on the first line of the file), and then each recorded input as a following line in the `TYPE:KEY VALUE` format. This commit refactors the parsing/stringification logic of recorded inputs so that they're not all clumped in big methods in `RepositoryFunction`, to pave the way for more recorded input types (watching directories, etc) and more places to write recorded input data (for the true repo cache). - The StarlarkSemantics object is no longer treated as a recorded input (only recorded for Starlark repo rules, ignored for native repo rules), but as a predeclared input instead (i.e. hashed on the first line). - This slightly simplifies logic, and since the existing native repo rules are either local (local_repository, new_local_repository, local_config_platform) or being Starlarkified (the two Android repo rules), it will have minimal visible impact. - Each type of recorded inputs is a subclass of `RepoRecordedInput`, which knows how to stringify itself, verify its own up-to-date-ness, and how to parse itself from a string. - We also try to collect as many SkyKeys needed to verify up-to-date-ness as possible in one go and do a mass Skyframe evaluation. This avoids a fair amount of Skyframe restarts (unlikely to have super big impact on performance, but is a nice thing to do). Work towards #20952 and #12227.
Marker files today store all predeclared inputs as one hash (on the first line of the file), and then each recorded input as a following line in the `TYPE:KEY VALUE` format. This commit refactors the parsing/stringification logic of recorded inputs so that they're not all clumped in big methods in `RepositoryFunction`, to pave the way for more recorded input types (watching directories, etc) and more places to write recorded input data (for the true repo cache). - The StarlarkSemantics object is no longer treated as a recorded input (only recorded for Starlark repo rules, ignored for native repo rules), but as a predeclared input instead (i.e. hashed on the first line). - This slightly simplifies logic, and since the existing native repo rules are either local (local_repository, new_local_repository, local_config_platform) or being Starlarkified (the two Android repo rules), it will have minimal visible impact. - Each type of recorded inputs is a subclass of `RepoRecordedInput`, which knows how to stringify itself, verify its own up-to-date-ness, and how to parse itself from a string. - We also try to collect as many SkyKeys needed to verify up-to-date-ness as possible in one go and do a mass Skyframe evaluation. This avoids a fair amount of Skyframe restarts (unlikely to have super big impact on performance, but is a nice thing to do). Work towards #20952 and #12227. Closes #21182. PiperOrigin-RevId: 604522692 Change-Id: Idc18ab202adb601cda47914c48642a6c9039da40
* Starlark API changes * New method `rctx.watch()` that watches a path. As of right now, said path must exist and be a regular file, but that will change in follow-ups. * Existing method `rctx.read()` gets a new optional boolean parameter `watch` that defaults to True. This causes the file read to be watched as well by default, even if it's not addressed by a label. * Both changes apply to `mctx` as well. * Marker file format changes * `FILE:` marker file entries can now be `FILE:` followed by either an absolute path (starts with a singular '/'), or a label-like ... thing (which is a label but has a slash in place of a colon). This is because we might be watching something inside a repo that is _not_ under a package. * Same change will happen to the `accumulatedFileDigests` sections of the module lockfile. * Code changes * `RepoRecordedInput.File` subclasses are renamed to `FileOutsideWorkspace` (absolute paths) and `FileInsideWorkspace` (label-like things) instead to better reflect their meaning. * Unified the file watching/digest checking logic in SingleExtensionEvalFunction with the RepoRecordedInput stuff. Follow-ups will unify more. * Moved RepoRecordedInput to its own BUILD target as it's being used outside the repo fetching machinery (namely, module extensions) * Since `FileOutsideWorkspace` has to be an absolute path (never relative), we don't need a "base directory" for the RepoRecordedInput parser. What we really need is just the FileSystem object. Refactored code to reflect that. Work towards #20952.
* Starlark API changes * New method `rctx.watch()` that watches a path. As of right now, said path must exist and be a regular file, but that will change in follow-ups. * Existing method `rctx.read()` gets a new optional boolean parameter `watch` that defaults to True. This causes the file read to be watched as well by default, even if it's not addressed by a label. * Both changes apply to `mctx` as well. * Marker file format changes * `FILE:` marker file entries can now be `FILE:` followed by either an absolute path (starts with a singular '/'), or a label-like ... thing (which is a label but has a slash in place of a colon). This is because we might be watching something inside a repo that is _not_ under a package. * Same change will happen to the `accumulatedFileDigests` sections of the module lockfile. * Code changes * `RepoRecordedInput.File` subclasses are renamed to `FileOutsideWorkspace` (absolute paths) and `FileInsideWorkspace` (label-like things) instead to better reflect their meaning. * Unified the file watching/digest checking logic in SingleExtensionEvalFunction with the RepoRecordedInput stuff. Follow-ups will unify more. * Moved RepoRecordedInput to its own BUILD target as it's being used outside the repo fetching machinery (namely, module extensions) * Since `FileOutsideWorkspace` has to be an absolute path (never relative), we don't need a "base directory" for the RepoRecordedInput parser. What we really need is just the FileSystem object. Refactored code to reflect that. RELNOTES: Added a new method, `repository_ctx.watch()`, which asks Bazel to watch for changes to an arbitrary file. When said file changes, Bazel will refetch the repo. Work towards #20952.
* Starlark API changes * New method `rctx.watch()` that watches a path. As of right now, said path must exist and be a regular file, but that will change in follow-ups. * Existing method `rctx.read()` gets a new optional boolean parameter `watch` that defaults to True. This causes the file read to be watched as well by default, even if it's not addressed by a label. * Both changes apply to `mctx` as well. * Marker file format changes * `FILE:` marker file entries can now be `FILE:` followed by either an absolute path (starts with a singular '/'), or a label-like ... thing (which is a label but has a slash in place of a colon). This is because we might be watching something inside a repo that is _not_ under a package. * Same change will happen to the `accumulatedFileDigests` sections of the module lockfile. * Code changes * `RepoRecordedInput.File` subclasses are renamed to `FileOutsideWorkspace` (absolute paths) and `FileInsideWorkspace` (label-like things) instead to better reflect their meaning. * Unified the file watching/digest checking logic in SingleExtensionEvalFunction with the RepoRecordedInput stuff. Follow-ups will unify more. * Moved RepoRecordedInput to its own BUILD target as it's being used outside the repo fetching machinery (namely, module extensions) * Since `FileOutsideWorkspace` has to be an absolute path (never relative), we don't need a "base directory" for the RepoRecordedInput parser. What we really need is just the FileSystem object. Refactored code to reflect that. RELNOTES: Added a new method, `repository_ctx.watch()`, which asks Bazel to watch for changes to an arbitrary file. When said file changes, Bazel will refetch the repo. Work towards #20952.
* Starlark API changes * New method `rctx.watch()` that watches a path. As of right now, said path must exist and be a regular file, but that will change in follow-ups. * Existing method `rctx.read()` gets a new optional boolean parameter `watch` that defaults to True. This causes the file read to be watched as well by default, even if it's not addressed by a label. * Both changes apply to `mctx` as well. * Marker file format changes * `FILE:` marker file entries can now be `FILE:` followed by either an absolute path (starts with a singular '/'), or a label-like ... thing (which is a label but has a slash in place of a colon). This is because we might be watching something inside a repo that is _not_ under a package. * Same change will happen to the `accumulatedFileDigests` sections of the module lockfile. * Code changes * `RepoRecordedInput.File` subclasses are renamed to `FileOutsideWorkspace` (absolute paths) and `FileInsideWorkspace` (label-like things) instead to better reflect their meaning. * Unified the file watching/digest checking logic in SingleExtensionEvalFunction with the RepoRecordedInput stuff. Follow-ups will unify more. * Moved RepoRecordedInput to its own BUILD target as it's being used outside the repo fetching machinery (namely, module extensions) * Since `FileOutsideWorkspace` has to be an absolute path (never relative), we don't need a "base directory" for the RepoRecordedInput parser. What we really need is just the FileSystem object, which we can get from the PathPackageLocator available in Skyframe. Refactored code to reflect that. RELNOTES: Added a new method, `repository_ctx.watch()`, which asks Bazel to watch for changes to an arbitrary file. When said file changes, Bazel will refetch the repo. Work towards #20952.
* Starlark API changes * New method `rctx.watch()` that watches a path. As of right now, said path must exist and be a regular file, but that will change in follow-ups. * Existing method `rctx.read()` gets a new optional boolean parameter `watch` that defaults to True. This causes the file read to be watched as well by default, even if it's not addressed by a label. * Both changes apply to `mctx` as well. * Marker file format changes * `FILE:` marker file entries can now be `FILE:` followed by either an absolute path (starts with a singular '/'), or a label-like ... thing (which is a label but has a slash in place of a colon). This is because we might be watching something inside a repo that is _not_ under a package. * Same change will happen to the `accumulatedFileDigests` sections of the module lockfile. * Code changes * `RepoRecordedInput.File` subclasses are renamed to `FileOutsideWorkspace` (absolute paths) and `FileInsideWorkspace` (label-like things) instead to better reflect their meaning. * Unified the file watching/digest checking logic in SingleExtensionEvalFunction with the RepoRecordedInput stuff. Follow-ups will unify more. * Moved RepoRecordedInput to its own BUILD target as it's being used outside the repo fetching machinery (namely, module extensions) * Since `FileOutsideWorkspace` has to be an absolute path (never relative), we don't need a "base directory" for the RepoRecordedInput parser. What we really need is just the FileSystem object, which we can get from the PathPackageLocator available in Skyframe. Refactored code to reflect that. RELNOTES: Added a new method, `repository_ctx.watch()`, which asks Bazel to watch for changes to an arbitrary file. When said file changes, Bazel will refetch the repo. Work towards #20952.
* Starlark API changes * New method `rctx.watch()` that watches a path. As of right now, said path must exist and be a regular file, but that will change in follow-ups. * Existing method `rctx.read()` gets a new optional boolean parameter `watch` that defaults to True. This causes the file read to be watched as well by default, even if it's not addressed by a label. * Both changes apply to `mctx` as well. * Marker file format changes * `FILE:` marker file entries can now be `FILE:` followed by either an absolute path (starts with a singular '/'), or a label-like ... thing (which is a label but has a slash in place of a colon). This is because we might be watching something inside a repo that is _not_ under a package. * Same change will happen to the `accumulatedFileDigests` sections of the module lockfile. * Code changes * `RepoRecordedInput.File` subclasses are renamed to `FileOutsideWorkspace` (absolute paths) and `FileInsideWorkspace` (label-like things) instead to better reflect their meaning. * Unified the file watching/digest checking logic in SingleExtensionEvalFunction with the RepoRecordedInput stuff. Follow-ups will unify more. * Moved RepoRecordedInput to its own BUILD target as it's being used outside the repo fetching machinery (namely, module extensions) * Since `FileOutsideWorkspace` has to be an absolute path (never relative), we don't need a "base directory" for the RepoRecordedInput parser. What we really need is just the FileSystem object, which we can get from the PathPackageLocator available in Skyframe. Refactored code to reflect that. RELNOTES: Added a new method, `repository_ctx.watch()`, which asks Bazel to watch for changes to an arbitrary file. When said file changes, Bazel will refetch the repo. Work towards #20952.
* Starlark API changes * New method `rctx.watch()` that watches a path. As of right now, said path must exist and be a regular file, but that will change in follow-ups. * Existing method `rctx.read()` gets a new optional string parameter `watch` that defaults to `auto`. This causes the file read to be watched as well by default, even if it's not addressed by a label. See `rctx.read()` docs for the semantics of `watch`. * Both changes apply to `mctx` as well, except that no files outside the current Bazel workspace may be watched for `mctx`. * Marker file format changes * `FILE:` marker file entries can now be `FILE:` followed by either an absolute path (starts with a singular '/'), or a label-like ... thing (which is a label but has a slash in place of a colon). This is because we might be watching something inside a repo that is _not_ under a package. * Same change will happen to the `accumulatedFileDigests` sections of the module lockfile. * Code changes * `RepoRecordedInput.File` subclasses are renamed to `FileOutsideWorkspace` (absolute paths) and `FileInsideWorkspace` (label-like things) instead to better reflect their meaning. * Unified the file watching/digest checking logic in SingleExtensionEvalFunction with the RepoRecordedInput stuff. Follow-ups will unify more. * Moved RepoRecordedInput to its own BUILD target as it's being used outside the repo fetching machinery (namely, module extensions) * Since `FileOutsideWorkspace` has to be an absolute path (never relative), we don't need a "base directory" for the RepoRecordedInput parser. What we really need is just the FileSystem object, which we can get from the PathPackageLocator available in Skyframe. Refactored code to reflect that. RELNOTES: Added a new method, `repository_ctx.watch()`, which asks Bazel to watch for changes to an arbitrary file. When said file changes, Bazel will refetch the repo. Work towards #20952.
* Starlark API changes * New method `rctx.watch()` that watches a path. As of right now, said path must exist and be a regular file, but that will change in follow-ups. * Existing method `rctx.read()` gets a new optional string parameter `watch` that defaults to `auto`. This causes the file read to be watched as well by default, even if it's not addressed by a label. See `rctx.read()` docs for the semantics of `watch`. * Both changes apply to `mctx` as well, except that no files outside the current Bazel workspace may be watched for `mctx`. * Marker file format changes * `FILE:` marker file entries can now be `FILE:` followed by either an absolute path (starts with a singular '/'), or a label-like ... thing (which is a label but has a slash in place of a colon). This is because we might be watching something inside a repo that is _not_ under a package. * Same change will happen to the `accumulatedFileDigests` sections of the module lockfile. * Code changes * `RepoRecordedInput.File` subclasses are renamed to `FileOutsideWorkspace` (absolute paths) and `FileInsideWorkspace` (label-like things) instead to better reflect their meaning. * Unified the file watching/digest checking logic in SingleExtensionEvalFunction with the RepoRecordedInput stuff. Follow-ups will unify more. * Moved RepoRecordedInput to its own BUILD target as it's being used outside the repo fetching machinery (namely, module extensions) * Since `FileOutsideWorkspace` has to be an absolute path (never relative), we don't need a "base directory" for the RepoRecordedInput parser. What we really need is just the FileSystem object, which we can get from the PathPackageLocator available in Skyframe. Refactored code to reflect that. RELNOTES: Added a new method, `repository_ctx.watch()`, which asks Bazel to watch for changes to an arbitrary file. When said file changes, Bazel will refetch the repo. Work towards #20952.
* Starlark API changes * New method `rctx.watch()` that watches a path. As of right now, said path must exist and be a regular file, but that will change in follow-ups. * Existing method `rctx.read()` gets a new optional string parameter `watch` that defaults to `auto`. This causes the file read to be watched as well by default, even if it's not addressed by a label. See `rctx.read()` docs for the semantics of `watch`. * Both changes apply to `mctx` as well, except that no files outside the current Bazel workspace may be watched for `mctx`. * Marker file format changes * `FILE:` marker file entries can now be `FILE:` followed by either an absolute path (starts with a singular '/'), or a label-like ... thing (which is a label but has a slash in place of a colon). This is because we might be watching something inside a repo that is _not_ under a package. * Same change will happen to the `accumulatedFileDigests` sections of the module lockfile. * Code changes * `RepoRecordedInput.File` subclasses are renamed to `FileOutsideWorkspace` (absolute paths) and `FileInsideWorkspace` (label-like things) instead to better reflect their meaning. * Unified the file watching/digest checking logic in SingleExtensionEvalFunction with the RepoRecordedInput stuff. Follow-ups will unify more. * Moved RepoRecordedInput to its own BUILD target as it's being used outside the repo fetching machinery (namely, module extensions) * Since `FileOutsideWorkspace` has to be an absolute path (never relative), we don't need a "base directory" for the RepoRecordedInput parser. What we really need is just the FileSystem object, which we can get from the PathPackageLocator available in Skyframe. Refactored code to reflect that. RELNOTES: Added a new method, `repository_ctx.watch()`, which asks Bazel to watch for changes to an arbitrary file. When said file changes, Bazel will refetch the repo. Work towards #20952.
* Starlark API changes * New method `rctx.watch()` that watches a path. As of right now, said path must exist and be a regular file, but that will change in follow-ups. * Existing method `rctx.read()` gets a new optional string parameter `watch` that defaults to `auto`. This causes the file read to be watched as well by default, even if it's not addressed by a label. See `rctx.read()` docs for the semantics of `watch`. * Both changes apply to `mctx` as well, except that no files outside the current Bazel workspace may be watched for `mctx`. * Marker file format changes * `FILE:` marker file entries can now be `FILE:` followed by either an absolute path (starts with a singular '/'), or a label-like ... thing (which is a label but has a slash in place of a colon). This is because we might be watching something inside a repo that is _not_ under a package. * Same change will happen to the `accumulatedFileDigests` sections of the module lockfile. * Code changes * `RepoRecordedInput.File` subclasses are renamed to `FileOutsideWorkspace` (absolute paths) and `FileInsideWorkspace` (label-like things) instead to better reflect their meaning. * Unified the file watching/digest checking logic in SingleExtensionEvalFunction with the RepoRecordedInput stuff. Follow-ups will unify more. * Moved RepoRecordedInput to its own BUILD target as it's being used outside the repo fetching machinery (namely, module extensions) * Since `FileOutsideWorkspace` has to be an absolute path (never relative), we don't need a "base directory" for the RepoRecordedInput parser. What we really need is just the FileSystem object, which we can get from the PathPackageLocator available in Skyframe. Refactored code to reflect that. RELNOTES: Added a new method, `repository_ctx.watch()`, which asks Bazel to watch for changes to an arbitrary file. When said file changes, Bazel will refetch the repo. Work towards #20952.
Wouldn't expanded archives with fixed mtimes break the optimization, similar to what used to be broken with #14723? |
They would (in fact, @tjgq and myself just ran into this yesterday). This is why I listed mtime, ctime and inode number in my comment above when calculating the space requirements for the lockfile. |
I think maybe we can use it to watch the cc toolchain directory, so that |
For the cc toolchain use case, isn't what we are ultimately interested in the contents? (it looks like @ismell says that hat's the case for them, too, so I think the point is moot) |
- Added `rctx.watch_tree()` to watch a directory tree, which includes all transitive descendants' names, and if they're files, their contents. - Added a new SkyFunction DirectoryTreeDigestFunction to do the heavy lifting. - In the future, for performance, we could try to get this skyfunction to have a mode where it only digests stat(), to use as heuristics (similar to #21044) Work towards #20952.
- `rctx.watch()` now supports watching a path even if it's a directory or nonexistent. - a path's status changing counts triggers a refetch. - added `path.is_dir` so that users can tell whether a path points to a directory or a file. - added `watch_X` parameters to the following methods, with behavior similar to the `watch` parameter in `rctx.read()` - `rctx.extract()`, `rctx.patch()`, `rctx.template()` (for the template file), `rctx.symlink()` (for the symlink target) Work towards #20952. Closes #21339. PiperOrigin-RevId: 607415094 Change-Id: Iebb6bc28174d05277a034ba3cf5e0c9bf90ce027
- Added a new parameter `watch` to `path.readdir()` that allows one to watch for changes in entries under a given directory. - only names are watched; 'entry types' are not. This somewhat matches the return value of `path.readdir()`, which only contains entry names - same restrictions as `rctx.watch()` in terms of which paths can be watched and which cannot; similarly for `mctx`. - Made a big-ish refactor that eliminated the two `RepoRecordedInput.File` subclasses, and pulled the difference into a separate `RepoRecordedInput.RepoCacheFriendlyPath` instead. This new path class is used by the new `RepoRecordedInput.Dirents` as well. Work towards #20952.
- Added `rctx.watch_tree()` to watch a directory tree, which includes all transitive descendants' names, and if they're files, their contents. - Added a new SkyFunction DirectoryTreeDigestFunction to do the heavy lifting. - In the future, for performance, we could try to get this skyfunction to have a mode where it only digests stat(), to use as heuristics (similar to #21044) Work towards #20952.
- Added `rctx.watch_tree()` to watch a directory tree, which includes all transitive descendants' names, and if they're files, their contents. - In the future we could add glob patterns to this method. - Added a new SkyFunction DirectoryTreeDigestFunction to do the heavy lifting. - In the future, for performance, we could try to get this skyfunction to have a mode where it only digests stat(), to use as heuristics (similar to #21044) Work towards #20952.
- Added `rctx.watch_tree()` to watch a directory tree, which includes all transitive descendants' names, and if they're files, their contents. - In the future we could add glob patterns to this method. - Added a new SkyFunction DirectoryTreeDigestFunction to do the heavy lifting. - In the future, for performance, we could try to get this skyfunction to have a mode where it only digests stat(), to use as heuristics (similar to #21044) Work towards #20952.
- Added a new parameter `watch` to `path.readdir()` that allows one to watch for changes in entries under a given directory. - only names are watched; 'entry types' are not. This somewhat matches the return value of `path.readdir()`, which only contains entry names - same restrictions as `rctx.watch()` in terms of which paths can be watched and which cannot; similarly for `mctx`. - Made a big-ish refactor that eliminated the two `RepoRecordedInput.File` subclasses, and pulled the difference into a separate `RepoRecordedInput.RepoCacheFriendlyPath` instead. This new path class is used by the new `RepoRecordedInput.Dirents` as well. Work towards #20952. Closes #21341. PiperOrigin-RevId: 607772207 Change-Id: Ibba2b3389acd23e0a703818fec2cd58321a9b896
- Added `rctx.watch_tree()` to watch a directory tree, which includes all transitive descendants' names, and if they're files, their contents. - In the future we could add glob patterns to this method. - Added a new SkyFunction DirectoryTreeDigestFunction to do the heavy lifting. - In the future, for performance, we could try to get this skyfunction to have a mode where it only digests stat(), to use as heuristics (similar to #21044) Work towards #20952.
- Added `rctx.watch_tree()` to watch a directory tree, which includes all transitive descendants' names, and if they're files, their contents. - In the future we could add glob patterns to this method. - Added a new SkyFunction DirectoryTreeDigestFunction to do the heavy lifting. - In the future, for performance, we could try to get this skyfunction to have a mode where it only digests stat(), to use as heuristics (similar to #21044) Work towards #20952. Closes #21362. PiperOrigin-RevId: 608667062 Change-Id: Ibacbb7af4cf4d7628fe8fcf06e2c4fa50e811e4e
Marker files today store all predeclared inputs as one hash (on the first line of the file), and then each recorded input as a following line in the `TYPE:KEY VALUE` format. This commit refactors the parsing/stringification logic of recorded inputs so that they're not all clumped in big methods in `RepositoryFunction`, to pave the way for more recorded input types (watching directories, etc) and more places to write recorded input data (for the true repo cache). - The StarlarkSemantics object is no longer treated as a recorded input (only recorded for Starlark repo rules, ignored for native repo rules), but as a predeclared input instead (i.e. hashed on the first line). - This slightly simplifies logic, and since the existing native repo rules are either local (local_repository, new_local_repository, local_config_platform) or being Starlarkified (the two Android repo rules), it will have minimal visible impact. - Each type of recorded inputs is a subclass of `RepoRecordedInput`, which knows how to stringify itself, verify its own up-to-date-ness, and how to parse itself from a string. - We also try to collect as many SkyKeys needed to verify up-to-date-ness as possible in one go and do a mass Skyframe evaluation. This avoids a fair amount of Skyframe restarts (unlikely to have super big impact on performance, but is a nice thing to do). Work towards #20952 and #12227. Closes #21182. PiperOrigin-RevId: 604522692 Change-Id: Idc18ab202adb601cda47914c48642a6c9039da40
* Starlark API changes * New method `rctx.watch()` that watches a path. As of right now, said path must exist and be a regular file, but that will change in follow-ups. * Existing method `rctx.read()` gets a new optional string parameter `watch` that defaults to `auto`. This causes the file read to be watched as well by default, even if it's not addressed by a label. See `rctx.read()` docs for the semantics of `watch`. * Both changes apply to `mctx` as well, except that no files outside the current Bazel workspace may be watched for `mctx`. * Marker file format changes * `FILE:` marker file entries can now be `FILE:` followed by either an absolute path (starts with a singular '/'), or a label-like ... thing (which is a label but has a slash in place of a colon). This is because we might be watching something inside a repo that is _not_ under a package. * Same change will happen to the `accumulatedFileDigests` sections of the module lockfile. * Code changes * `RepoRecordedInput.File` subclasses are renamed to `FileOutsideWorkspace` (absolute paths) and `FileInsideWorkspace` (label-like things) instead to better reflect their meaning. * Unified the file watching/digest checking logic in SingleExtensionEvalFunction with the RepoRecordedInput stuff. Follow-ups will unify more. * Moved RepoRecordedInput to its own BUILD target as it's being used outside the repo fetching machinery (namely, module extensions) * Since `FileOutsideWorkspace` has to be an absolute path (never relative), we don't need a "base directory" for the RepoRecordedInput parser. What we really need is just the FileSystem object, which we can get from the PathPackageLocator available in Skyframe. Refactored code to reflect that. Work towards #20952. Closes #21230. RELNOTES: Added a new method, `repository_ctx.watch()`, which asks Bazel to watch for changes to an arbitrary file. When said file changes, Bazel will refetch the repo. PiperOrigin-RevId: 607097422 Change-Id: Ib96a49461deddd7f4c786bd1c227de18b2dd71d7
- `rctx.watch()` now supports watching a path even if it's a directory or nonexistent. - a path's status changing counts triggers a refetch. - added `path.is_dir` so that users can tell whether a path points to a directory or a file. - added `watch_X` parameters to the following methods, with behavior similar to the `watch` parameter in `rctx.read()` - `rctx.extract()`, `rctx.patch()`, `rctx.template()` (for the template file), `rctx.symlink()` (for the symlink target) Work towards #20952. Closes #21339. PiperOrigin-RevId: 607415094 Change-Id: Iebb6bc28174d05277a034ba3cf5e0c9bf90ce027
- Added a new parameter `watch` to `path.readdir()` that allows one to watch for changes in entries under a given directory. - only names are watched; 'entry types' are not. This somewhat matches the return value of `path.readdir()`, which only contains entry names - same restrictions as `rctx.watch()` in terms of which paths can be watched and which cannot; similarly for `mctx`. - Made a big-ish refactor that eliminated the two `RepoRecordedInput.File` subclasses, and pulled the difference into a separate `RepoRecordedInput.RepoCacheFriendlyPath` instead. This new path class is used by the new `RepoRecordedInput.Dirents` as well. Work towards #20952. Closes #21341. PiperOrigin-RevId: 607772207 Change-Id: Ibba2b3389acd23e0a703818fec2cd58321a9b896
- Added `rctx.watch_tree()` to watch a directory tree, which includes all transitive descendants' names, and if they're files, their contents. - In the future we could add glob patterns to this method. - Added a new SkyFunction DirectoryTreeDigestFunction to do the heavy lifting. - In the future, for performance, we could try to get this skyfunction to have a mode where it only digests stat(), to use as heuristics (similar to #21044) Work towards #20952. Closes #21362. PiperOrigin-RevId: 608667062 Change-Id: Ibacbb7af4cf4d7628fe8fcf06e2c4fa50e811e4e
Marker files today store all predeclared inputs as one hash (on the first line of the file), and then each recorded input as a following line in the `TYPE:KEY VALUE` format. This commit refactors the parsing/stringification logic of recorded inputs so that they're not all clumped in big methods in `RepositoryFunction`, to pave the way for more recorded input types (watching directories, etc) and more places to write recorded input data (for the true repo cache). - The StarlarkSemantics object is no longer treated as a recorded input (only recorded for Starlark repo rules, ignored for native repo rules), but as a predeclared input instead (i.e. hashed on the first line). - This slightly simplifies logic, and since the existing native repo rules are either local (local_repository, new_local_repository, local_config_platform) or being Starlarkified (the two Android repo rules), it will have minimal visible impact. - Each type of recorded inputs is a subclass of `RepoRecordedInput`, which knows how to stringify itself, verify its own up-to-date-ness, and how to parse itself from a string. - We also try to collect as many SkyKeys needed to verify up-to-date-ness as possible in one go and do a mass Skyframe evaluation. This avoids a fair amount of Skyframe restarts (unlikely to have super big impact on performance, but is a nice thing to do). Work towards #20952 and #12227. Closes #21182. PiperOrigin-RevId: 604522692 Change-Id: Idc18ab202adb601cda47914c48642a6c9039da40
* Starlark API changes * New method `rctx.watch()` that watches a path. As of right now, said path must exist and be a regular file, but that will change in follow-ups. * Existing method `rctx.read()` gets a new optional string parameter `watch` that defaults to `auto`. This causes the file read to be watched as well by default, even if it's not addressed by a label. See `rctx.read()` docs for the semantics of `watch`. * Both changes apply to `mctx` as well, except that no files outside the current Bazel workspace may be watched for `mctx`. * Marker file format changes * `FILE:` marker file entries can now be `FILE:` followed by either an absolute path (starts with a singular '/'), or a label-like ... thing (which is a label but has a slash in place of a colon). This is because we might be watching something inside a repo that is _not_ under a package. * Same change will happen to the `accumulatedFileDigests` sections of the module lockfile. * Code changes * `RepoRecordedInput.File` subclasses are renamed to `FileOutsideWorkspace` (absolute paths) and `FileInsideWorkspace` (label-like things) instead to better reflect their meaning. * Unified the file watching/digest checking logic in SingleExtensionEvalFunction with the RepoRecordedInput stuff. Follow-ups will unify more. * Moved RepoRecordedInput to its own BUILD target as it's being used outside the repo fetching machinery (namely, module extensions) * Since `FileOutsideWorkspace` has to be an absolute path (never relative), we don't need a "base directory" for the RepoRecordedInput parser. What we really need is just the FileSystem object, which we can get from the PathPackageLocator available in Skyframe. Refactored code to reflect that. Work towards #20952. Closes #21230. RELNOTES: Added a new method, `repository_ctx.watch()`, which asks Bazel to watch for changes to an arbitrary file. When said file changes, Bazel will refetch the repo. PiperOrigin-RevId: 607097422 Change-Id: Ib96a49461deddd7f4c786bd1c227de18b2dd71d7
- `rctx.watch()` now supports watching a path even if it's a directory or nonexistent. - a path's status changing counts triggers a refetch. - added `path.is_dir` so that users can tell whether a path points to a directory or a file. - added `watch_X` parameters to the following methods, with behavior similar to the `watch` parameter in `rctx.read()` - `rctx.extract()`, `rctx.patch()`, `rctx.template()` (for the template file), `rctx.symlink()` (for the symlink target) Work towards #20952. Closes #21339. PiperOrigin-RevId: 607415094 Change-Id: Iebb6bc28174d05277a034ba3cf5e0c9bf90ce027
- Added a new parameter `watch` to `path.readdir()` that allows one to watch for changes in entries under a given directory. - only names are watched; 'entry types' are not. This somewhat matches the return value of `path.readdir()`, which only contains entry names - same restrictions as `rctx.watch()` in terms of which paths can be watched and which cannot; similarly for `mctx`. - Made a big-ish refactor that eliminated the two `RepoRecordedInput.File` subclasses, and pulled the difference into a separate `RepoRecordedInput.RepoCacheFriendlyPath` instead. This new path class is used by the new `RepoRecordedInput.Dirents` as well. Work towards #20952. Closes #21341. PiperOrigin-RevId: 607772207 Change-Id: Ibba2b3389acd23e0a703818fec2cd58321a9b896
- Added `rctx.watch_tree()` to watch a directory tree, which includes all transitive descendants' names, and if they're files, their contents. - In the future we could add glob patterns to this method. - Added a new SkyFunction DirectoryTreeDigestFunction to do the heavy lifting. - In the future, for performance, we could try to get this skyfunction to have a mode where it only digests stat(), to use as heuristics (similar to #21044) Work towards #20952. Closes #21362. PiperOrigin-RevId: 608667062 Change-Id: Ibacbb7af4cf4d7628fe8fcf06e2c4fa50e811e4e
A fix for this issue has been included in Bazel 7.1.0 RC1. Please test out the release candidate and report any issues as soon as possible. Thanks! |
Description of the feature request:
An important use case for Starlark repositories is configuring the build for the local machine.
This sometimes requires reading files, calling
readdir()
, i.e. interacting with the local file system. In some cases, it's desirable that the repository is not re-fetched if the local state changes (if it's expensive to do so), but in some other case, it is and we currently don't provide a good mechanism to do so:rctx.read()
and friends can only declare dependencies on files in other repositorieslocal_repository()
andnew_local_repository()
are only invalidated when the top-level directory of the repository changes and they can't pull in individual files (this matters for large directories e.g./etc
). The rationale behind the first limitation is that one wouldglob()
in the repository and declare dependencies on the files below the first level that way, but that doesn't work for other repository rules.The only workaround at the moment is to declare a dependency on an environment variable that always changes, which has other undesirable effects (see #20951).
There is a separate, ancient feature request to always re-run a repository rule (#3041) but I think that's dangerous because it results in sloppiness with declaring dependencies.
Which category does this issue belong to?
No response
What underlying problem are you trying to solve with this feature?
No response
Which operating system are you running Bazel on?
No response
What is the output of
bazel info release
?No response
If
bazel info release
returnsdevelopment version
or(@non-git)
, tell us how you built Bazel.No response
What's the output of
git remote get-url origin; git rev-parse master; git rev-parse HEAD
?No response
Have you found anything relevant by searching the web?
No response
Any other information, logs, or outputs that you want to share?
No response
The text was updated successfully, but these errors were encountered: