-
Notifications
You must be signed in to change notification settings - Fork 4.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Refactor marker file logic #21182
Refactor marker file logic #21182
Conversation
Marker files today store all predeclared inputs as one hash (on the first line of the file), and then each recorded input as a following line in the `TYPE:KEY VALUE` format. This commit refactors the parsing/stringification logic of recorded inputs so that they're not all clumped in big methods in `RepositoryFunction`, to pave the way for more recorded input types (watching directories, etc) and more places to write recorded input data (for the true repo cache). - The StarlarkSemantics object is no longer treated as a recorded input (only recorded for Starlark repo rules, ignored for native repo rules), but as a predeclared input instead (i.e. hashed on the first line). - This slightly simplifies logic, and since the existing native repo rules are either local (local_repository, new_local_repository, local_config_platform) or being Starlarkified (the two Android repo rules), it will have minimal visible impact. - Each type of recorded inputs is a subclass of `RepoRecordedInput`, which knows how to stringify itself, verify its own up-to-date-ness, and how to parse itself from a string. - We also try to collect as many SkyKeys needed to verify up-to-date-ness as possible in one go and do a mass Skyframe evaluation. This avoids a fair amount of Skyframe restarts (unlikely to have super big impact on performance, but is a nice thing to do). Work towards #20952 and #12227.
b2aec50
to
3953be8
Compare
Shouting in from the sidelines here, if you are heavily changing the format of the marker files, consider replacing it with some less ad-hoc file format (maybe JSON, or hell, even Starlark-but-only-data-structures)? |
I would go even further: I see value in aligning the repo marker file format with that of module extension lockfile entries. After all, there doesn't seem to be a conceptual reason for not supporting the same capabilities for module extensions eventually. We don't need to implement this now, but could switch over to Gson/JSON for serialization and possibly even use the same logic for verification of basic file accesses and recorded repo mapping entries. |
src/main/java/com/google/devtools/build/lib/rules/repository/RepositoryDelegatorFunction.java
Outdated
Show resolved
Hide resolved
it is indeed my plan to eventually switch to JSON. it's actually forwards compatible too, since the current marker file format treats the whole first line as the rule hash; so if the first line became And yeah, this PR paves the way to eventually merge this logic with the module extension up-to-date checks. Otherwise, though, this PR is just a refactor -- it doesn't actually change the marker file format at all. |
Marker files today store all predeclared inputs as one hash (on the first line of the file), and then each recorded input as a following line in the `TYPE:KEY VALUE` format. This commit refactors the parsing/stringification logic of recorded inputs so that they're not all clumped in big methods in `RepositoryFunction`, to pave the way for more recorded input types (watching directories, etc) and more places to write recorded input data (for the true repo cache). - The StarlarkSemantics object is no longer treated as a recorded input (only recorded for Starlark repo rules, ignored for native repo rules), but as a predeclared input instead (i.e. hashed on the first line). - This slightly simplifies logic, and since the existing native repo rules are either local (local_repository, new_local_repository, local_config_platform) or being Starlarkified (the two Android repo rules), it will have minimal visible impact. - Each type of recorded inputs is a subclass of `RepoRecordedInput`, which knows how to stringify itself, verify its own up-to-date-ness, and how to parse itself from a string. - We also try to collect as many SkyKeys needed to verify up-to-date-ness as possible in one go and do a mass Skyframe evaluation. This avoids a fair amount of Skyframe restarts (unlikely to have super big impact on performance, but is a nice thing to do). Work towards #20952 and #12227. Closes #21182. PiperOrigin-RevId: 604522692 Change-Id: Idc18ab202adb601cda47914c48642a6c9039da40
Marker files today store all predeclared inputs as one hash (on the first line of the file), and then each recorded input as a following line in the `TYPE:KEY VALUE` format. This commit refactors the parsing/stringification logic of recorded inputs so that they're not all clumped in big methods in `RepositoryFunction`, to pave the way for more recorded input types (watching directories, etc) and more places to write recorded input data (for the true repo cache). - The StarlarkSemantics object is no longer treated as a recorded input (only recorded for Starlark repo rules, ignored for native repo rules), but as a predeclared input instead (i.e. hashed on the first line). - This slightly simplifies logic, and since the existing native repo rules are either local (local_repository, new_local_repository, local_config_platform) or being Starlarkified (the two Android repo rules), it will have minimal visible impact. - Each type of recorded inputs is a subclass of `RepoRecordedInput`, which knows how to stringify itself, verify its own up-to-date-ness, and how to parse itself from a string. - We also try to collect as many SkyKeys needed to verify up-to-date-ness as possible in one go and do a mass Skyframe evaluation. This avoids a fair amount of Skyframe restarts (unlikely to have super big impact on performance, but is a nice thing to do). Work towards #20952 and #12227. Closes #21182. PiperOrigin-RevId: 604522692 Change-Id: Idc18ab202adb601cda47914c48642a6c9039da40
Marker files today store all predeclared inputs as one hash (on the first line of the file), and then each recorded input as a following line in the `TYPE:KEY VALUE` format. This commit refactors the parsing/stringification logic of recorded inputs so that they're not all clumped in big methods in `RepositoryFunction`, to pave the way for more recorded input types (watching directories, etc) and more places to write recorded input data (for the true repo cache). - The StarlarkSemantics object is no longer treated as a recorded input (only recorded for Starlark repo rules, ignored for native repo rules), but as a predeclared input instead (i.e. hashed on the first line). - This slightly simplifies logic, and since the existing native repo rules are either local (local_repository, new_local_repository, local_config_platform) or being Starlarkified (the two Android repo rules), it will have minimal visible impact. - Each type of recorded inputs is a subclass of `RepoRecordedInput`, which knows how to stringify itself, verify its own up-to-date-ness, and how to parse itself from a string. - We also try to collect as many SkyKeys needed to verify up-to-date-ness as possible in one go and do a mass Skyframe evaluation. This avoids a fair amount of Skyframe restarts (unlikely to have super big impact on performance, but is a nice thing to do). Work towards bazelbuild#20952 and bazelbuild#12227. Closes bazelbuild#21182. PiperOrigin-RevId: 604522692 Change-Id: Idc18ab202adb601cda47914c48642a6c9039da40
Marker files today store all predeclared inputs as one hash (on the first line of the file), and then each recorded input as a following line in the
TYPE:KEY VALUE
format. This commit refactors the parsing/stringification logic of recorded inputs so that they're not all clumped in big methods inRepositoryFunction
, to pave the way for more recorded input types (watching directories, etc) and more places to write recorded input data (for the true repo cache).RepoRecordedInput
, which knows how to stringify itself, verify its own up-to-date-ness, and how to parse itself from a string.Work towards #20952 and #12227.