execgen: make file i/o more "unix"-ey #56982
Comments
Hi @irfansharif, please add a C-ategory label to your issue. Check out the label system docs. 🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is otan.
The way `execgen` is constructed, it expects to find the relevant `_tmpl.go` files in the hardcoded paths (which happen to be relative to the repo root). With bazel the code-gen process occurs within the sandbox, and the template files aren't placed in the exact same paths as in the repo (they're placed within `external/cockroach/pkg/...` instead). When trying to execute `pkg/sql/colexec` tests through bazel, the template files aren't found in the right paths, and the `execgen` processes subsequently fail. See cockroachdb#56982 for more details.

```
$ bazel test @cockroach//pkg/sql/colexec:colexec_test
...
INFO: From Generating pkg/cmd/colexec/and_or_projection.eg.go:
ERROR: open pkg/sql/colexec/and_or_projection_tmpl.go: no such file or directory
```

Short of changing out `execgen` to consider template files as data dependencies, for now let's just symlink the relevant files into the "right" path within the bazel sandbox.

Release note: None
It would be easy to change the way this works to be:
would that help? There is a lot of detail in this issue, but if we just added the path to the template to the invocation, and read the template from that path, would that fix your problem? There are some output files that take no template arguments, and therefore have no dependencies. For those, we'd just take 0 template args and that would be fine.
That would help, yes.
It'd be even better for
This is already what execgen does.
56874: sql: add notice for interleaved tables r=postamar a=postamar

This patch adds a deprecation notice when the user attempts to create an interleaved table by executing a CREATE TABLE ... INTERLEAVE IN PARENT statement.

Fixes #56579.

Release note (sql change): Added deprecation notice to CREATE TABLE ... INTERLEAVE IN PARENT statements.

57018: roachpb: use kv.kvpb as package name for errors.proto r=knz a=tbg

We just started using `EncodedError` in `message Error`, which means that once the first alpha goes out (soon) we will have to face a migration if we wish to change any proto package path that is referenced by an `Error` (i.e. most of `errors.proto`). We do want to dismantle the `roachpb` grab bag at some point. However, I just tried moving `errors.proto` (or even just its relevant parts) to a new `kvpb` package, and failed miserably - this will take a solid chunk of uninterrupted time. So, go for the stop gap: rename the path in `errors.proto` now, so that we're free to make the actual move "whenever" we can.

Release note: None

57027: colexec: support running colexec tests through bazel r=irfansharif a=irfansharif

The way `execgen` is constructed, it expects to find the relevant `_tmpl.go` files in the hardcoded paths (which happen to be relative to the repo root). With bazel the code-gen process occurs within the sandbox, and the template files aren't placed in the exact same paths as in the repo (they're placed within `external/cockroach/pkg/...` instead). When trying to execute `pkg/sql/colexec` tests through bazel, the template files aren't found in the right paths, and the `execgen` processes subsequently fail. See #56982 for more details.

```
$ bazel test @cockroach//pkg/sql/colexec:colexec_test
...
INFO: From Generating pkg/cmd/colexec/and_or_projection.eg.go:
ERROR: open pkg/sql/colexec/and_or_projection_tmpl.go: no such file or directory
```

Short of changing out `execgen` to consider template files as data dependencies, for now let's just symlink the relevant files into the "right" path within the bazel sandbox.

Release note: None

Co-authored-by: Marius Posta
Co-authored-by: Tobias Grieger
Co-authored-by: irfan sharif
Now that we've added a way to source a specific template file in `execgen` (instead of relying on hard-coded paths, see cockroachdb#56982), we can simplify how we generate eg.go files. This lets us parallelize the generation of these files, as the more fine-grained dependency tracking lets bazel generate each eg.go file concurrently (previously we had to specify the superset of all template files as dependencies for the generation of each individual eg.go file). There's one exception: like_ops.eg.go, the generation of which appears to want to read from a second template file. We've special-cased the generation of this file into its own thing. Release note: None
57075: colexec,bazel: re-work code-gen through bazel r=irfansharif a=irfansharif

Now that we've added a way to source a specific template file in `execgen` (instead of relying on hard-coded paths, see #56982), we can simplify how we generate eg.go files. This lets us parallelize the generation of these files, as the more fine-grained dependency tracking lets bazel generate each eg.go file concurrently (previously we had to specify the superset of all template files as dependencies for the generation of each individual eg.go file). There's one exception: like_ops.eg.go, the generation of which appears to want to read from a second template file. We've special-cased the generation of this file into its own thing.

Release note: None

57081: roachtest: update pgjdbc blocklists r=solongordon a=solongordon

I made the following updates to the pgjdbc blocklists:

- Removed tests which are now passing due to user-defined schema support.
- Removed testUpdateSelectOnly since we now support this syntax as a no-op.
- Updated some failure reasons to "unknown" for tests which are still failing even though the referenced issue was closed.
- Added many BatchExecuteTest.* tests to the ignore list. These tests are flaky due to a combination of #54477 and the fact that the tests do not run in a deterministic order.

Fixes #53467
Fixes #53738
Fixes #54106

Release note: None

Co-authored-by: irfan sharif
Co-authored-by: Solon Gordon
Describe the problem
The UX around execgen i/o is a bit non-standard. Let's consider one example:
At first glance it may seem that it's taking the `hash_any_not_null_agg.eg.go` file as input, but in fact the file path is a positional argument that tells `execgen` which specific `.eg.go` file to generate, which it then prints to stdout. For its "input", the `execgen` program has a hardcoded list of `_tmpl.go` files relative to the repo root that it knows to read and use in order to generate the relevant `.eg.go` file.
Expected behavior
Let's consider how `optgen` behaves instead.
It's a bit simpler to understand. The `-out` parameter specifies where the output file is to be placed. The `ops` argument tells optgen that it needs to generate operator definitions. The `pkg/sql/opt/ops/*.opt` arguments tell optgen where to look to find all relevant "input" files (the opt rules).
Additional data / screenshots
N/A
Additional context
The `execgen` UX has worked well for us thus far, but is exceedingly difficult to work with now that we're trying bazel in earnest (#55687). Specifically, bazel wants to auto-generate code within its sandbox, where the template files are not necessarily placed at the same paths as they are in the crdb repo itself. Because the template paths are hard-coded, it takes a few hacks to get it working just right with Bazel. Compare the `optgen` bazel gen rule with the `execgen` one:
cockroach/pkg/sql/opt/BUILD.bazel, lines 67 to 76 in 9806c1c
cockroach/pkg/sql/colexec/COLEXEC.bzl, lines 1 to 43 in 9806c1c
Right now the bazel targets for `pkg/sql/colexec` don't work for running tests (they do for building the test binaries, confusingly), as the sandbox places the template files in a location other than what `execgen` has been taught to expect.
If the UX looked something similar to the following:
Where it would print to stdout the intended contents of `pkg/sql/colexec/colexecagg/hash_sum_agg.eg.go`, life would be much easier. Specifically, what I'm looking for is for the arguments to fully specify where exactly to look for all the "inputs", instead of it internally hardcoding those paths. It'd be more “unix”-ey that way, I think. Specifically for bazel, we’d then be able to declare the full set of template files as data dependencies, while still allowing bazel to place those files in any arbitrary path within the sandbox.
To add more words to my word soup above, another reason the status quo feels gross in bazel is that the template files in one package (`colexec`) can be used to generate `eg.go` files in another (`colconv`, for example). By default bazel wants to treat `colexec` as a dependency for `colconv`, but in our code it’s the other way around. For now we've settled for not generating the `eg.go` files in `colconv`, but we'll want to remove this stopgap soon.
(Aside: that’s yet another thing that's slightly better in optgen. The rules for each package are wholly contained within that package itself or in children packages.)
cc-ing @jordanlewis for triage/routing.