Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

dev: test using datadriven #76189

Merged
merged 1 commit into from
Feb 9, 2022

Conversation

irfansharif
Copy link
Contributor

@irfansharif irfansharif commented Feb 7, 2022

This commit groups a few changes to speed up dev and make it easier to
test.

  1. Update dev bench to run benchmarks using bazel test directly
    instead of first grepping for benchmarks, querying test targets, and
    only then invoking test binaries using bazel run. The latter
    approach was both slower and more difficult to test. dev bench now
    looks identical to the dev test implementation.

  2. Simplify dev test; now that protobuf targets are directly
    buildable, we don't need any manual querying of build targets and
    filtering for only go_test ones -- we can more easy bazel test
    whatever was top-level package was specified. This reduces how much
    I/O needs to happen with the host environment which both speeds
    things up and makes it easier to test.
    a. It also allows us to partially revert #74808 while still retaining
    informative error messages about missing targets (bazel's ones
    out-of-the-box are pretty good).

  3. Finally, introduce TestDataDriven and TestRecorderDriven.
    a. TestDataDriven makes use of datadriven to capture all operations
    executed by individual dev invocations. The testcases are defined
    under testdata/datadriven/*. DataDriven divvies up these files as
    subtests, so individual "files" are runnable through:

       dev test pkg/cmd/dev -f TestDataDrivenDriven/<fname> [--rewrite]
       OR  go test ./pkg/cmd/dev -run TestDataDrivenDriven/<fname> [-rewrite]
    

    b. TestRecorderDriven makes use of datadriven/recorder to record (if
    --rewrite is specified) or play back (if --rewrite is omitted) all
    operations executed by individual dev invocations. The testcases
    are defined under testdata/recorderdriven/*; each test file
    corresponds to a captured recording found in
    testdata/recorderdriven/*.rec.

       dev test pkg/cmd/dev -f TestRecorderDriven/<fname>
       OR  go test ./pkg/cmd/dev -run TestRecorderDriven/<fname> [-rewrite]
    

    Recordings are used to mock out "system" behavior. When --rewrite
    is specified, attempts to shell out to bazel or perform other OS
    operations (like creating, removing, symlinking filepaths) are
    intercepted and system responses are recorded for future playback.

It's worth comparing TestDataDriven and TestRecorderDriven.
a. TestDataDriven is well suited for exercising flows that don't depend
on reading external state in order to function (simply translating a
dev test <target> to its corresponding bazel invocation for e.g.)
All operations are run in "dry-run" mode when --rewrite is specified;
all exec/os commands return successfully with no error + an empty
response, and just the commands are recorded as test data.
b. TestRecorderDriven in contrast works better for flows that do depend
on external state during execution (like reading the set of targets
from a file for e.g., or hoisting files from a sandbox by searching
through the file system directly). With these, when --rewrite is
specified, we actually do shell out and record the data for future
playback.

We previously (#68514) ripped out the equivalent of TestRecorderDriven
because it was difficult to use (large recordings, execution errors
failing the test, etc.) -- issues this PR tries to address. There are
some real limitations here though that may still make this approach
untenable: When --rewrite is used for TestRecorderDriven, because it
shells out, will:

  • takes time proportional to the actual dev invocations;
  • makes it difficult to run under bazel (through without --rewrite it
    works just fine);

The few remaining tests for this recorder stuff are probably examples of
dev doing too much, when it should instead push the logic down into
bazel rules. As we continue doing so, we should re-evaluate whether this
harness provides much value.

Release note: None

@cockroach-teamcity
Copy link
Member

This change is Reviewable

@irfansharif irfansharif force-pushed the 220203.recording-dev branch 4 times, most recently from 7132827 to b4d2ec7 Compare February 7, 2022 23:51
@irfansharif irfansharif marked this pull request as ready for review February 7, 2022 23:51
@irfansharif irfansharif requested a review from a team as a code owner February 7, 2022 23:51
@dt
Copy link
Member

dt commented Feb 8, 2022

Brief skim so far: I'm a big fan of the first two changes mentioned in the commit message. Would it be difficult to separate them at the commit level from the third?

Like you, I'm skeptical that TestRecorderDriven is needed. Like you say, it almost certainly is pointing to the fact dev is doing too much work, if what it does starts to depend on reading state and reacting to it, not just a pure function of it arguments into arguments to a bazel command, and the logic it uses to do it becomes so complex we aren't comfortable leaving it uncovered by tests. You note that we've previously found a test like TestRecorderDriven added ongoing maintenance burden, so I'd argue we're better off putting that time to gettin that stuff out of dev entirely instead (as your first two changes here are doing)?

@irfansharif irfansharif force-pushed the 220203.recording-dev branch 2 times, most recently from e804af5 to 30a0874 Compare February 8, 2022 13:08
@irfansharif
Copy link
Contributor Author

irfansharif commented Feb 8, 2022

Would it be difficult to separate them at the commit level from the third?

Not difficult at all, will do it after more consensus on this PR's direction.

not just a pure function of it arguments into arguments to a bazel command, and the logic it uses to do it becomes so complex we aren't comfortable leaving it uncovered by tests.

Yes, but that said, I think we'll always have some limited from where things aren't a pure function and/or need to interact with external state (though it should scrutinized whenever adding new instances of this); dev {doctor,cache} for e.g. As for cases where we're doing more than dev should "above" the build system, I still think some test coverage is better than what we have today (no coverage and with hand-maintained mocks to make CI happy). If you look at the test cases, most have been converted over to use the "pure" form, and it wasn't much work to keep around the impure form.


I encourage reviewers to pull down the branch, edit the testdata files and see how it feels. It's all very snappy now -- TestDataDriven with and without --rewrite will run in <1s. TestRecorderDriven too in <10s under --rewrite, <1s without.

Copy link
Collaborator

@rickystewart rickystewart left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I checked out the code and played around with it. I don't understand everything here 100%, but the changes all seem sensible to me and I agree this should be more maintainable.

Can you bump the DEV_VERSION please?

pkg/cmd/dev/recorderdriven_test.go Show resolved Hide resolved
@rickystewart
Copy link
Collaborator

Can you bump the DEV_VERSION please?

FYI I'm bumping it in #76245, so just keep that in mind so there's no conflict.

This commit groups a few changes to speed up dev and make it easier to
test.

1. Update `dev bench` to run benchmarks using `bazel test` directly
   instead of first grepping for benchmarks, querying test targets, and
   only then invoking test binaries using `bazel run`. The latter
   approach was both slower and more difficult to test. `dev bench` now
   looks identical to the `dev test` implementation.

2. Simplify `dev test`; now that protobuf targets are directly
   buildable, we don't need any manual querying of build targets and
   filtering for only go_test ones -- we can more easy `bazel test`
   whatever was top-level package was specified. This reduces how much
   I/O needs to happen with the host environment which both speeds
   things up and makes it easier to test.
   a. It also allows us to partially revert cockroachdb#74808 while still retaining
      informative error messages about missing targets (bazel's ones
      out-of-the-box are pretty good).

3. Finally, introduce TestDataDriven and TestRecorderDriven.
   a. TestDataDriven makes use of datadriven to capture all operations
      executed by individual dev invocations. The testcases are defined
      under testdata/datadriven/*. DataDriven divvies up these files as
      subtests, so individual "files" are runnable through:

          dev test pkg/cmd/dev -f TestDataDrivenDriven/<fname> [--rewrite]
      OR  go test ./pkg/cmd/dev -run TestDataDrivenDriven/<fname> [-rewrite]

   b. TestRecorderDriven makes use of datadriven/recorder to record (if
      --rewrite is specified) or play back (if --rewrite is omitted) all
      operations executed by individual dev invocations. The testcases
      are defined under testdata/recorderdriven/*; each test file
      corresponds to a captured recording found in
      testdata/recorderdriven/*.rec.

          dev test pkg/cmd/dev -f TestRecorderDriven/<fname>
      OR  go test ./pkg/cmd/dev -run TestRecorderDriven/<fname> [-rewrite]

      Recordings are used to mock out "system" behavior. When --rewrite
      is specified, attempts to shell out to bazel or perform other OS
      operations (like creating, removing, symlinking filepaths) are
      intercepted and system responses are recorded for future playback.

---

It's worth comparing TestDataDriven and TestRecorderDriven.

1. TestDataDriven is well suited for exercising flows that don't depend
   on reading external state in order to function (simply translating a
   `dev test <target>` to its corresponding bazel invocation for e.g.)
   All operations are run in "dry-run" mode when --rewrite is specified;
   all exec/os commands return successfully with no error + an empty
   response, and just the commands are recorded as test data.

2. TestRecorderDriven in contrast works better for flows that do depend
   on external state during execution (like reading the set of targets
   from a file for e.g., or hoisting files from a sandbox by searching
   through the file system directly). With these, when --rewrite is
   specified, we actually do shell out  and record the data for future
   playback.

We previously (cockroachdb#68514) ripped out the equivalent of TestRecorderDriven
because it was difficult to use (large recordings, execution errors
failing the test, etc.) -- issues this PR tries to address. There are
some real limitations here though that may still make this approach
untenable: When --rewrite is used for TestRecorderDriven, because it
shells out, will:
- takes time proportional to the actual dev invocations;
- makes it difficult to run under bazel (through without --rewrite it
  works just fine);

The few remaining tests for this recorder stuff are probably examples of
dev doing too much, when it should instead push the logic down into
bazel rules. As we continue doing so, we should re-evaluate whether this
harness provides much value.

Release note: None
@irfansharif
Copy link
Contributor Author

bors r+

@craig
Copy link
Contributor

craig bot commented Feb 9, 2022

Build succeeded:

@craig craig bot merged commit 456ff4c into cockroachdb:master Feb 9, 2022
@irfansharif irfansharif deleted the 220203.recording-dev branch February 9, 2022 11:26
irfansharif added a commit to irfansharif/cockroach that referenced this pull request Feb 9, 2022
Squash a bug I introduced in cockroachdb#76189; we were assuming a minimum slice
length when we shouldn't have.

Release note: None
craig bot pushed a commit that referenced this pull request Feb 9, 2022
76291: dev: fix slice bounds panic r=irfansharif a=irfansharif

Squash a bug I introduced in #76189; we were assuming a minimum slice
length when we shouldn't have.

Release note: None

Co-authored-by: irfan sharif <[email protected]>
irfansharif added a commit to irfansharif/cockroach that referenced this pull request Feb 11, 2022
Slipped in as part of cockroachdb#76189. Before this patch we we unintentionally
running all unit tests in whatever package was targeted. To repro, try:

    dev bench pkg/server -f='BenchmarkSetupSpanForIncomingRPC'

Release note: None
irfansharif added a commit to irfansharif/cockroach that referenced this pull request Feb 11, 2022
Slipped in as part of cockroachdb#76189. Before this patch we were unintentionally
running all unit tests in whatever package was targeted. To repro, try:

    dev bench pkg/server -f='BenchmarkSetupSpanForIncomingRPC'

Release note: None
RajivTS pushed a commit to RajivTS/cockroach that referenced this pull request Mar 6, 2022
Squash a bug I introduced in cockroachdb#76189; we were assuming a minimum slice
length when we shouldn't have.

Release note: None
RajivTS pushed a commit to RajivTS/cockroach that referenced this pull request Mar 6, 2022
Slipped in as part of cockroachdb#76189. Before this patch we were unintentionally
running all unit tests in whatever package was targeted. To repro, try:

    dev bench pkg/server -f='BenchmarkSetupSpanForIncomingRPC'

Release note: None
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants