Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

x/tools/go/analysis: identify test-only packages #65749

Open
lfolger opened this issue Feb 16, 2024 · 31 comments
Open

x/tools/go/analysis: identify test-only packages #65749

lfolger opened this issue Feb 16, 2024 · 31 comments
Labels
Analysis Issues related to static analysis (vet, x/tools/go/analysis) Proposal Proposal-Accepted Tools This label describes issues relating to any tools in the x/tools repository.
Milestone

Comments

@lfolger
Copy link
Contributor

lfolger commented Feb 16, 2024

Proposal Details

Some analyzers need to customize their behavior based on the context in which the code is running, e.g. to restrict certain (inefficient) APIs to only be used in tests.

At the moment analyzers have no way to determine if the file they are analyzing is only used in tests. One way of achieving this today is to check if the file ends in _test.go. This works for the most part. However it fails for libraries that are not tests themselves but can only be used in tests. Some build systems can enforce this. See bazels testonly attribute.

One way to support this would be similar to the version support based on go/types.Info.FileVersions which is a map from *ast.File to the version string. A map in the analysis.Pass from *ast.File to a bollean that says if the file is only used in tests or not would solve this problem.

Note: it is not a property of the package to be analyzed but the individual files because when analyzing a test package it consists of test files and the actual package source files.

@gopherbot gopherbot added this to the Proposal milestone Feb 16, 2024
@lfolger
Copy link
Contributor Author

lfolger commented Feb 16, 2024

cc: @adonovan @timothy-king @findleyr

@seankhliao seankhliao added Tools This label describes issues relating to any tools in the x/tools repository. Analysis Issues related to static analysis (vet, x/tools/go/analysis) labels Feb 16, 2024
@adonovan
Copy link
Member

This feature would first need to be added to x/tools/go/packages, which the multichecker driver uses for package information. So let's expand this proposal to consider the API changes to both packages at once.

@adonovan adonovan changed the title proposal: x/tools/go/analysis: allow Analyzers to customize their behavior if a file is a test file proposal: x/tools/go/{analysis,packages}: allow Analyzers to customize their behavior if a file is a test file Feb 16, 2024
@ianlancetaylor ianlancetaylor moved this to Incoming in Proposals Feb 16, 2024
@timothy-king
Copy link
Contributor

timothy-king commented Feb 20, 2024

(Reopening. Fat fingers closed this.)

@timothy-king timothy-king reopened this Feb 20, 2024
@timothy-king
Copy link
Contributor

3 candidate implementations:

  1. Add a field Test map[*ast.File]bool to analysis.Pass and packages.Package (original proposal)
  2. Add a field Test func(*ast.File)bool to analysis.Pass and packages.Package
  3. Add a field Tests []*ast.File to analysis.Pass and packages.Package

Thoughts:

  • 1 and 2 both strike me as user friendly for analyzer writers.
  • I have a sneaking suspicion that packages will be slightly more natural to express with 3. Packages can be given a list of the testing files and forwards it along.
  • 3 might be slightly inefficient if analyzers do O(|#test files||#files||#analyzers|*|#packages|) to check these, but this is easily to fix with a utility Analyzer if this becomes a problem.
  • Options 2 strikes me as not a natural fit for packages.
  • I do not see a particularly compelling reason to work over strings instead of *ast.File (packages has a few places strings appear though).

@adonovan
Copy link
Member

A utility analyzer is overkill; we should just get the API right. The client needs to know "is this file a test file?" or, more often, "is this node or position within a test file?", even when they don't have the File to hand. The API should support those queries directly.

I suspect the frequency of queries will not be very great: typically once per file (if deciding to skip some files) or once per finding (if deciding whether to suppress a diagnostic). So this suggests a representation like option #3, with a helper function such as:

// InTestFile reports whether the position is within
// (including EOF) a file that is only used during testing.
func (pass *Pass) InTestFile(pos token.Pos) bool {
    for _, f := range pass.TestFiles {
        if f.FileStart <= pos && pos <= f.FileEnd { return true }
    }
    return false
}

Clients can build a hash table if they need more frequent access.

@lfolger
Copy link
Contributor Author

lfolger commented Feb 21, 2024

I might be missing something but is there a reason this needs to be part of the analysis.Pass at all?
Is this to support files that are not part of the package under active analysis?
Otherwise it should be enough to have this information in the types.Package or types.Info which is always available for the package under analysis (even for packages that fail type checking it should be possible to fill in this data).

Or going the other direction. Does it need to be part of the types.Package or types.Info at all? Wouldn't it be sufficient to be part of the analysis.Pass only? (It might be hard for analysis drivers to get at this information).

I think what I'm trying to say is: is there a reason to duplicate this information?

@adonovan
Copy link
Member

I think what I'm trying to say is: is there a reason to duplicate this information?

Anything "input" data exposed by analysis.Pass interface needs to be derivable from packages.Package, since the design of multichecker driver relies on it. So if the Pass needs access to a "test-only file?" predicate, this information must be furnished by packages.Package (and also by go vet, but that's easier because the vet-unitchecker interface isn't a public API).

If types.Package provided this information that would be solution, but all existing creators of types.Package---that is, all clients of types.Checker, plus cmd/compiler + gcimporter, do not provide it.

@adonovan
Copy link
Member

func (pass *Pass) InTestFile(pos token.Pos) bool

The interface I proposed above assumes that files are parsed (which is the common use-case for go/packages, but isn't actually required), so this operation would be dependent on NeedSyntax. One alternative is to provide the list of names of test-only files, but it seems less useful.

@findleyr
Copy link
Contributor

findleyr commented Feb 21, 2024

A utility analyzer is overkill; we should just get the API right. The client needs to know "is this file a test file?" or, more often, "is this node or position within a test file?", even when they don't have the File to hand. The API should support those queries directly.

We went through a very similar line of reasoning when discussing file version information (https://go.dev/issue/62605). I still feel that it is philosophically incorrect to use token.Pos to look up information about the file, because the syntax tree is abstract and any properties of the file are properties of the File node. I think map[*ast.File]bool is correct.

EDIT: on second thought, I'm not sure. Unlike go/types, for which the input may realistically be a mutated AST, go/packages guarantees an AST produced by go/parser, for which a positional lookup is sound. Do we want to establish a requirement that analyzers operate only on syntax with full and accurate positions?

@lfolger
Copy link
Contributor Author

lfolger commented Feb 21, 2024

If types.Package provided this information that would be solution, but all existing creators of types.Package---that is, all clients of types.Checker, plus cmd/compiler + gcimporter, do not provide it.

Maybe I got confused by @timothy-king comment above but are you now suggesting a solution that does not involve adding this information to go/types?
Because if it is added to go/types (e.g. types.Info), that would be sufficient, wouldn't it? It is file the FileVersions so it should be sufficient for the "isTestFile" as well.

@adonovan
Copy link
Member

adonovan commented Feb 21, 2024

Maybe I got confused by @timothy-king comment above but are you now suggesting a solution that does not involve adding this information to go/types?

I wasn't suggesting that go/types be involved. Unlike FileVersions, which is critical information for the type checking algorithm, the "test only" predicate is not relevant to type checking and is purely a property of the build system. So I think this information belongs in go/packages, and that's what I interpreted Tim and Rob to be implying too.

@lfolger
Copy link
Contributor Author

lfolger commented Feb 21, 2024

Sorry I misread @timothy-king comment. It refers to packages.Package. I somehow read this as types.Package. Sorry for the confusion.

I think a analysis only solution is appropriate here.

@adonovan
Copy link
Member

I think a analysis only solution is appropriate here.

See my note in #65749 (comment): it has to be in both go/analysis and go/packages.

@timothy-king
Copy link
Contributor

Do we want to establish a requirement that analyzers operate only on syntax with full and accurate positions?

I am not really sure. Might need its own proposal for the other drivers for analysis.Analyzer out there?

But I feel like we do not need to resolve this now in order to know which files are tests. If we have a field pass.TestFiles, we can always add astutil.Within(pos token.Pos, files []*ast.Files) bool (what Alan wrote just not a method on Pass) without an API commitment. My point is once the field is there and filled in, we can figure out an ergonomic API.

@findleyr
Copy link
Contributor

But I feel like we do not need to resolve this now in order to know which files are tests.

Agreed. I prefer not resolving this now, by treating the position lookup as a separate concern. I'd be fine with e.g. packages.Package.TestFiles []string and analysis.Pass.TestFiles map[*ast.File]bool.

Or should we have analysis.Pass.TestFiles []string, completely passing through the equivalent field on packages.Package? The analysis.Pass has OtherFiles and IgnoredFiles, both of which are []string. Why should it be that test files only exist among the set of compiled go files? Presumably there could be a test file that is not compiled.

@adonovan
Copy link
Member

I'd be happy with lists of strings (filenames), parallel to TestFiles and OtherFiles. That avoids the tricky interaction with the NeedSyntax mode bit.

@findleyr
Copy link
Contributor

Ok, it sounds like we're honing in on:

For go/packages:

package packages

type Package {
  // TestFiles lists files that are only used for test code, e.g. because they contain tests
  // (as in a *_test.go file), or are otherwise marked by the build system as being restricted
  // to testing.
  TestFiles []string
  
  // ...
}

For go/analysis:

package analysis

type Pass {
  TestFiles []string // names of test-only files in this package

  // ...
}

Does that look right?

@rsc
Copy link
Contributor

rsc commented Mar 1, 2024

This doesn't look right to me. TestFiles has a well-understood meaning already: *_test.go files. If a build system has the concept of a package that can only be imported by tests, that doesn't make the files "TestFiles", since in general code in TestFiles is not importable at all. It's also weird that Pass.Files is a []*ast.File while Pass.TestFiles is a []string. What kind of string is it? Is it just the basename (like it would be in go/packages) or is it the fully qualified path that you might get from the FileSet?

If we have to add this foreign concept for use with Blaze/Bazel, it seems cleaner to add it directly than to disguise it in a way that ends up being awkward to explain in practice. Specifically, let's add Package.TestOnly bool and Pass.TestOnly bool

TestOnly bool // package is marked as "test-only" in the build system (not possible using go command)

@rsc rsc changed the title proposal: x/tools/go/{analysis,packages}: allow Analyzers to customize their behavior if a file is a test file proposal: x/tools/go/{analysis,packages}: identify test-only packages Mar 1, 2024
@rsc rsc changed the title proposal: x/tools/go/{analysis,packages}: identify test-only packages proposal: x/tools/go/analysis: identify test-only packages Mar 1, 2024
@rsc
Copy link
Contributor

rsc commented Mar 1, 2024

This proposal has been added to the active column of the proposals project
and will now be reviewed at the weekly proposal review meetings.
— rsc for the proposal review group

@rsc rsc moved this from Incoming to Active in Proposals Mar 1, 2024
@timothy-king
Copy link
Contributor

_test.go files in non-test packages are sufficiently similar to bazel testonly packages that I think most analysis would want to treat them the same. Each analyzer inferring 'is this file a test?' by combining TestOnly, the *ast.File, and the FileSet seems a bit less natural than the build system telling the analysis which subset of files are tests.

@rsc
Copy link
Contributor

rsc commented Mar 6, 2024

I disagree. Putting them in TestFiles makes it sound like they are _test.go files, and they aren't.
I think you are overfitting to this one specific use case.
Better to expose the actual concept than the bit you want about that concept today.

@adonovan
Copy link
Member

@timothy-king & @lfolger, does the TestOnly bool API address your needs?

@lfolger
Copy link
Contributor Author

lfolger commented Apr 11, 2024

It is not entirely clear to me if TestOnly would be set to true for tests themselves. If so, it wouldn't work because tests might consist of test and non-test files. If such a package would be marked TestOnly, we would consider all of the embedded library sources as TestOnly.

If TestOnly is only set for blaze/bazel go_library targets with testonly attributte but not for go_test targets, it would work for us.

@timothy-king
Copy link
Contributor

I don't have any immediate plans to have checkers that use this. I'll defer to @lfolger for whether it meets his needs.

@rsc
Copy link
Contributor

rsc commented May 23, 2024

Based on @lfolger's recent message, it sounds like #65749 (comment) works for everyone. This would mean adding a TestOnly bool to x/tools/go/packages, but the cmd/go integration would never emit it. Only Bazel/Blaze wrappers would emit it. Do I have that right?

@znkr
Copy link
Contributor

znkr commented May 25, 2024

A test only bit like the one described is interesting, but it's slightly worrying that it would only be used by Bazel and Blaze (Blazel?) builds. OTOH, I am wondering what we could do if we were able to pass more build system information through to analyzers. For example, dependency information is very different between Blazel and Go and understanding the dependency information on the build system level would allow us to replace some heuristics with the source of truth data.

What if, instead of handling the case of test only information, we had an extension mechanism that could transport this and other information? What other problems could we fix with that?

@rsc
Copy link
Contributor

rsc commented May 29, 2024

It stops being an API if arbitrary data can be passed between Blaze and analyzers. Let's add what we need and document what it means. TestOnly seems like a reasonable step on that path.

@rsc
Copy link
Contributor

rsc commented Jun 5, 2024

Have all remaining concerns about this proposal been addressed?

The proposal is to add

TestOnly bool // package is marked as "test-only" in the build system (not possible using go command)

to both golang.org/x/tools/go/packages.Package and golang.org/x/tools/go/analysis.Pass.

This is a concept in Blaze, Bazel, and other build systems but not the go command.

@lfolger
Copy link
Contributor Author

lfolger commented Jun 10, 2024

This would work for our use case.

@rsc
Copy link
Contributor

rsc commented Jun 12, 2024

Based on the discussion above, this proposal seems like a likely accept.
— rsc for the proposal review group

The proposal is to add

TestOnly bool // package is marked as "test-only" in the build system (not possible using go command)

to both golang.org/x/tools/go/packages.Package and golang.org/x/tools/go/analysis.Pass.

This is a concept in Blaze, Bazel, and other build systems but not the go command.

@rsc rsc moved this from Active to Likely Accept in Proposals Jun 12, 2024
@rsc
Copy link
Contributor

rsc commented Jun 24, 2024

No change in consensus, so accepted. 🎉
This issue now tracks the work of implementing the proposal.
— rsc for the proposal review group

The proposal is to add

TestOnly bool // package is marked as "test-only" in the build system (not possible using go command)

to both golang.org/x/tools/go/packages.Package and golang.org/x/tools/go/analysis.Pass.

This is a concept in Blaze, Bazel, and other build systems but not the go command.

@rsc rsc moved this from Likely Accept to Accepted in Proposals Jun 24, 2024
@rsc rsc changed the title proposal: x/tools/go/analysis: identify test-only packages x/tools/go/analysis: identify test-only packages Jun 24, 2024
@rsc rsc modified the milestones: Proposal, Backlog Jun 24, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Analysis Issues related to static analysis (vet, x/tools/go/analysis) Proposal Proposal-Accepted Tools This label describes issues relating to any tools in the x/tools repository.
Projects
Status: Accepted
Development

No branches or pull requests

8 participants