Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[FR] .bazelignore should support wildcard prefix #7093

Closed
or-shachar opened this issue Jan 11, 2019 · 38 comments
Closed

[FR] .bazelignore should support wildcard prefix #7093

or-shachar opened this issue Jan 11, 2019 · 38 comments
Labels
P3 We're not considering working on this, but happy to review a PR. (No assignee) team-ExternalDeps External dependency handling, remote repositiories, WORKSPACE file. type: feature request

Comments

@or-shachar
Copy link
Contributor

Description of the feature request:

Currently .bazelignore supports only full relative path prefix without any wildcards

I have many node_modules subfolders in my repo that I want to ignore.
With .gitignore I'd just need to put: node_modules and it will be ignored by git.
With .bazelignore I need to specify each folder by it's full relative path from workspace root.

I suggest to adjust behavior to the way .gitignore and save our users the trouble of collecting all folders.

What operating system are you running Bazel on?

We tested it on linux with bazel 0.20.0 and 0.21.0

@iirina iirina added untriaged team-Bazel General Bazel product/strategy issues labels Jan 14, 2019
@ittaiz
Copy link
Member

ittaiz commented Jan 29, 2019

@iirina @dslomov when can we expect this to be triaged?

@dslomov dslomov added team-ExternalDeps External dependency handling, remote repositiories, WORKSPACE file. and removed team-Bazel General Bazel product/strategy issues labels Jul 19, 2019
@laurentlb laurentlb added P3 We're not considering working on this, but happy to review a PR. (No assignee) type: feature request and removed untriaged labels Jul 26, 2019
@jupp0r
Copy link

jupp0r commented Jan 17, 2020

Is somebody working on this issue? If not, I'd like to start working on a patch.

@gregmagolan
Copy link
Contributor

gregmagolan commented Feb 13, 2020

+1 for the Angular repo. .bazelrc could be a few lines with wildcards instead of

node_modules
dist
aio/content
aio/node_modules
aio/tools/examples/shared/node_modules
integration/bazel/node_modules
integration/bazel/.yarn_local_cache
integration/bazel-schematics/node_modules
integration/bazel-schematics/.yarn_local_cache
integration/bazel-schematics/demo
integration/cli-hello-world-ivy-compat/node_modules
integration/cli-hello-world-ivy-compat/node_modules
integration/cli-hello-world-ivy-i18n/node_modules
integration/cli-hello-world-ivy-minimal/node_modules
integration/cli-hello-world-lazy/node_modules
integration/cli-hello-world-lazy-rollup/node_modules
integration/dynamic-compiler/node_modules
integration/hello_world__closure/node_modules
integration/hello_world__systemjs_umd/node_modules
integration/i18n/node_modules
integration/injectable-def/node_modules
integration/ivy-i18n/node_modules
integration/language_service_plugin/node_modules
integration/ng_elements/node_modules
integration/ng_update/node_modules
integration/ng_update_migrations/node_modules
integration/ngcc/node_modules
integration/platform-server/node_modules
integration/service-worker-schema/node_modules
integration/side-effects/node_modules
integration/terser/node_modules
integration/typings_test_ts36/node_modules
integration/cli-hello-world/.yarn_local_cache
integration/cli-hello-world-ivy-compat/.yarn_local_cache
integration/cli-hello-world-ivy-i18n/.yarn_local_cache
integration/cli-hello-world-ivy-minimal/.yarn_local_cache
integration/cli-hello-world-lazy/.yarn_local_cache
integration/cli-hello-world-lazy-rollup/.yarn_local_cache
integration/dynamic-compiler/.yarn_local_cache
integration/hello_world__closure/.yarn_local_cache
integration/hello_world__systemjs_umd/.yarn_local_cache
integration/i18n/.yarn_local_cache
integration/injectable-def/.yarn_local_cache
integration/ivy-i18n/.yarn_local_cache
integration/language_service_plugin/.yarn_local_cache
integration/ng_elements/.yarn_local_cache
integration/ng_update/.yarn_local_cache
integration/ng_update_migrations/.yarn_local_cache
integration/ngcc/.yarn_local_cache
integration/platform-server/.yarn_local_cache
integration/service-worker-schema/.yarn_local_cache
integration/side-effects/.yarn_local_cache
integration/terser/.yarn_local_cache
integration/typings_test_ts36/.yarn_local_cache
packages/bazel/node_modules

@cgruber
Copy link
Contributor

cgruber commented Feb 28, 2020

+1 for all the rules_* repos, which typically have a variety of sub-repos to handle testing the rules.

@philwo philwo added the team-OSS Issues for the Bazel OSS team: installation, release processBazel packaging, website label Jun 15, 2020
patrickt added a commit to github/semantic that referenced this issue Jul 13, 2020
@tclem ran into an issue where he had built `semantic-source` from the
`semantic-source` directory; `cabal` placed a `dist-newstyle`
directory there, which was fouling up Bazel due to the fact that
`cabal` generates binary files called `BUILD`, and Bazel tries to
interpret these files as package configurations. Not good!

We can't use a glob here because .bazelignore files don't support
globbing (bazelbuild/bazel#7093), so it
suffices to ignore all the projects' `dist-newstyle` directories so
that no one runs into this again.
robrix pushed a commit to github/semantic that referenced this issue Jul 16, 2020
@tclem ran into an issue where he had built `semantic-source` from the
`semantic-source` directory; `cabal` placed a `dist-newstyle`
directory there, which was fouling up Bazel due to the fact that
`cabal` generates binary files called `BUILD`, and Bazel tries to
interpret these files as package configurations. Not good!

We can't use a glob here because .bazelignore files don't support
globbing (bazelbuild/bazel#7093), so it
suffices to ignore all the projects' `dist-newstyle` directories so
that no one runs into this again.
@David-Lam
Copy link

David-Lam commented Jul 27, 2021

Hi everyone! I have create a PR #13756 that add wildcard to .bazelignore. Feel free to review it. Thanks! @cgruber @ittaiz

@nkoroste
Copy link
Contributor

also related #13807

@philwo philwo removed the team-OSS Issues for the Bazel OSS team: installation, release processBazel packaging, website label Nov 29, 2021
@guw
Copy link
Contributor

guw commented Jan 11, 2022

Is this a dup of #8106?

hanwen pushed a commit to GerritCodeReview/gerrit that referenced this issue Jun 7, 2022
Bazel removed `managed_directories` feature on Bazel@HEAD, see: [1].
Design document is here: [2].

Also extend the .bazelignore and add explicitly the path to the
node_modules directories. Note, that Bazel currently doesn't support
the same semantic as .gitignore. For more details see this issue: [3]
and this issue specific to rules_nodejs: [4].

[1] bazelbuild/bazel#15463
[2] https://docs.google.com/document/d/1u9V5RUc7i6Urh8gGfnSurxpWA7JMRtwCi1Pr5BHeE44/edit
[3] bazelbuild/bazel#7093
[4] bazelbuild/bazel#8106

Release-Notes: skip
Change-Id: I5dc582e05e4116064fc06d438d4b8a8b57b6bb8d
@Toxaris
Copy link

Toxaris commented Jan 31, 2023

Experience report:

Like others, we have a monorepo where developers run bazel build //... but we also install NodeJS packages into node_modules folders in the source tree (to make the IDE happy). We would prefer to put **/node_modules into the .bazelignore, but since that is not supported, we have a script that generates the .bazelignore from the repo structure, and a check in our CI that the .bazelignore has been generated by the script.

Even with these precautions, we run into issues after the repo structure has changed upstream and a developer does git checkout or git pull on their machine. Developers will get the new .bazelignore but still have the old node_modules folders in their source tree. Since the node_module folders contain symlinks to internal dependencies, bazel will expand the dependency graph into a tree and find an exponential number of copies of our internal packages. It will either crash because of memory issues, or try to make sense of these weird packages and run into visibility issues.

I acknowledge that the root cause here is really that we put non-source files into the source tree, but we are not aware of a practical way to avoid that (short of building our own IDE plugins). Wildcards in the .bazelignore file would give us a much nicer development experience in practice.

dgp1130 added a commit to dgp1130/rules_prerender that referenced this issue Mar 8, 2023
This creates a new `pnpm-workspace.yaml` which makes it install `node_modules/` in all the `packages/*` directories. This wasn't necessary before because `rules_prerender` and `@rules_prerender/declarative_shadow_dom` don't have any external dependencies, but it is necessary to install package-specific dependencies.

Also updates `tsconfig.json` to work for any `@rules_prerender/*` packages instead of just `@rules_prerender/declarative_shadow_dom`.

`.bazelignore` needs to be updated as well, but doesn't support globs unfortunately (see bazelbuild/bazel#7093). `node_modules/` verification was also incorrectly being done on the user `.bazelignore` file, rather than `@rules_prerender`'s `.bazelignore` when loaded as an external workspace. Used a `Label()` constructor to fix that.
dgp1130 added a commit to dgp1130/rules_prerender that referenced this issue Mar 9, 2023
This creates a new `pnpm-workspace.yaml` which makes it install `node_modules/` in all the `packages/*` directories. This wasn't necessary before because `rules_prerender` and `@rules_prerender/declarative_shadow_dom` don't have any external dependencies, but it is necessary to install package-specific dependencies.

Also updates `tsconfig.json` to work for any `@rules_prerender/*` packages instead of just `@rules_prerender/declarative_shadow_dom`.

`.bazelignore` needs to be updated as well, but doesn't support globs unfortunately (see bazelbuild/bazel#7093). `node_modules/` verification was also incorrectly being done on the user `.bazelignore` file, rather than `@rules_prerender`'s `.bazelignore` when loaded as an external workspace. Used a `Label()` constructor to fix that.
dgp1130 added a commit to dgp1130/rules_prerender that referenced this issue Mar 9, 2023
This creates a new `pnpm-workspace.yaml` which makes it install `node_modules/` in all the `packages/*` directories. This wasn't necessary before because `rules_prerender` and `@rules_prerender/declarative_shadow_dom` don't have any external dependencies, but it is necessary to install package-specific dependencies.

Also updates `tsconfig.json` to work for any `@rules_prerender/*` packages instead of just `@rules_prerender/declarative_shadow_dom`.

`.bazelignore` needs to be updated as well, but doesn't support globs unfortunately (see bazelbuild/bazel#7093). `node_modules/` verification was also incorrectly being done on the user `.bazelignore` file, rather than `@rules_prerender`'s `.bazelignore` when loaded as an external workspace. Used a `Label()` constructor to fix that.
lberki added a commit that referenced this issue Oct 16, 2024
Progress towards #7093.

RELNOTES: None.
PiperOrigin-RevId: 686088272
Change-Id: I17ea2481ed58b76352ff1939b78ba93b8bdc9eac
lberki added a commit that referenced this issue Oct 16, 2024
This required a number of somewhat awkward .toRelative() calls, but the alternative would have been to make IgnoredSubdirectories either have all absolute paths or all relative paths (both prefixes and inputs) and this one seemed simpler.

Progress towards #7093.

RELNOTES: None.
PiperOrigin-RevId: 686411349
Change-Id: I62b0b7f12801e7b44b6e064893c1dc0cb282badb
@lberki
Copy link
Contributor

lberki commented Oct 17, 2024

I have a change out that almost works. The "almost" part is because apparently requring the evaluation of REPO.bazel to determine the ignored package prefixes when the WORKSPACE file is enabled sometimes results in a Skyframe dependency cycle (for example in testNoModuleDotBazelAndFallbackToWorkspace).

The dependency cycle is as follows:

  1. The repo mapping for the main repository depends on the //external package (this makes sense)
  2. The //external package depends on ExternalPackageFunction (this also makes sense)
  3. That one in turn depends on the WORKSPACE file, i.e. on WorkspaceFileValue (this makes sense, too)
  4. WorkspaceFileValue depends on a Starlark file (i.e. a BzlLoadValue) in a repository defined in the WORKSPACE file itself if it loads a symbol from it (this makes sense; loading makes the dependency edge necessary)
  5. Loading the Starlark file depends on a PackageLookupValue in that repository.
  6. The PackageLookupValue depends on IgnoredPackagePrefixesValue (this makes sense; it needs to so that it knows which packages exist)
  7. The IgnoredPackagePrefixesValue depends on the REPO.bzl of that repository (makes sense, this is the new edge I am adding)
  8. The REPO.bzl file depends on the repository mapping for the main repository (apparently, to support Label.debugPrint()?)

Thus, dependency cycle.

More simply: The main repository mapping depends on the WORKSPACE file, which in turn can depend on a Starlark file in an external repository, which then can depend on its REPO.bzl file if it exists, which then depends on the main repository mapping.

@meteorcloudy do you have any ideas how to fix this?

@fmeum
Copy link
Collaborator

fmeum commented Oct 17, 2024

You can cut the last edge without any loss in functionality. REPO.bazel can't load and you can't use Label in it, so it doesn't need access to the main repo mapping.

@lberki
Copy link
Contributor

lberki commented Oct 17, 2024

@fmeum You are technically correct, but REPO.bazel does deal in Labels as parameters of repo() (e.g. default_compatible_with= and the like). So yes, that dependency edge could in theory be cut today, but it's probably not wise.

@fmeum
Copy link
Collaborator

fmeum commented Oct 17, 2024

@lberki But doesn't that only require the repo mapping of the repo containing the REPO.bazel file? The fixed dependency on the main repo mapping could then be cut. In the cycle you mention, it's not the main repo's REPO.bazel file that causes the cycle.

@lberki
Copy link
Contributor

lberki commented Oct 17, 2024

I don't know. I assumed that the reason why the main repo mapping is passed in is so that the labels reported on the debug output are consistent, no matter which repository they are in. @meteorcloudy is that correct?

It looks like we do pass in null in some cases (notably, when MODULE.bazel is evaluated). However, that doesn't solve the whole problem because there is still this line of code which also requires the main repo mapping. It's also about user-friendly output and it looks like there is some code that handles null there, too, but I wouldn't replace that with null without understanding why the main repo mapping is needed there, which I currently don't.

Let me create a pull request so that we can better discuss the change...

@lberki
Copy link
Contributor

lberki commented Oct 17, 2024

#24022 it is.

@lberki
Copy link
Contributor

lberki commented Oct 17, 2024

(paging @Wyverald )

@lberki
Copy link
Contributor

lberki commented Oct 17, 2024

Looks promising! I didn't have time to make a proper pull request, but this change implementing the "split RepoFileFunction into two parts, one not requiring a repository mapping" approach:

https://bazel-review.googlesource.com/c/bazel/+/261670

seems to be looking good, although the tests didn't finish by the end of my workday:

https://buildkite.com/bazel/google-bazel-presubmit/builds/85367

lberki added a commit that referenced this issue Oct 18, 2024
This is accomplished by a new directive in REPO.bazel,
"ignore_directories()". It takes a single argument, a list of
directories to ignore and it allows the same wildcards as glob().

This is done separately from .bazelignore to provide a migration path
off of that weird single-purpose configuration file.

Implementing this requires splitting RepoFileFunction into two: a part
that parses the repository file and one that creates a PackageArgs
instance. This was necessary to avoid a Skyframe dependency cycle: when
a WORKSPACE file is present and it loads a .bzl file from a repository
with a REPO.bazel file, the repo mapping for the main repository depends
on the WORKSPACE file, which depends on the .bzl file, which depends on
the IgnoredPackagePrefixesValue of its repository, which then depends on
the repo mapping of the main repository and the one the .bzl file is in,
which then depend on the WORKSPACE file.

Fixes #7093.

RELNOTES[NEW]: REPO.bazel now allows another directive,
"ignore_directories()". It takes a list of directories to ignore just
like .bazelignore does, but with glob semantics.
@lberki
Copy link
Contributor

lberki commented Oct 18, 2024

#24032 implements this feature, makes all test cases pass and is principled enough to be submittable. Let the review begin!

lberki added a commit that referenced this issue Nov 4, 2024
This is accomplished by a new directive in REPO.bazel,
"ignore_directories()". It takes a single argument, a list of
directories to ignore and it allows the same wildcards as glob().

This is done separately from .bazelignore to provide a migration path
off of that weird single-purpose configuration file.

Implementing this requires splitting RepoFileFunction into two: a part
that parses the repository file and one that creates a PackageArgs
instance. This was necessary to avoid a Skyframe dependency cycle: when
a WORKSPACE file is present and it loads a .bzl file from a repository
with a REPO.bazel file, the repo mapping for the main repository depends
on the WORKSPACE file, which depends on the .bzl file, which depends on
the IgnoredPackagePrefixesValue of its repository, which then depends on
the repo mapping of the main repository and the one the .bzl file is in,
which then depend on the WORKSPACE file.

Fixes #7093.

RELNOTES[NEW]: REPO.bazel now allows another directive,
"ignore_directories()". It takes a list of directories to ignore just
like .bazelignore does, but with glob semantics.
@lberki
Copy link
Contributor

lberki commented Nov 5, 2024

This is now fixed at HEAD: you can use the ignore_directories() directive in REPO.bazel to achieve the same effect as .bazelignore, except supporting glob-style wildcards.

@meteorcloudy do you think I should backport this feature to Bazel 8.0?

@meteorcloudy
Copy link
Member

I think we should do it, this is a nice feature that probably needs some migration time for community, so let's get it in an official release as soon as possible.

@lberki
Copy link
Contributor

lberki commented Nov 5, 2024

Ack. @Wyverald opined that you'd think otherwise, but I assume it was not that he had concerns, it's just that he thought you had. How urgent is the backporting?

lberki added a commit that referenced this issue Nov 5, 2024
This is accomplished by a new directive in REPO.bazel, "ignore_directories()". It takes a single argument, a list of directories to ignore and it allows the same wildcards as glob().

This is done separately from .bazelignore to provide a migration path off of that weird single-purpose configuration file.

Implementing this requires splitting RepoFileFunction into two: a part that parses the repository file and one that creates a PackageArgs instance. This was necessary to avoid a Skyframe dependency cycle: when a WORKSPACE file is present and it loads a .bzl file from a repository with a REPO.bazel file, the repo mapping for the main repository depends on the WORKSPACE file, which depends on the .bzl file, which depends on the IgnoredPackagePrefixesValue of its repository, which then depends on the repo mapping of the main repository and the one the .bzl file is in, which then depend on the WORKSPACE file.

Fixes #7093.

RELNOTES[NEW]: REPO.bazel now allows another directive, "ignore_directories()". It takes a list of directories to ignore just like .bazelignore does, but with glob semantics.

Closes #24032.

PiperOrigin-RevId: 693227896
Change-Id: Ia3e02a2bfe9caf999fc641f75261b528b19c1d03
@lberki
Copy link
Contributor

lberki commented Nov 5, 2024

#24203 sent out to backport this to Bazel 8.0.0.

@meteorcloudy
Copy link
Member

How urgent is the backporting?

We still have about 2 weeks before the final release, but the sooner the better ;)

@lberki
Copy link
Contributor

lberki commented Nov 5, 2024

Already done, it was much easier than I expected.

github-merge-queue bot pushed a commit that referenced this issue Nov 5, 2024
This is accomplished by a new directive in REPO.bazel,
"ignore_directories()". It takes a single argument, a list of
directories to ignore and it allows the same wildcards as glob().

This is done separately from .bazelignore to provide a migration path
off of that weird single-purpose configuration file.

Implementing this requires splitting RepoFileFunction into two: a part
that parses the repository file and one that creates a PackageArgs
instance. This was necessary to avoid a Skyframe dependency cycle: when
a WORKSPACE file is present and it loads a .bzl file from a repository
with a REPO.bazel file, the repo mapping for the main repository depends
on the WORKSPACE file, which depends on the .bzl file, which depends on
the IgnoredPackagePrefixesValue of its repository, which then depends on
the repo mapping of the main repository and the one the .bzl file is in,
which then depend on the WORKSPACE file.

Fixes #7093.

RELNOTES[NEW]: REPO.bazel now allows another directive,
"ignore_directories()". It takes a list of directories to ignore just
like .bazelignore does, but with glob semantics.

Closes #24032.

PiperOrigin-RevId: 693227896
Change-Id: Ia3e02a2bfe9caf999fc641f75261b528b19c1d03
@navignaw
Copy link

navignaw commented Nov 5, 2024

Is there any documentation for this change? I was able to piece together the syntax from the unit tests in the PR, but would be great to surface this in docs somewhere for other users!

@Wyverald
Copy link
Member

Wyverald commented Nov 5, 2024

Thanks for the nudge -- I'll be looking at adding docs tomorrow!

copybara-service bot pushed a commit that referenced this issue Nov 6, 2024
Follow-up for #7093

PiperOrigin-RevId: 693795472
Change-Id: Icac2f5658d1579377a38979905214ead9da73dac
@Wyverald
Copy link
Member

Wyverald commented Nov 6, 2024

docs are live: https://bazel.build/rules/lib/globals/repo#ignore_directories

@dgoldstein0
Copy link

thanks for implementing this, very much looking forward to using it once my team upgrades to bazel 8!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
P3 We're not considering working on this, but happy to review a PR. (No assignee) team-ExternalDeps External dependency handling, remote repositiories, WORKSPACE file. type: feature request
Projects
None yet
Development

Successfully merging a pull request may close this issue.