feat(gcp): Ability to skip project ids in configuration #1350

Closed
TinLe opened this issue Jun 13, 2022 · 9 comments

@TinLe

TinLe commented Jun 13, 2022

Currently, config.hcl allows one to specify particular projects or folders, or all of them ('*'). The same goes for the list of resources to fetch: you can list the specific resources you want.

It would be nice to be able to do the reverse, i.e. a denylist.

For projects, we could give a list of one or more projects and folders to not fetch data from.

For resources, we could give a list of one or more resources to not fetch.

Is your feature request related to a problem? Please describe.

When you have many projects and resources, it can be painful to list all projects EXCEPT for the few denied ones that you do not want. The same applies to resources, when you have to list all resources except for the one or two that you do not want.

Describe the solution you'd like

An example using GCP provider config:

provider "gcp" {
  configuration {
    // Nice to be able to put denied folders here, so fetch data from all folders except these.
    // folder_ids_denylist = [ "organizations/<ORG_ID>", "folders/<FOLDER_ID>" ]

    // project_ids_denylist = [<CHANGE_THIS_TO_YOUR_PROJECT_ID>]

  }

  // list of resources to NOT fetch
  resources_denylist = [
    "resource_manager.projects",
  ]
}

Describe alternatives you've considered

Without denylist, the current solution is to list all projects, except for the denied ones that we do not want to fetch. This does not work for us as we have thousands of projects.

Same idea for resources. Very painful.

Additional context

It's a nice-to-have that would make configuration easier to manage.

@bbernays
Collaborator

@TinLe we already support the not_resources attribute for specifying which resources not to fetch. I am working on the documentation for it tomorrow, so once that is done I will send you a link to it.

@TinLe
Author

TinLe commented Jun 13, 2022

@bbernays Did you mean "skip_resources"? I found that in the code. I did not realize that option existed. Thank you for pointing me to it.

It would also be nice to have something similar for projects and folders. We have too many projects, and it is painful to list all the ones we want to scan instead of just listing the smaller number that we do not want.

@bbernays
Collaborator

Seeing as we already support what you are looking for at the core level (skipping of resources), I am going to move this feature request to the GCP provider so that we can look into supporting skipping projects/folders.

@bbernays transferred this issue from cloudquery/cloudquery Jun 13, 2022
@erezrokah changed the title from "Allow blacklists in config.hcl" to "Add deny lists in config.hcl" Jun 14, 2022
@erezrokah changed the title from "Add deny lists in config.hcl" to "Add denylists in config.hcl" Jun 14, 2022
@erezrokah
Member

Thanks for opening the issue @TinLe. I renamed blacklist to denylist as it can be considered a more inclusive term. See more in https://www.ncsc.gov.uk/blog-post/terminology-its-not-black-and-white

@bbernays
Collaborator

@TinLe - Here is the updated documentation for the skip_resources

@erezrokah added the gcp label Aug 16, 2022
@erezrokah transferred this issue from cloudquery/cq-provider-gcp Aug 16, 2022
@cqgaurav moved this to Ready in Data Framework Nov 11, 2022
@erezrokah changed the title from "Add denylists in config.hcl" to "feat(gcp): Ability to skip project ids in configuration" Nov 29, 2022
@erezrokah
Member

Hi @TinLe thank you for opening the issue and @bbernays for responding.

We've since moved from HCL to YAML and released CloudQuery v1.
With the YAML configuration format, I think it's easier to handle this request by doing the following:

# Get projects list
export CQ_PROJECTS=$(gcloud projects list --format="json" | jq '[.[].projectId]' | jq -c '. - ["project-1-to-skip","project-2-to-skip"]')

kind: source
spec:
  name: gcp
  path: cloudquery/gcp
  ...
  spec:
    project_ids: ${CQ_PROJECTS}

A similar approach can be used to skip folder_ids, WDYT?
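
For anyone who would rather not depend on jq, here is a minimal Python sketch of the same list subtraction. It assumes its input is the JSON printed by `gcloud projects list --format="json"` (an array of objects with a `projectId` field, as the jq expression above implies). The function name `projects_without_denied` and the sample project IDs are hypothetical and only for illustration.

```python
import json
from typing import Iterable


def projects_without_denied(gcloud_json: str, denylist: Iterable[str]) -> str:
    """Return a compact JSON array of project IDs, minus the denied ones.

    gcloud_json is assumed to be the output of
    `gcloud projects list --format="json"`.
    """
    denied = set(denylist)
    projects = json.loads(gcloud_json)
    # Keep the original order and drop every project ID found in the denylist.
    kept = [p["projectId"] for p in projects if p["projectId"] not in denied]
    # Compact separators mirror the single-line output of `jq -c`.
    return json.dumps(kept, separators=(",", ":"))


if __name__ == "__main__":
    # Hypothetical sample input standing in for real gcloud output.
    sample = json.dumps([
        {"projectId": "project-a"},
        {"projectId": "project-1-to-skip"},
        {"projectId": "project-b"},
    ])
    print(projects_without_denied(sample, ["project-1-to-skip", "project-2-to-skip"]))
```

The printed value can then be exported as CQ_PROJECTS in the same way as the jq result. For folders, the same subtraction should work on whatever identifier field the folder listing returns, which I have not checked here.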

@erezrokah erezrokah assigned erezrokah and unassigned bbernays Nov 29, 2022
@bbernays
Collaborator

In V1 we expose the project_filter property, which lets users specify their own filter queries; the filtering is then handled on the GCP server side. For example, this allows users to exclude any project that begins with TEST*

https://www.cloudquery.io/docs/plugins/sources/gcp/configuration#gcp-spec

https://cloud.google.com/resource-manager/reference/rest/v1beta1/projects/list
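
To make the intended effect concrete, here is a small Python sketch of what "exclude any project that begins with TEST*" means when applied to a list of project IDs. It is only a client-side illustration of the semantics: it does not call GCP, it is not the project_filter syntax, and the real server-side matching rules (for example case sensitivity) may differ. The helper name `exclude_matching` and the sample IDs are hypothetical.

```python
from fnmatch import fnmatchcase
from typing import Iterable, List


def exclude_matching(project_ids: Iterable[str], deny_patterns: Iterable[str]) -> List[str]:
    """Drop every project ID that matches at least one shell-style deny pattern.

    Matching is case-sensitive here (fnmatchcase); that is a choice made for
    this sketch, not a statement about how GCP evaluates its filters.
    """
    patterns = list(deny_patterns)
    return [
        pid
        for pid in project_ids
        if not any(fnmatchcase(pid, pattern) for pattern in patterns)
    ]


if __name__ == "__main__":
    # Hypothetical project IDs.
    ids = ["TEST-sandbox", "prod-billing", "TEST-ci", "staging-api"]
    print(exclude_matching(ids, ["TEST*"]))
```

With project_filter the equivalent exclusion would be evaluated by GCP before any project IDs reach CloudQuery, so no local post-processing like this would be needed.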

@erezrokah
Member

> In V1 we expose the project_filter property that allows users to specify their own queries/filter functionalities and then the filtering is handled on the server side of GCP...For example this allows users to exclude any project that begins with TEST*

Thanks @bbernays I noticed that but couldn't find an example to negate the pattern. Do you mind posting one here?

@erezrokah
Member

Kudos to @bbernays, this is the syntax for the project_filter: https://cloud.google.com/sdk/gcloud/reference/topic/filters, which supports negated patterns.

Pending our comments I'll close the issue. @TinLe if the solutions provided don't solve your use case, please comment and I'll re-open.

Repository owner moved this from Ready to Done in Data Framework Nov 29, 2022