File based Python logging config #1301

Conversation

chandanchowdhury
Collaborator

Resolves #1213 and adds support for further logging customization.

@chandanchowdhury chandanchowdhury marked this pull request as ready for review May 1, 2024 19:31
@chandanchowdhury
Collaborator Author

Requesting review @achantavy

@@ -575,6 +583,9 @@ def main(self, argv: str) -> int:
# TODO support parameter lookup in environment variables if not present on command line
config: argparse.Namespace = self.parser.parse_args(argv)
# Logging config
if config.logging_config:
Contributor

Can you add a comment linking to the Python doc for this API? This way readers can know the correct format for a logging config file.
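
A minimal sketch of what such a documented call could look like, assuming the `--logging-config` value is a path handed to the standard library's `logging.config.fileConfig` (the hunk above does not show the call itself, so this is an assumption). The helper name `configure_logging`, the `demo` function, and the sample config contents are hypothetical:

```python
import logging
import logging.config
import os
import tempfile
from typing import Optional

# Hypothetical sample config in the INI format accepted by logging.config.fileConfig.
# Format reference: https://docs.python.org/3/library/logging.config.html#logging-config-fileformat
SAMPLE_CONFIG = """\
[loggers]
keys=root

[handlers]
keys=console

[formatters]
keys=simple

[logger_root]
level=INFO
handlers=console

[handler_console]
class=StreamHandler
level=INFO
formatter=simple
args=(sys.stderr,)

[formatter_simple]
format=%(asctime)s %(levelname)s %(name)s: %(message)s
"""


def configure_logging(logging_config: Optional[str]) -> None:
    """Apply a file-based logging config if a path was given, else use basicConfig."""
    if logging_config:
        # disable_existing_loggers=False keeps loggers created at import time enabled.
        logging.config.fileConfig(logging_config, disable_existing_loggers=False)
    else:
        logging.basicConfig(level=logging.INFO)


def demo() -> int:
    """Write the sample config to a temp file, load it, and return the root logger level."""
    with tempfile.NamedTemporaryFile("w", suffix=".ini", delete=False) as f:
        f.write(SAMPLE_CONFIG)
        path = f.name
    try:
        configure_logging(path)
        return logging.getLogger().level
    finally:
        os.remove(path)
```

The comment above `SAMPLE_CONFIG` is the kind of pointer the review asks for: it tells readers where the file format is specified.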

@@ -0,0 +1,65 @@
[logger_root]
Contributor

Add a comment indicating that this is a sample logging config file to be used with the --logging-config argument

chandanchowdhury and others added 26 commits November 26, 2024 23:39
Signed-off-by: chandanchowdhury <[email protected]>
## Summary

We've found constant errors in the GitHub module when the response data
does not have the expected schema. These updates will not always
prevent the crash, but they will provide more insight into what happened.

Signed-off-by: chandanchowdhury <[email protected]>
Signed-off-by: chandanchowdhury <[email protected]>
AWS SSO role names are weird because they look like
`AWSReservedSSO_myrolename_<somehash>`. This caused our awssaml module
to not draw links from Okta groups to these SSO roles correctly. This PR
updates the module with the correct string comparisons to do this.

Screenshot showing that this works:
![Screenshot 2024-06-03 at 1 49
03 PM](https://github.com/lyft/cartography/assets/46503781/82ef7971-36f3-4f07-ac9c-7d0f856489e2)
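
As an illustration of the kind of string comparison involved, here is a minimal sketch that assumes the role name always has the form `AWSReservedSSO_<name>_<suffix>` described above. The helper names `sso_role_base_name` and `role_matches_group` are hypothetical and are not taken from the awssaml module:

```python
from typing import Optional

SSO_ROLE_PREFIX = "AWSReservedSSO_"


def sso_role_base_name(role_name: str) -> Optional[str]:
    """Return 'myrolename' for 'AWSReservedSSO_myrolename_<somehash>', else None."""
    if not role_name.startswith(SSO_ROLE_PREFIX):
        return None
    remainder = role_name[len(SSO_ROLE_PREFIX):]
    # The base name may itself contain underscores, so split off only the last segment.
    base, sep, suffix = remainder.rpartition("_")
    if not sep or not base or not suffix:
        return None
    return base


def role_matches_group(role_name: str, group_role_name: str) -> bool:
    """Compare an IAM role name with the role name encoded in a group, tolerating the SSO form."""
    if role_name == group_role_name:
        return True
    return sso_role_base_name(role_name) == group_role_name
```

Splitting from the right keeps base names that contain underscores intact, which a plain `split("_")` would not.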

Signed-off-by: chandanchowdhury <[email protected]>
Signed-off-by: chandanchowdhury <[email protected]>
…artography-cncf#1312)

Fixes cartography-cncf#1302: refactors EC2 launch template sync to use data model. This
way, writes to the graph are automatically batched and write failures
are retried.

Signed-off-by: chandanchowdhury <[email protected]>
Signed-off-by: chandanchowdhury <[email protected]>
cartography-cncf#1309)

While running the AWS sync in Cartography, the following error occurs,
causing the sync process to fail:

```
ERROR:cartography.sync:Unhandled exception during sync stage 'aws'
Traceback (most recent call last):
  File "/home/REDACTED/cartography/cartography/sync.py", line 113, in run
    stage_func(neo4j_session, config)
  File "/home/REDACTED/cartography/cartography/util.py", line 197, in timed
    return method(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/REDACTED/cartography/cartography/intel/aws/__init__.py", line 298, in start_aws_ingestion
    sync_successful = _sync_multiple_accounts(
                      ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/REDACTED/cartography/cartography/intel/aws/__init__.py", line 169, in _sync_multiple_accounts
    _sync_one_account(
  File "/home/REDACTED/cartography/cartography/intel/aws/__init__.py", line 64, in _sync_one_account
    RESOURCE_FUNCTIONS[func_name](**sync_args)
  File "/home/REDACTED/cartography/cartography/util.py", line 197, in timed
    return method(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/REDACTED/cartography/cartography/intel/aws/iam.py", line 819, in sync
    sync_user_access_keys(neo4j_session, boto3_session, current_aws_account_id, update_tag, common_job_parameters)
  File "/home/REDACTED/cartography/cartography/util.py", line 197, in timed
    return method(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/REDACTED/cartography/cartography/intel/aws/iam.py", line 795, in sync_user_access_keys
    access_keys = get_account_access_key_data(boto3_session, user["name"])
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/REDACTED/cartography/cartography/util.py", line 197, in timed
    return method(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/REDACTED/cartography/cartography/intel/aws/iam.py", line 230, in get_account_access_key_data
    for access_key in access_keys['AccessKeyMetadata']:
                      ~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^
KeyError: 'AccessKeyMetadata'
ERROR:__main__:Error in AWS account sync REDACTED: 'AccessKeyMetadata'
```
The fix involves returning the access_keys object directly from the
get_account_access_key_data function. This ensures that the function
returns the correct data structure, even if the AccessKeyMetadata key is
missing.

The function was tested with various AWS accounts to ensure it correctly
handles cases where the AccessKeyMetadata key is present and where it is
missing. The sync process completed successfully without any errors.
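
For readers who want to see the failure mode in isolation, here is a minimal sketch of tolerating a response without the `AccessKeyMetadata` key. It is not the change made in this commit (which, per the description above, returns the `access_keys` object directly); the helper name `extract_access_key_metadata` is hypothetical and the input is assumed to be a plain dict:

```python
from typing import Any, Dict, List


def extract_access_key_metadata(response: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Return the list under 'AccessKeyMetadata', or an empty list if it is absent."""
    # dict.get avoids the KeyError shown in the traceback when the key is missing;
    # "or []" also covers a key that is present but set to None.
    return response.get("AccessKeyMetadata") or []
```

A caller can then iterate over the result unconditionally, since a missing key yields an empty loop instead of an exception.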

Co-authored-by: Alex Chantavy <[email protected]>
Signed-off-by: chandanchowdhury <[email protected]>
cartography-cncf#1310)

This merge request fixes an issue where the `describe_images` function
in `cartography/intel/aws/ec2/images.py` encountered a NoneType error.
The error occurred because None values were being included in the
image_ids list, which is not allowed by the AWS SDK. In one particular
account of several hundred tested, an image ID was returned that was
None. I was able to verify that this allows all of my accounts to run
ingestion successfully.
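
A minimal sketch of the filtering idea, assuming the image IDs arrive as a list that may contain `None`. The helper name `drop_none_image_ids` is hypothetical and is not taken from `images.py`:

```python
from typing import Iterable, List, Optional


def drop_none_image_ids(image_ids: Iterable[Optional[str]]) -> List[str]:
    """Return the image IDs with None entries removed, preserving order."""
    return [image_id for image_id in image_ids if image_id is not None]
```

The cleaned list can then be passed on to the AWS SDK call, which expects every entry to be a string.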

Signed-off-by: chandanchowdhury <[email protected]>
…artography-cncf#1315)

Adds missing cleanup job.

(Sorry for the mistake)

Signed-off-by: chandanchowdhury <[email protected]>
See neo4j/neo4j#13404 for context.

The neo4j 4.x container in our CI occasionally fails integration tests
because we create indexes quickly without giving the database enough
time for those actions to settle (roughly speaking).

This PR uses pytest-rerunfailures to retry flaky tests so that we don't
need to manually re-run them.

Credit to @heryxpc for figuring this out.

Signed-off-by: chandanchowdhury <[email protected]>
…ubnetid/ (cartography-cncf#1320)

Fixes cartography-cncf#1316.

Fixes a typo where EC2 subnets as known by EC2 instances would have
their id in `subnet_id` instead of `subnetid`. This would cause a
missing relationship between the subnet and VPC because VPCs attach to
subnets using `subnetid`; see
https://github.com/lyft/cartography/blob/098d8ca5f4bb172944338dad9df797a36e23485a/cartography/intel/aws/ec2/subnets.py#L50-L51.

This PR is the same as cartography-cncf#1318 but with tests; getting this fixed asap.

See https://lyftoss.slack.com/archives/CTZUQL0KX/p1718644518442939 for
more context.

Signed-off-by: chandanchowdhury <[email protected]>
Signed-off-by: chandanchowdhury <[email protected]>
Minor change to remove hardcoded settings that likely don't fit many
environments

Co-authored-by: i_virus <[email protected]>
Signed-off-by: chandanchowdhury <[email protected]>
Signed-off-by: chandanchowdhury <[email protected]>
achantavy and others added 25 commits November 26, 2024 23:39
### Summary
> Describe your changes.

The crxcavator project does not seem to be actively maintained anymore.
We are removing it from cartography.

- https://www.reddit.com/r/cybersecurity/comments/1fklwuz/did_crxcavator_stop_working_for_anyone_else/
- https://www.reddit.com/r/cybersecurity/comments/1fp45s7/cisco_duo_spinning_down_crxcavatorio/

Signed-off-by: chandanchowdhury <[email protected]>
### Summary
PR cartography-cncf#248 moved the location of `drift-detect.md`. This PR adjusts source
to point to: https://lyft.github.io/cartography/usage/drift-detect.html

### Related issues or links

### Checklist

Provide proof that this works (this makes reviews move faster). Please
perform one or more of the following:
- [ ] Update/add unit or integration tests.
- [ ] Include a screenshot showing what the graph looked like before and
after your changes.
- [ ] Include console log trace showing what happened before and after
your changes.

If you are changing a node or relationship:
- [ ] Update the
[schema](https://github.com/lyft/cartography/tree/master/docs/root/modules)
and
[readme](https://github.com/lyft/cartography/blob/master/docs/schema/README.md).

If you are implementing a new intel module:
- [ ] Use the NodeSchema [data
model](https://lyft.github.io/cartography/dev/writing-intel-modules.html#defining-a-node).

---------

Signed-off-by: Emmanuel Ferdman <[email protected]>
Signed-off-by: chandanchowdhury <[email protected]>
… preferred way to try cartography (cartography-cncf#1364)

### Summary
> Describe your changes.

Addresses cartography-cncf#1363 and cartography-cncf#1347.
- Updates all relevant documentation to prefer running docker-compose
for end-users.
- Provides documentation on how to use docker and docker-compose for
developers.

### Checklist

Provide proof that this works (this makes reviews move faster). Please
perform one or more of the following:
- [ ] Update/add unit or integration tests.
- [x] Include a screenshot showing what the graph looked like before and
after your changes.
- [ ] Include console log trace showing what happened before and after
your changes.

#### Proof that this works
- Running an AWS sync using docker-compose
<img width="988" alt="docker-compose-run"
src="https://github.com/user-attachments/assets/33432890-a8be-4630-9340-60e8875d55d6">

  CLI used:
  ```
  docker-compose run -e AWS_PROFILE=1234_testprofile -e AWS_DEFAULT_REGION=us-east-1 cartography --neo4j-uri bolt://cartography-neo4j-1:7687
  ```

Signed-off-by: chandanchowdhury <[email protected]>
…raphy-cncf#1367)

docker-compose maps ./.compose/* from the host source tree into the neo4j
container so that neo4j data persists across container re-runs.

dev.Dockerfile copies in everything from the host source tree, including
the contents of .compose, which can get very big.

This PR makes sure that we don't include .compose in the dev.Dockerfile
so that the size stays manageable.

Signed-off-by: chandanchowdhury <[email protected]>
### Summary
> Describe your changes.

Now that cartography has been donated to the CNCF, time to update the
docs

Signed-off-by: Alex Chantavy <[email protected]>
Signed-off-by: chandanchowdhury <[email protected]>
First release after CNCF donation; let's see if the CI tooling works.

Signed-off-by: chandanchowdhury <[email protected]>
### Summary
> Describe your changes.

Per cncf/sandbox#135, this adds governance
guidelines.

Signed-off-by: Alex Chantavy <[email protected]>
Signed-off-by: chandanchowdhury <[email protected]>
### Summary

Motivation: The library dependency graph is already populated with
`PythonLibrary` nodes via Cartography's Github module, but no other
languages are supported. This PR adds `GoLibrary` nodes to the library
dependency graph to bring support for Go up to parity with Python. I
concur with the recommendation from @heryxpc that rather than writing
code to manually parse go.mod files, we should instead use the
dependency data returned by the Semgrep API.

Cartography's Semgrep module is already able to import supply chain
vulnerability data from the
[Findings](https://semgrep.dev/api/v1/docs/#tag/Finding/operation/semgrep_app.core_exp.findings.handlers.issue.openapi_list_recent_issues)
endpoint of the Semgrep API. Semgrep also provides a [List
Dependencies](https://semgrep.dev/api/v1/docs/#tag/SupplyChainService/operation/semgrep_app.products.sca.handlers.dependency.list_dependencies_conexxion)
endpoint that returns a list of every known dependency for a given
ecosystem (e.g. specifying the “gomod” ecosystem returns all
dependencies found in go.mod files). The response contains useful
information including the transitivity of the dependency and a link to
where it’s defined in source code.

The dependency nodes imported from the Semgrep API will be labelled
`GoLibrary::SemgrepDependency::Dependency` and will match the properties
of existing `PythonLibrary::Dependency` nodes as closely as possible.
This PR only imports Go dependencies from Semgrep, but I've structured
the code to make it easy to import additional languages from Semgrep in
the future.

Before these changes, a project with both Python and Go dependencies
will only have PythonLibrary nodes in the dependency graph:
<img width="1019" alt="image"
src="https://github.com/user-attachments/assets/9e291012-103e-4dae-a2bb-2da5205421b7">

After these changes, for the same project the graph contains both
PythonLibrary and GoLibrary nodes:
<img width="1015" alt="image"
src="https://github.com/user-attachments/assets/f945e489-6a3e-4edf-85d4-424bacd763b2">

<details>
<summary>Logs from semgrep module before these changes</summary>

```
INFO:cartography.sync:Starting sync with update tag '1730497895'
INFO:cartography.sync:Starting sync stage 'create-indexes'
INFO:cartography.intel.create_indexes:Creating indexes for cartography node types.
INFO:cartography.sync:Finishing sync stage 'create-indexes'
INFO:cartography.sync:Starting sync stage 'semgrep'
INFO:cartography.intel.semgrep.findings:Running Semgrep SCA findings sync job.
INFO:cartography.intel.semgrep.findings:Loading Semgrep deployment info {'id': ...} into the graph...
INFO:cartography.intel.semgrep.findings:Retrieving Semgrep SCA vulns for deployment 'X'.
INFO:cartography.intel.semgrep.findings:Processed page 0 of Semgrep SCA vulnerabilities.
...
INFO:cartography.intel.semgrep.findings:Processed page X of Semgrep SCA vulnerabilities.
INFO:cartography.intel.semgrep.findings:Retrieved X Semgrep SCA vulns in X pages.
INFO:cartography.intel.semgrep.findings:Loading X Semgrep SCA vulns info into the graph.
INFO:cartography.intel.semgrep.findings:Loading X Semgrep SCA usages info into the graph.
INFO:cartography.graph.statement:Completed semgrep_sca_risk_analysis statement #1
...
INFO:cartography.graph.statement:Completed semgrep_sca_risk_analysis statement #X
INFO:cartography.graph.job:Finished job semgrep_sca_risk_analysis
INFO:cartography.intel.semgrep.findings:Running Semgrep SCA findings cleanup job.
INFO:cartography.graph.statement:Completed SemgrepSCAFinding statement #1
...
INFO:cartography.graph.statement:Completed SemgrepSCAFinding statement #X
INFO:cartography.graph.job:Finished job SemgrepSCAFinding
INFO:cartography.intel.semgrep.findings:Running Semgrep SCA Locations cleanup job.
INFO:cartography.graph.statement:Completed SemgrepSCALocation statement #1
...
INFO:cartography.graph.statement:Completed SemgrepSCALocation statement #X
INFO:cartography.graph.job:Finished job SemgrepSCALocation
INFO:cartography.sync:Finishing sync stage 'semgrep'
INFO:cartography.sync:Finishing sync with update tag '1730497895'
```
</details>

<details>
<summary>Logs from semgrep module after these changes</summary>

```
INFO:cartography.sync:Starting sync with update tag '1730505324'
INFO:cartography.sync:Starting sync stage 'create-indexes'
INFO:cartography.intel.create_indexes:Creating indexes for cartography node types.
INFO:cartography.sync:Finishing sync stage 'create-indexes'
INFO:cartography.sync:Starting sync stage 'semgrep'
INFO:cartography.intel.semgrep.deployment:Loading Semgrep deployment info {'id': ...} into the graph...
INFO:cartography.intel.semgrep.dependencies:Running Semgrep dependencies sync job.
INFO:cartography.intel.semgrep.dependencies:Retrieving Semgrep dependencies for deployment 'X'.
INFO:cartography.intel.semgrep.dependencies:Processed page 0 of Semgrep dependencies.
...
INFO:cartography.intel.semgrep.dependencies:Processed page X of Semgrep dependencies.
INFO:cartography.intel.semgrep.dependencies:Retrieved X Semgrep dependencies in X pages.
INFO:cartography.intel.semgrep.dependencies:Loading X GoLibrary objects into the graph.
INFO:cartography.intel.semgrep.dependencies:Running Semgrep Go Library cleanup job.
INFO:cartography.graph.statement:Completed GoLibrary statement #1
...
INFO:cartography.graph.statement:Completed GoLibrary statement #X
INFO:cartography.graph.job:Finished job GoLibrary
INFO:cartography.intel.semgrep.findings:Running Semgrep SCA findings sync job.
INFO:cartography.intel.semgrep.findings:Retrieving Semgrep SCA vulns for deployment 'lyft'.
INFO:cartography.intel.semgrep.findings:Processed page 0 of Semgrep SCA vulnerabilities.
...
INFO:cartography.intel.semgrep.findings:Processed page X of Semgrep SCA vulnerabilities.
INFO:cartography.intel.semgrep.findings:Retrieved X Semgrep SCA vulns in X pages.
INFO:cartography.intel.semgrep.findings:Loading X Semgrep SCA vulns info into the graph.
INFO:cartography.intel.semgrep.findings:Loading X Semgrep SCA usages info into the graph.
INFO:cartography.graph.statement:Completed semgrep_sca_risk_analysis statement #1
...
INFO:cartography.graph.statement:Completed semgrep_sca_risk_analysis statement #X
INFO:cartography.graph.job:Finished job semgrep_sca_risk_analysis
INFO:cartography.intel.semgrep.findings:Running Semgrep SCA findings cleanup job.
INFO:cartography.graph.statement:Completed SemgrepSCAFinding statement #1
...
INFO:cartography.graph.statement:Completed SemgrepSCAFinding statement #X
INFO:cartography.graph.job:Finished job SemgrepSCAFinding
INFO:cartography.intel.semgrep.findings:Running Semgrep SCA Locations cleanup job.
INFO:cartography.graph.statement:Completed SemgrepSCALocation statement #1
...
INFO:cartography.graph.statement:Completed SemgrepSCALocation statement #X
INFO:cartography.graph.job:Finished job SemgrepSCALocation
INFO:cartography.sync:Finishing sync stage 'semgrep'
INFO:cartography.sync:Finishing sync with update tag '1730497895'
```
</details>

### Checklist

Provide proof that this works (this makes reviews move faster). Please
perform one or more of the following:
- [x] Update/add unit or integration tests.
- [x] Include a screenshot showing what the graph looked like before and
after your changes.
- [x] Include console log trace showing what happened before and after
your changes.

If you are changing a node or relationship:
- [x] Update the
[schema](https://github.com/lyft/cartography/tree/master/docs/root/modules)
and
[readme](https://github.com/lyft/cartography/blob/master/docs/schema/README.md).

If you are implementing a new intel module:
- [x] Use the NodeSchema [data
model](https://cartography-cncf.github.io/cartography/dev/writing-intel-modules.html#defining-a-node).

### TODO
- [ ] Clean up TODO comments in code
- [ ] Add/update files like
cartography/data/jobs/scoped_analysis/semgrep_sca_risk_analysis.json?

---------

Signed-off-by: Hans Wernetti <[email protected]>
Signed-off-by: Alex Chantavy <[email protected]>
Co-authored-by: Alex Chantavy <[email protected]>
Signed-off-by: chandanchowdhury <[email protected]>
### Summary
> Describe your changes.

Fixes the docker-compose steps to work on WSL2 (Windows Subsystem for
Linux 2) and OSX. Updates documentation to use the new cncf tag when
building the container.

docker-compose is very helpful for dev setups and trying out
cartography.

### Checklist

### Testing performed
I started with a fresh clone of this branch and, on _both_ OSX and
WSL2, ran

1. `docker build -t cartography-cncf/cartography-dev -f dev.Dockerfile ./`
1. `docker-compose run cartography-dev make test`

I can confirm that both paths worked and ran the full test suite using
docker-compose.

#### WSL2
Linter:

![image](https://github.com/user-attachments/assets/b165feb6-0d0a-4a9c-90dd-fe47e8390b0d)

Unit tests:

![image](https://github.com/user-attachments/assets/4cfa8a41-6775-4ac1-ad9c-6052163c0a17)

Integration tests:

![image](https://github.com/user-attachments/assets/f134dec5-8df2-4c9b-a7f6-57cccac51f89)

#### OSX
Linter:
<img width="1000" alt="Screenshot 2024-11-03 at 12 29 49 AM"
src="https://github.com/user-attachments/assets/e7840c2a-8065-4163-8057-3de902532291">

Unit tests:
<img width="997" alt="Screenshot 2024-11-03 at 12 30 07 AM"
src="https://github.com/user-attachments/assets/11d91723-776c-4438-b2a1-fe783e75e414">

Integration tests:
<img width="1003" alt="Screenshot 2024-11-03 at 12 30 22 AM"
src="https://github.com/user-attachments/assets/c2360ad5-a2c2-4309-91d0-1080dc9d0bb3">

---------

Signed-off-by: Alex Chantavy <[email protected]>
Signed-off-by: chandanchowdhury <[email protected]>
Bump release to 0.95.0.

Signed-off-by: chandanchowdhury <[email protected]>
### Summary

Small followup to
cartography-cncf#1368, noticed that
the schema doc is not being rendered correctly:

![image](https://github.com/user-attachments/assets/e3f34e94-a7c0-4b39-b243-460fe39acb57)

I'm not sure how to test that it will render correctly after this
change, but this format matches the existing format used by the github
schema:

https://github.com/cartography-cncf/cartography/blob/f11d7b2ff87331ed736a5de2547cde812face1a0/docs/root/modules/github/schema.md?plain=1#L210-L218

---------

Signed-off-by: Hans Wernetti <[email protected]>
Signed-off-by: chandanchowdhury <[email protected]>
Updates the readme to the new CNCF Slack workspace.

Signed-off-by: Alex Chantavy <[email protected]>
Signed-off-by: chandanchowdhury <[email protected]>
## Summary
> Describe your changes.

### Before:

1. Pull down a fresh cartography from github
![Screenshot 2024-11-14 at 2 16
51 PM](https://github.com/user-attachments/assets/ad7a5706-c670-45a2-894d-44329eed28a4)

2. Build the container with `docker build -t
cartography-cncf/cartography-dev -f dev.Dockerfile ./` (I forgot to
screenshot this step)

3. Running the docker-compose linter works on the first try because the
initialization step takes a while:
![Screenshot 2024-11-14 at 2 17
23 PM](https://github.com/user-attachments/assets/148c5efa-799a-425d-b52c-0fca33f28d5a)

4. [Failure] But subsequent linter runs fail with "is this a git
repository?"
![Screenshot 2024-11-14 at 2 17
33 PM](https://github.com/user-attachments/assets/2447c0aa-26ac-447a-85bb-ade408db9d21)

### After:

1. Pull down a fresh cartography from github and check out this fix branch
(fixdcwait)

![image](https://github.com/user-attachments/assets/1fbafca6-6b6e-4e9f-a43c-7df2efbfbc03)

2. Build container

![image](https://github.com/user-attachments/assets/60b93567-85bc-4b24-97c9-879adab49ee2)

3. Run linter on the first try

![image](https://github.com/user-attachments/assets/7d8d9a52-9f3c-4501-82a0-60abfb3f29d7)

4. [SUCCESS] Subsequent linter runs work because we've added a waiting
mechanism
![Screenshot 2024-11-14 at 2 25
59 PM](https://github.com/user-attachments/assets/a8dfe235-95e4-4a40-83ec-c9c758928e8b)

### Related issues or links
> Include links to relevant issues or other pages.

- https://github.com/lyft/cartography/issues/...

### Checklist

Provide proof that this works (this makes reviews move faster). Please
perform one or more of the following:
- [ ] Update/add unit or integration tests.
- [x] Include a screenshot showing what the graph looked like before and
after your changes.
- [x] Include console log trace showing what happened before and after
your changes.

---------

Signed-off-by: Alex Chantavy <[email protected]>
Signed-off-by: chandanchowdhury <[email protected]>
…phy-cncf#1383)

### Summary
> Describe your changes.

We often need to connect 2 nodes based on a fuzzy match like in
cartography-cncf#1380 (comment).
This PR adds a `fuzzy_and_ignore_case` option to the `PropertyRef` so
that the final rendered query performs a `CONTAINS` in a
case-insensitive way instead of an exact match.
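
To make the difference concrete, here is a hedged sketch of how a matcher clause might be rendered with and without the option. The function `render_match_clause`, its parameters, and the exact Cypher text are assumptions for illustration and are not the querybuilder's actual output; which side of `CONTAINS` holds the node property is likewise assumed:

```python
def render_match_clause(
    node_var: str,
    node_prop: str,
    param_name: str,
    fuzzy_and_ignore_case: bool = False,
) -> str:
    """Render a Cypher comparison between a node property and a query parameter."""
    lhs = f"{node_var}.{node_prop}"
    rhs = f"${param_name}"
    if fuzzy_and_ignore_case:
        # Lower-case both sides so that CONTAINS ignores case.
        return f"toLower({lhs}) CONTAINS toLower({rhs})"
    return f"{lhs} = {rhs}"


def fuzzy_match(node_value: str, search_value: str) -> bool:
    """Python equivalent of the fuzzy clause: case-insensitive substring test."""
    return search_value.lower() in node_value.lower()
```

With the option off the clause is an exact equality; with it on, a value such as `admin` would also match a property like `AWSReservedSSO_Admin_abc`.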

### Related issues or links
> Include links to relevant issues or other pages.

cartography-cncf#1380

### Checklist

Provide proof that this works (this makes reviews move faster). Please
perform one or more of the following:
- [x] Update/add unit or integration tests.
- [ ] Include a screenshot showing what the graph looked like before and
after your changes.
- [ ] Include console log trace showing what happened before and after
your changes.

---------

Signed-off-by: Alex Chantavy <[email protected]>
Signed-off-by: chandanchowdhury <[email protected]>
### Summary
> Describe your changes.

Adds support for AWS EC2 Network ACLs and their rules. Shows attachments
to VPCs and subnets.

Screenshot:
![Screenshot 2024-11-15 at 6 47
44 PM](https://github.com/user-attachments/assets/2e7f16ac-ec1c-4a8a-895b-29a4198f4f87)

### Checklist

Provide proof that this works (this makes reviews move faster). Please
perform one or more of the following:
- [x] Update/add unit or integration tests.
- [x] Include a screenshot showing what the graph looked like before and
after your changes.
- [ ] Include console log trace showing what happened before and after
your changes.

If you are changing a node or relationship:
- [x] Update the
[schema](https://github.com/lyft/cartography/tree/master/docs/root/modules)
and
[readme](https://github.com/lyft/cartography/blob/master/docs/schema/README.md).

If you are implementing a new intel module:
- [x] Use the NodeSchema [data
model](https://cartography-cncf.github.io/cartography/dev/writing-intel-modules.html#defining-a-node).

---------

Signed-off-by: Alex Chantavy <[email protected]>
Signed-off-by: chandanchowdhury <[email protected]>
Pre-release bump

Signed-off-by: chandanchowdhury <[email protected]>
…cf#1380)

**Summary**
Mapped in [AWS Identity
Center](https://aws.amazon.com/iam/identity-center/) and the access it
provides to AWS accounts.

New Nodes: `(AWSIdentityCenter)`, `(AWSPermissionSet)`, `(AWSSSOUser)`

New Relationships:
- `(AWSAccount)-[RESOURCE]->(AWSIdentityCenter)`
- `(AWSIdentityCenter)-[HAS_PERMISSION_SET]->(AWSPermissionSet)`
- `(AWSSSOUser)<-[ALLOWED_BY]-(AWSRole)`
- `(OktaUser)<-[CAN_ASSUME_IDENTITY]-(AWSSSOUser)`
- `(AWSPermissionSet)-[ASSIGNED_TO_ROLE]->(AWSRole)`

![image](https://github.com/user-attachments/assets/e0e6c746-8ef6-4c89-b08a-d5192277fbda)

![image](https://github.com/user-attachments/assets/6ec645b8-6157-4001-b6f6-f44dbc3df2cc)

**Console Trace**
INFO:cartography.intel.aws.identitycenter:Syncing Identity Center instances for region us-east-1
INFO:cartography.intel.aws.identitycenter:Loading 1 Identity Center instances for region us-east-1
INFO:cartography.intel.aws.identitycenter:Loading 32 permission sets for instance arn:aws:sso:::instance/ssoins-72237a0dcb8c6df7 in region us-east-1
INFO:cartography.intel.aws.identitycenter:Loading 777 permission set role assignments
INFO:cartography.intel.aws.identitycenter:Loading 803 SSO users for identity store d-906747a0b9 in region us-east-1
INFO:cartography.intel.aws.identitycenter:Getting role assignments for 803 users
INFO:cartography.intel.aws.identitycenter:Loading 24292 role assignments
INFO:cartography.intel.aws.identitycenter:Syncing Identity Center instances for region us-east-2
INFO:cartography.intel.aws.identitycenter:Loading 0 Identity Center instances for region us-east-2
INFO:cartography.intel.aws.identitycenter:Syncing Identity Center instances for region us-west-1
INFO:cartography.intel.aws.identitycenter:Loading 0 Identity Center instances for region us-west-1
INFO:cartography.intel.aws.identitycenter:Syncing Identity Center instances for region us-west-2
INFO:cartography.intel.aws.identitycenter:Loading 0 Identity Center instances for region us-west-2
INFO:cartography.graph.statement:Completed aws_import_identity_center_cleanup statement #1
INFO:cartography.graph.statement:Completed aws_import_identity_center_cleanup statement #2
INFO:cartography.graph.statement:Completed aws_import_identity_center_cleanup statement #3
INFO:cartography.graph.statement:Completed aws_import_identity_center_cleanup statement #4
INFO:cartography.graph.statement:Completed aws_import_identity_center_cleanup statement #5
INFO:cartography.graph.statement:Completed aws_import_identity_center_cleanup statement #6

**Related issues or links**

Fixes - cartography-cncf#990

**Checklist**

Provide proof that this works (this makes reviews move faster). Please
perform one or more of the following:
- [x] Update/add unit or integration tests.
- [x] Include a screenshot showing what the graph looked like before and
after your changes.
- [x] Include console log trace showing what happened before and after
your changes.

If you are changing a node or relationship:
- [x] Update the
[schema](https://github.com/lyft/cartography/tree/master/docs/root/modules)
and
[readme](https://github.com/lyft/cartography/blob/master/docs/schema/README.md).

If you are implementing a new intel module:
- [x] Use the NodeSchema [data
model](https://cartography-cncf.github.io/cartography/dev/writing-intel-modules.html#defining-a-node).

---------

Signed-off-by: chandanchowdhury <[email protected]>
### Preamble

This PR is a copy of cartography-cncf#1373, identical in content to that PR as of the
time of this new PR's creation. (That PR will now appear empty because
of some cleanup I did on my fork... and some learning I got on how forks
work. 😄). That PR does have some relevant discussion, which I suppose we
will continue here. Apologies for the noise, it'll hopefully be avoided
in future PRs.

### Summary

This PR adds to the Github graph, marking users as [enterprise
owners](https://docs.github.com/en/enterprise-cloud@latest/admin/managing-accounts-and-repositories/managing-users-in-your-enterprise/roles-in-an-enterprise#enterprise-owners).

We think this is a valuable addition to the graph in general, because
these users are not all necessarily visible in the graph at the moment
but have broad access. Less generally (but still maybe relevant to
others) our analysts at Etsy need to review these users as part of our
UAR (User Access Review) process, which we hope Cartography will
eventually help to power.

We wanted to do this in a light-touch way, without breaking existing
relationships or removing properties. We also wanted to follow how
similar properties are graphed on the user node (org ownership, for
example, is noted by the 'user.role' property; similarly, the
'user.is_site_admin' property notes whether a user is a site admin). To
that end, we did the following:

1. add an 'is_enterprise_owner' property to all user nodes
2. add a new type of user-org relationship: 'UNAFFILIATED'. The
[terminology](https://docs.github.com/en/graphql/reference/enums#roleinorganization)
is Github's, and it is used for enterprise owners who are not also
members of the graphed organization.
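
The two changes above can be sketched as a small transform, assuming the sync has one collection of org member logins and one of enterprise owner logins. The function `build_user_records` and the record keys other than `is_enterprise_owner` are hypothetical and are not taken from the module's code:

```python
from typing import Dict, Iterable, List, Union


def build_user_records(
    org_member_logins: Iterable[str],
    enterprise_owner_logins: Iterable[str],
) -> List[Dict[str, Union[str, bool]]]:
    """Build one record per user, flagging enterprise owners and unaffiliated users."""
    members = set(org_member_logins)
    owners = set(enterprise_owner_logins)
    records: List[Dict[str, Union[str, bool]]] = []
    for login in sorted(members | owners):
        records.append({
            "login": login,
            # Set on every user node, True only for enterprise owners.
            "is_enterprise_owner": login in owners,
            # An owner who is not a member of the graphed org gets the
            # UNAFFILIATED relationship instead of a membership one.
            "unaffiliated": login in owners and login not in members,
        })
    return records
```

An enterprise owner who is also an org member keeps the existing membership relationship and simply gains the `is_enterprise_owner` flag.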

Here is an illustration of before/after (I will also add some screencap
below but thought the high-level illustration might help):
![Cartography AMPS User Owns Enterprise
(1)](https://github.com/user-attachments/assets/dc943ab5-2a95-4f76-a39a-6b9f6262169b)

### Other notes on the PR

1. I refactored the integration tests, taking cues from how the testing
for Github teams was done by testing the 'sync' function as a whole
instead of just the 'load' function.
1. In general I tried to do things in keeping with the style I saw
around me. I am happy to change anything.
1. In our slack conversation, it was mentioned PRs should use the new
models. I’d already written this when I read that, but, when I looked I
saw there are no models for this. Is that okay? Should they be added
and, if so, could it be in a separate PR or must it be here?

### Related issues or links

None.

### Screencaps

_(I could get other screencaps... if anything would be helpful, please
let me know.)_

In this case it was helpful that we had an enterprise owner who was also a
user in one of our orgs, but not another. I highlighted them
specifically in a query here, showing both the new property and
relationship type.

**Before**
![User Org
Before](https://github.com/user-attachments/assets/f21bb2a9-8d3e-45ed-bc7b-112b21bd304a)

**After**
![User Org
After](https://github.com/user-attachments/assets/637eb10d-20f5-4aa6-9c3d-626b480b3014)

### Checklist

Provide proof that this works (this makes reviews move faster). Please
perform one or more of the following:
- [X] Update/add unit or integration tests.
- [X] Include a screenshot showing what the graph looked like before and
after your changes.
- [ ] Include console log trace showing what happened before and after
your changes.

If you are changing a node or relationship:
- [X] Update the
[schema](https://github.com/lyft/cartography/tree/master/docs/root/modules)
and
[readme](https://github.com/lyft/cartography/blob/master/docs/schema/README.md).
**NOTE: I updated the schema but not the README, which seemed like it
was out of date, did not already include github, and suggested using a
javascript dependency to update it... please advise, if this needs
update.** 😄

If you are implementing a new intel module:
- **N/A** [ ] Use the NodeSchema [data
model](https://cartography-cncf.github.io/cartography/dev/writing-intel-modules.html#defining-a-node).

---------

Signed-off-by: Daniel Brauer <[email protected]>
Signed-off-by: chandanchowdhury <[email protected]>
…ncf#1385)

### Summary
This PR adds support to ingest dependencies from Semgrep for the NPM
ecosystem, as well as introducing a CLI flag allowing users to specify
which ecosystems to ingest.
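
As a rough sketch of what handling such a flag's value might involve, the snippet below parses a comma-separated list of ecosystems and rejects unknown ones. The function name, the comma-separated format, and the `SUPPORTED_ECOSYSTEMS` constant are assumptions for illustration; only the `gomod` and `npm` ecosystem names come from this PR.

```python
from typing import List

# Ecosystem names taken from the PR description and logs; the constant itself is hypothetical.
SUPPORTED_ECOSYSTEMS = {"gomod", "npm"}


def parse_ecosystems(raw: str) -> List[str]:
    """Split a comma-separated string into a validated list of ecosystem names."""
    ecosystems = [item.strip().lower() for item in raw.split(",") if item.strip()]
    unsupported = [name for name in ecosystems if name not in SUPPORTED_ECOSYSTEMS]
    if unsupported:
        raise ValueError(f"Unsupported ecosystems: {', '.join(unsupported)}")
    return ecosystems
```

Validating up front means a typo in the flag fails fast instead of silently ingesting nothing for that ecosystem.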

### Related issues or links
cartography-cncf#1368 added support
for ingesting dependencies from Semgrep (only for the `gomod` ecosystem)

### Demo

Before these changes, a project with both Go and NPM dependencies will
only have GoLibrary nodes in the dependency graph:

<img width="1036" alt="image"
src="https://github.com/user-attachments/assets/31d97626-be70-4c80-9a5b-71c26056a53b">

After these changes, for the same project the graph contains both
GoLibrary and NpmLibrary nodes:
<img width="1039" alt="image"
src="https://github.com/user-attachments/assets/d09cc265-ccd6-463e-bd01-2b3e7c6d1778">

<details>
<summary>Logs from semgrep module before these changes</summary>

```
INFO:cartography.sync:Starting sync stage 'semgrep'
INFO:cartography.intel.semgrep.deployment:Loading Semgrep deployment info {'id': ...} into the graph...
INFO:cartography.intel.semgrep.dependencies:Running Semgrep dependencies sync job.
INFO:cartography.intel.semgrep.dependencies:Retrieving Semgrep dependencies for deployment 'X'.
INFO:cartography.intel.semgrep.dependencies:Processed page 0 of Semgrep dependencies.
...
INFO:cartography.intel.semgrep.dependencies:Processed page X of Semgrep dependencies.
INFO:cartography.intel.semgrep.dependencies:Retrieved X Semgrep dependencies in X pages.
INFO:cartography.intel.semgrep.dependencies:Loading X GoLibrary objects into the graph.
INFO:cartography.intel.semgrep.dependencies:Running Semgrep Go Library cleanup job.
INFO:cartography.graph.statement:Completed GoLibrary statement #1
...
INFO:cartography.graph.statement:Completed GoLibrary statement #X
INFO:cartography.graph.job:Finished job GoLibrary
INFO:cartography.intel.semgrep.findings:Running Semgrep SCA findings sync job.
...
INFO:cartography.sync:Finishing sync stage 'semgrep'
INFO:cartography.sync:Finishing sync with update tag '1730497895'
```
</details>

<details>
<summary>Logs from semgrep module after these changes</summary>

```
INFO:cartography.intel.semgrep.deployment:Loading SemgrepDeployment {'id': ...} into the graph.
INFO:cartography.intel.semgrep.dependencies:Running Semgrep dependencies sync job.
INFO:cartography.intel.semgrep.dependencies:Retrieving Semgrep gomod dependencies for deployment 'X'.
INFO:cartography.intel.semgrep.dependencies:Processed page 0 of Semgrep gomod dependencies.
INFO:cartography.intel.semgrep.dependencies:Processed page X of Semgrep gomod dependencies.
INFO:cartography.intel.semgrep.dependencies:Retrieved X Semgrep gomod dependencies in X pages.
INFO:cartography.intel.semgrep.dependencies:Loading X GoLibrary objects into the graph.
INFO:cartography.intel.semgrep.dependencies:Running Semgrep Dependencies cleanup job for GoLibrary.
INFO:cartography.graph.statement:Completed GoLibrary statement #1
INFO:cartography.graph.statement:Completed GoLibrary statement #2
INFO:cartography.graph.statement:Completed GoLibrary statement #3
INFO:cartography.graph.job:Finished job GoLibrary
INFO:cartography.intel.semgrep.dependencies:Retrieving Semgrep npm dependencies for deployment 'X'.
INFO:cartography.intel.semgrep.dependencies:Processed page 0 of Semgrep npm dependencies.
...
INFO:cartography.intel.semgrep.dependencies:Processed page X of Semgrep npm dependencies.
INFO:cartography.intel.semgrep.dependencies:Retrieved X Semgrep npm dependencies in X pages.
INFO:cartography.intel.semgrep.dependencies:Loading X NpmLibrary objects into the graph.
INFO:cartography.intel.semgrep.dependencies:Running Semgrep Dependencies cleanup job for NpmLibrary.
INFO:cartography.graph.statement:Completed NpmLibrary statement #1
INFO:cartography.graph.statement:Completed NpmLibrary statement #2
INFO:cartography.graph.statement:Completed NpmLibrary statement #3
INFO:cartography.graph.job:Finished job NpmLibrary
INFO:cartography.intel.semgrep.findings:Running Semgrep SCA findings sync job.
...
INFO:cartography.sync:Finishing sync stage 'semgrep'
INFO:cartography.sync:Finishing sync with update tag '1731969699'
```
</details>

### Checklist

Provide proof that this works (this makes reviews move faster). Please
perform one or more of the following:
- [x] Update/add unit or integration tests.
- [x] Include a screenshot showing what the graph looked like before and
after your changes.
- [x] Include console log trace showing what happened before and after
your changes.

If you are changing a node or relationship:
- [x] Update the
[schema](https://github.com/lyft/cartography/tree/master/docs/root/modules)
and
[readme](https://github.com/lyft/cartography/blob/master/docs/schema/README.md).

If you are implementing a new intel module:
- [x] Use the NodeSchema [data
model](https://cartography-cncf.github.io/cartography/dev/writing-intel-modules.html#defining-a-node).

---------

Signed-off-by: Hans Wernetti <[email protected]>
Signed-off-by: chandanchowdhury <[email protected]>
### Summary
> Precommit includes an outdated version of autopep8 that doesn't work
in Python 3.13. It fails with the error:
`ModuleNotFoundError: No module named 'lib2to3' autopep8`
This MR revs all of the pre-commit packages and fixes all of the linting
issues.

### Related issues or links
> Include links to relevant issues or other pages.

- https://github.com/lyft/cartography/issues/...

### Checklist

Provide proof that this works (this makes reviews move faster). Please
perform one or more of the following:
- [ ] Update/add unit or integration tests.
- [x] Include a screenshot showing what the graph looked like before and
after your changes.
- [x] Include console log trace showing what happened before and after
your changes.
`cartography % make test_lint
pre-commit run --all-files --show-diff-on-failure
check docstring is first.................................................Passed
check that executables have shebangs.....................................Passed
check for merge conflicts................................................Passed
check vcs permalinks.....................................................Passed
check yaml...............................................................Passed
debug statements (python)................................................Passed
fix end of files.........................................................Passed
trim trailing whitespace.................................................Passed
flake8...................................................................Passed
autopep8.................................................................Passed
pyupgrade................................................................Passed
Add trailing commas......................................................Passed
Reorder python imports...................................................Passed
mypy.....................................................................Passed`

If you are changing a node or relationship:
- [ ] Update the
[schema](https://github.com/lyft/cartography/tree/master/docs/root/modules)
and
[readme](https://github.com/lyft/cartography/blob/master/docs/schema/README.md).

If you are implementing a new intel module:
- [ ] Use the NodeSchema [data
model](https://cartography-cncf.github.io/cartography/dev/writing-intel-modules.html#defining-a-node).

Co-authored-by: = <=>
Co-authored-by: i_virus <[email protected]>
Signed-off-by: chandanchowdhury <[email protected]>
### Summary
> Describe your changes.

Adds a missing `__init__.py` file to AWS identitycenter.

Without this, running

```
import cartography.intel.aws
```
in a separate script will cause the script to fail with a
`ModuleNotFoundError`.

Crash dump:
```
 from cartography.sync import Sync
..
    import cartography.intel.aws
..
    from .resources import RESOURCE_FUNCTIONS
..
    from . import identitycenter
..
    from cartography.models.aws.identitycenter.awsidentitycenter import AWSIdentityCenterInstanceSchema
E   ModuleNotFoundError: No module named 'cartography.models.aws.identitycenter'
```
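
One way to catch similar gaps before they reach users is to scan the source tree for directories that contain Python files but no `__init__.py`. The helper below is a hypothetical sketch, not part of this change, and the path in the usage note is an assumption.

```python
from pathlib import Path
from typing import List


def find_dirs_missing_init(package_root: Path) -> List[Path]:
    """Return subdirectories that hold .py files but lack an __init__.py."""
    missing: List[Path] = []
    for directory in sorted(p for p in package_root.rglob("*") if p.is_dir()):
        if directory.name == "__pycache__":
            continue
        has_python_files = any(directory.glob("*.py"))
        if has_python_files and not (directory / "__init__.py").exists():
            missing.append(directory)
    return missing
```

For example, `find_dirs_missing_init(Path("cartography"))` run from a checkout would list candidate directories to review; an empty list means every directory with Python files has an `__init__.py`.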

Signed-off-by: chandanchowdhury <[email protected]>
cartography-cncf#1389)

### Summary
> AWS is breaking its own rules for some roles it has created in
IdentityCenter. IdentityCenter created roles with inline policies that
can have multiple statements, all with the Sid of an empty string. This
means cartography will only load the last statement as they will all
have the same AWSPolicyStatement.id. This change checks if the Sid is an
empty string and uses the index numbering if it is encountered.
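
The idea can be sketched as follows. This is a minimal illustration rather than the actual cartography code: the function name and the `<policy_id>/statement/<suffix>` id format are assumptions.

```python
from typing import Any, Dict, List


def assign_statement_ids(
    policy_id: str,
    statements: List[Dict[str, Any]],
) -> List[Dict[str, Any]]:
    """Return copies of the statements with a unique 'id' key added.

    Hypothetical helper: the id format is illustrative only.
    """
    result: List[Dict[str, Any]] = []
    for index, stmt in enumerate(statements, start=1):
        sid = stmt.get("Sid")
        # A missing Sid and an empty-string Sid are both treated as "no Sid",
        # so the statement's position in the policy is used instead.
        suffix = sid if sid else str(index)
        result.append({**stmt, "id": f"{policy_id}/statement/{suffix}"})
    return result
```

With this fallback, several statements that all carry `"Sid": ""` get distinct ids and no longer overwrite each other when loaded.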

### Related issues or links
> Include links to relevant issues or other pages.

- https://github.com/lyft/cartography/issues/...

### Checklist

Provide proof that this works (this makes reviews move faster). Please
perform one or more of the following:
- [ ] Update/add unit or integration tests.
- [ ] Include a screenshot showing what the graph looked like before and
after your changes.
- [ ] Include console log trace showing what happened before and after
your changes.

If you are changing a node or relationship:
- [ ] Update the
[schema](https://github.com/lyft/cartography/tree/master/docs/root/modules)
and
[readme](https://github.com/lyft/cartography/blob/master/docs/schema/README.md).

If you are implementing a new intel module:
- [ ] Use the NodeSchema [data
model](https://cartography-cncf.github.io/cartography/dev/writing-intel-modules.html#defining-a-node).

Co-authored-by: = <=>
Co-authored-by: Alex Chantavy <[email protected]>
Signed-off-by: chandanchowdhury <[email protected]>
### Summary

This PR extends the GitHub graph with the repo access that is granted to
all users who have a 'direct' affiliation to the repo.

Cartography
[already maps](https://cartography-cncf.github.io/cartography/modules/github/schema.html#id6)
some direct user repo access, but only for collaborators
with an 'outside' affiliation to the repo. This PR broadens that to
include all collaborators, i.e. anybody with a 'direct' affiliation. This
follows GitHub's naming for these concepts, as seen
[here](https://docs.github.com/en/graphql/reference/enums#collaboratoraffiliation).

For people newer to GitHub, note that this PR focuses on access that
users are granted directly to a repo, as opposed to via a team. Access
granted via a team is outside the scope of this PR.
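
For readers who want to see where the affiliation comes in, the sketch below builds a GitHub GraphQL query that filters a repo's collaborators by the `CollaboratorAffiliation` enum linked above. The function name and page size are hypothetical, and this is not necessarily the query used in this PR.

```python
# Values of GitHub's CollaboratorAffiliation enum.
VALID_AFFILIATIONS = ("OUTSIDE", "DIRECT", "ALL")


def build_collaborators_query(affiliation: str) -> str:
    """Build a GraphQL query listing a repo's collaborators for one affiliation."""
    if affiliation not in VALID_AFFILIATIONS:
        raise ValueError(f"Unknown affiliation: {affiliation}")
    # Enum values are written unquoted in GraphQL, so the value is inlined here.
    return f"""
    query($owner: String!, $name: String!, $cursor: String) {{
      repository(owner: $owner, name: $name) {{
        collaborators(affiliation: {affiliation}, first: 50, after: $cursor) {{
          edges {{
            permission
            node {{ login }}
          }}
          pageInfo {{ endCursor hasNextPage }}
        }}
      }}
    }}
    """
```

Calling it once with `"OUTSIDE"` and once with `"DIRECT"` would yield the two collaborator sets discussed above.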

We think this is a valuable addition to the graph for a few reasons,
including:
1. Our analysts want few-to-no users to be granted access directly to
repos, on the thinking that managing access via teams can make access
easier to automate (ie with ABAC/RBAC type logic) and to audit. Graphing
direct-access, regardless of whether a user is within the org or outside
it, will help highlight who to clean up.
2. Longer term, we eventually want to know, from the graph, _all_ access
a user has. This PR is a step in that direction. (In a future PR, we
hope to add a user-team membership relationship. Since Cartography already
maps the team-repo access relationship, we could then have a
user-team-repo graph, and that would complete the picture of user access
in GitHub.)

#### Illustration of the intention

![Cartography AMPS User Direct Repo Access
(3)](https://github.com/user-attachments/assets/83a28a9b-f4f9-40fe-bdc5-153aa5196070)

#### Screencaps

**A REPO WITH OUTSIDE COLLABORATORS**

BEFORE

![CollabsBefore](https://github.com/user-attachments/assets/6806fe08-5e7c-4ced-a8c8-ed43a39566c6)

AFTER

![CollabsAfter](https://github.com/user-attachments/assets/18e9f96d-eb8e-4cad-984a-7e7056615776)

**A REPO WITH NON-OUTSIDE COLLABORATORS**

BEFORE
(no results, because these sorts of users are not graphed)
![Screenshot 2024-11-21 at 5 54
37 PM](https://github.com/user-attachments/assets/d5b371ba-f86e-4a97-a06f-9f62c2548e76)

AFTER
![Screenshot 2024-11-21 at 5 54
25 PM](https://github.com/user-attachments/assets/04941eb1-10ff-4ff9-a95d-172295744978)

**GENERAL COUNTS TO GIVE A SENSE OF CONNECTIONS NOW THERE**

BEFORE

![UserRepoRelsBefore](https://github.com/user-attachments/assets/08d2b96f-41ca-44b6-923d-75247ef09812)

AFTER

![UserRepoRelsAfter](https://github.com/user-attachments/assets/85e3bf51-6a60-4742-865a-454bfab1ef24)

### Related issues or links

None

### Checklist

Provide proof that this works (this makes reviews move faster). Please
perform one or more of the following:
- [X] Update/add unit or integration tests.
- [X] Include a screenshot showing what the graph looked like before and
after your changes.
- [ ] Include console log trace showing what happened before and after
your changes.

If you are changing a node or relationship:
- [X] Update the
[schema](https://github.com/lyft/cartography/tree/master/docs/root/modules)
and
[readme](https://github.com/lyft/cartography/blob/master/docs/schema/README.md).

**N/A** If you are implementing a new intel module:
- [ ] Use the NodeSchema [data
model](https://cartography-cncf.github.io/cartography/dev/writing-intel-modules.html#defining-a-node).

---------

Signed-off-by: Daniel Brauer <[email protected]>
Signed-off-by: chandanchowdhury <[email protected]>
@chandanchowdhury chandanchowdhury force-pushed the topic/chandan/logging_config branch from 3b7f96b to 657a2ee Compare November 27, 2024 04:40
@chandanchowdhury
Collaborator Author

Messed up the DCO fix, will open a new one.

@chandanchowdhury chandanchowdhury deleted the topic/chandan/logging_config branch November 27, 2024 04:55
Development

Successfully merging this pull request may close these issues.

Feature Request: Add time to logs/output