Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Github "Direct" User Repo Access #1391

Merged
merged 7 commits into from
Nov 25, 2024

Conversation

danbrauer
Copy link
Contributor

@danbrauer danbrauer commented Nov 20, 2024

Summary

This PR adds to the Github graph, adding repo access that is granted to all users who have a 'direct' affiliation to the repo.

Cartography currently does already map some direct user repo access, but only for collaborators with an 'outside' affiliation to the repo. This PR broadens that to include all collaborators, aka anybody with a 'direct' affiliation. This follows Github's naming for these concepts, as seen here.

In case it is unclear or for people newer to Github, note: this is focusing on access users are granted directly to a repo, as opposed to via a team. Access granted via team is outside the scope of this PR.

We think this is a valuable addition to the graph for a few reasons, including:

  1. Our analysts want few-to-no users to be granted access directly to repos, on the thinking that managing access via teams can make access easier to automate (ie with ABAC/RBAC type logic) and to audit. Graphing direct-access, regardless of whether a user is within the org or outside it, will help highlight who to clean up.
  2. Longer term, we eventually want to know, from the graph, all access a user has. This PR is a step in that direction. (In a future PR, we hope to add a user-team membership relationship. Since Cartography maps team-repo access rel, we could then have a user-team-repo graph, and that would complete the picture of user access in Github.)

Illustration of the intention

Cartography AMPS User Direct Repo Access (3)

Screencaps

A REPO WITH OUTSIDE COLLABORATORS

BEFORE
CollabsBefore

AFTER
CollabsAfter

A REPO WITH NON-OUTSIDE COLLABORATORS

BEFORE
(no results, because these sorts of users are not graphed)
Screenshot 2024-11-21 at 5 54 37 PM

AFTER
Screenshot 2024-11-21 at 5 54 25 PM

GENERAL COUNTS TO GIVE A SENSE OF CONNECTIONS NOW THERE

BEFORE
UserRepoRelsBefore

AFTER
UserRepoRelsAfter

Related issues or links

None

Checklist

Provide proof that this works (this makes reviews move faster). Please perform one or more of the following:

  • Update/add unit or integration tests.
  • Include a screenshot showing what the graph looked like before and after your changes.
  • Include console log trace showing what happened before and after your changes.

If you are changing a node or relationship:

N/A/ If you are implementing a new intel module:

@danbrauer danbrauer marked this pull request as draft November 20, 2024 21:02
@danbrauer
Copy link
Contributor Author

danbrauer commented Nov 20, 2024

This is partial and not yet working. Tests and docs are not yet updated, and I haven't reviewed my own code. I am sharing it early because I wanted to ask a couple of questions, which I will share in slack soon.

@danbrauer danbrauer force-pushed the gh_user_repo_relationships branch 2 times, most recently from 170dc34 to f94f00f Compare November 21, 2024 22:11
@danbrauer danbrauer marked this pull request as ready for review November 21, 2024 23:28
@danbrauer
Copy link
Contributor Author

This should be ready for review. I do still intend to do a review of my own for typos and the like but short of stuff like that, I think this is ready.

Comment on lines 23 to 32
@dataclass(frozen=False)
class UserAffiliationAndRepoPermission:
"""
Representation of a user's permission level and affiliation to a GitHub repo. See:
- Permission: https://docs.github.com/en/graphql/reference/enums#repositorypermission
- Affiliation: https://docs.github.com/en/graphql/reference/enums#collaboratoraffiliation
"""
user: Dict
permission: str # WRITE, MAINTAIN, ADMIN, etc
affiliation: str # OUTSIDE, DIRECT
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was led to creating this from a similar thing done in teams.py. There, a namedtuple is used. Here, I used a dataclass. I could go back to using a namedtuple if that'd be preferred?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah let's use a namedtuple instead so that it matches

Copy link
Contributor

@achantavy achantavy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is great, thank you!

Left a minor note about changing the dataclass to a namedtuple to be consistent, but otherwise this is great.

Comment on lines 23 to 32
@dataclass(frozen=False)
class UserAffiliationAndRepoPermission:
"""
Representation of a user's permission level and affiliation to a GitHub repo. See:
- Permission: https://docs.github.com/en/graphql/reference/enums#repositorypermission
- Affiliation: https://docs.github.com/en/graphql/reference/enums#collaboratoraffiliation
"""
user: Dict
permission: str # WRITE, MAINTAIN, ADMIN, etc
affiliation: str # OUTSIDE, DIRECT
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah let's use a namedtuple instead so that it matches

@danbrauer danbrauer force-pushed the gh_user_repo_relationships branch from bbc72fa to 7be764f Compare November 24, 2024 03:33
@danbrauer
Copy link
Contributor Author

danbrauer commented Nov 24, 2024

Updated to named tuple as request. If all looks good, please merge. Otherwise, I'll be back on Tue and happy to work through whatever there is.

@achantavy
Copy link
Contributor

Thank you!

Including a note for future searchers: we deferred adding the data model here. See this thread for details: https://cloud-native.slack.com/archives/C080M2LRLDA/p1732137038521919

@achantavy achantavy merged commit 774d67b into cartography-cncf:master Nov 25, 2024
7 checks passed
chandanchowdhury pushed a commit to chandanchowdhury/cartography that referenced this pull request Nov 27, 2024
### Summary

This PR adds to the Github graph, adding repo access that is granted to
all users who have a 'direct' affiliation to the repo.

Cartography currently
[does](https://cartography-cncf.github.io/cartography/modules/github/schema.html#id6)
already map some direct user repo access, but only for collaborators
with an 'outside' affiliation to the repo. This PR broadens that to
include all collaborators, aka anybody with a 'direct' affiliation. This
follows Github's naming for these concepts, as seen
[here](https://docs.github.com/en/graphql/reference/enums#collaboratoraffiliation).

In case it is unclear or for people newer to Github, note: this is
focusing on access users are granted directly to a repo, as opposed to
via a team. Access granted via team is outside the scope of this PR.

We think this is a valuable addition to the graph for a few reasons,
including:
1. Our analysts want few-to-no users to be granted access directly to
repos, on the thinking that managing access via teams can make access
easier to automate (ie with ABAC/RBAC type logic) and to audit. Graphing
direct-access, regardless of whether a user is within the org or outside
it, will help highlight who to clean up.
2. Longer term, we eventually want to know, from the graph, _all_ access
a user has. This PR is a step in that direction. (In a future PR, we
hope to add a user-team membership relationship. Since Cartography maps
team-repo access rel, we could then have a user-team-repo graph, and
that would complete the picture of user access in Github.)

#### Illustration of the intention

![Cartography AMPS User Direct Repo Access
(3)](https://github.com/user-attachments/assets/83a28a9b-f4f9-40fe-bdc5-153aa5196070)

#### Screencaps

**A REPO WITH OUTSIDE COLLABORATORS**

BEFORE

![CollabsBefore](https://github.com/user-attachments/assets/6806fe08-5e7c-4ced-a8c8-ed43a39566c6)

AFTER

![CollabsAfter](https://github.com/user-attachments/assets/18e9f96d-eb8e-4cad-984a-7e7056615776)

**A REPO WITH NON-OUTSIDE COLLABORATORS**

BEFORE
(no results, because these sorts of users are not graphed)
![Screenshot 2024-11-21 at 5 54
37 PM](https://github.com/user-attachments/assets/d5b371ba-f86e-4a97-a06f-9f62c2548e76)

AFTER
![Screenshot 2024-11-21 at 5 54
25 PM](https://github.com/user-attachments/assets/04941eb1-10ff-4ff9-a95d-172295744978)

**GENERAL COUNTS TO GIVE A SENSE OF CONNECTIONS NOW THERE**

BEFORE

![UserRepoRelsBefore](https://github.com/user-attachments/assets/08d2b96f-41ca-44b6-923d-75247ef09812)

AFTER

![UserRepoRelsAfter](https://github.com/user-attachments/assets/85e3bf51-6a60-4742-865a-454bfab1ef24)

### Related issues or links

None

### Checklist

Provide proof that this works (this makes reviews move faster). Please
perform one or more of the following:
- [X] Update/add unit or integration tests.
- [X] Include a screenshot showing what the graph looked like before and
after your changes.
- [ ] Include console log trace showing what happened before and after
your changes.

If you are changing a node or relationship:
- [X] Update the
[schema](https://github.com/lyft/cartography/tree/master/docs/root/modules)
and
[readme](https://github.com/lyft/cartography/blob/master/docs/schema/README.md).

**N/A/** If you are implementing a new intel module:
- [ ] Use the NodeSchema [data
model](https://cartography-cncf.github.io/cartography/dev/writing-intel-modules.html#defining-a-node).

---------

Signed-off-by: Daniel Brauer <[email protected]>
Signed-off-by: chandanchowdhury <[email protected]>
def transform(repos_json: List[Dict]) -> Dict:
def transform(
repos_json: List[Dict], direct_collaborators: dict[str, List[UserAffiliationAndRepoPermission]],
outside_collaborators: dict[str, List[UserAffiliationAndRepoPermission]],
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@achantavy this resulted in a breaking change as it is changing the signature of the function without providing a default value nor allowing to pass None as param.
Should we update this code? or this should be enforced for next versions?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the breaking change in the Lyft internal deployment of cartography? If so share the stack trace with me internally. I thought I accounted for the changes; sorry, I probably missed something.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants