Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Identifying Enterprise Owners in Github #1373

Closed
wants to merge 0 commits into from

Conversation

danbrauer
Copy link
Contributor

Summary

This PR adds to the Github graph, marking users as enterprise owners.

We think this is a valuable addition to the graph in general, because these users are not all necessarily visible in the graph at the moment but have broad access. Less generally (but still maybe relevant to others) our analysts at Etsy need to review these users as part of our UAR (User Access Review) process, which we hope Cartography will eventually help to power.

We wanted to do this in a light-touch way, without breaking existing relationships or removing properties. We also wanted to follow how similar properties are graphed on the user node: org ownership, for example, is noted by the 'user.role' property; similarly, the 'user.is_site_admin' property notes whether a user is a site admin). To that end, we did the following:

  1. add an 'is_enterprise_owner' property to all user nodes
  2. add a new type of user-org relationship: 'UNAFFILIATED'. The terminology is Github's, and it is used for enterprise owners who are not also members of the graphed organization.

Here is an illustration of before/after (I will also add some screencap below but thought the high-level illustration might help):
Cartography AMPS User Owns Enterprise (1)

Other notes on the PR

  1. I refactored the integration tests, taking cues from how the testing for Github teams was done by testing the 'sync' function as a whole instead of just the 'load' function.
  2. In general I tried to do things in keeping with the style I saw around me. I am happy to change anything.
  3. In our slack conversation, it was mentioned PRs should use the new models. I’d already written this when I read that, but, when I looked I saw there are no models for this. Is that okay? Should they be added and, if so, could it be in a separate PR or must it be here?

Related issues or links

None.

Screencaps

(I could get other screencaps... if anything would be helpful, please let me know.)

In this case was helpful that we had an enterprise owner who was also a user in one of our orgs, but not another. I highlighted them specifically in a query here, showing both the new property and relationship type.

Before
User Org Before

After
User Org After

Checklist

Provide proof that this works (this makes reviews move faster). Please perform one or more of the following:

  • Update/add unit or integration tests.
  • Include a screenshot showing what the graph looked like before and after your changes.
  • Include console log trace showing what happened before and after your changes.

If you are changing a node or relationship:

  • Update the schema and readme. NOTE: I updated the schema but not the README, which seemed like it was out of date, did not already include github, and suggested using a javascript dependency to update it... please advise, if this needs update. 😄

If you are implementing a new intel module:

@danbrauer
Copy link
Contributor Author

danbrauer commented Oct 31, 2024

When first creating this PR, all the checks worked except the DCO check, which prompted me to sign off on my commits. In doing that, I rebased and, I think I did something wrong. This PR here, the actual contents of the code and changes, look as expected. But I seemed to have touched / pulled in commits other than my own. 😭 I am chatting with Alex a little later, and will hold off on trying to fix this lest I make it worse.

UPDATE: we chatted, it is okay 🤣

@danbrauer
Copy link
Contributor Author

Also after discussion, it was determined this PR could benefit from us trying to add in the newer-style models, and using those. Github User does not have those models yet, but we can reference what was done for Github Team. I will do that — aiming for sometime next week.

@achantavy
Copy link
Contributor

@danbrauer lemme know if you'd like to get on a call to talk through how to adopt the model

@achantavy
Copy link
Contributor

@danbrauer - I thought about this and I think we have a way forward. In our conversation, you made a good point where the data model is interesting for GitHub users. Here,

  1. GitHub users can exist independent of any GitHub organization
  2. users can be associated with more than one organization
  3. users can have differing privilege levels with each organization they are associated with
  4. and GitHub has a weird concept called "UNAFFILIATED" where a user can be an "enterprise owner" of an organization but not a member of that org. (ref). It is possible for a user to manage organization settings and yet not appear in the member list.

In this case, the sub_resource_relationship idea does not make sense. I would also argue that it does not make sense to have a cleanup job apply for a GitHub user mostly because of reason (1) above. As a counter-example, a GitHub team only exists if its org exists, so it makes sense for a GitHub team to have a sub_resource_relationship to its GitHub org.

So, my proposal to model GitHub users correctly would be to have a GitHubUserNodeSchema where sub_resource_relationship is left undefined. We would supply the githubuser-to-githuborg relationships to the other_relationships field, whether they're MEMBER_OF or UNAFFILIATED or ADMIN_OF or whatever. It would look like

@dataclass(frozen=True)
class GitHubUserSchema(CartographyNodeSchema):
    label: str = 'GitHubUser'
    properties: GitHubUserNodeProperties = GitHubUserNodeProperties()
    other_relationships: OtherRelationships = OtherRelationships([
        GitHubUserMemberOfOrg(),
        GitHubUserUnaffiliatedOrg(),
        GitHubUserAdminOfOrg(),
        ... <others> ...
    ])

This means that GitHubUsers will not have automatic cleanup jobs apply to them, so when a user gets offboarded from an organization, their node may appear to be an "orphan" node in the graph with no other associations.

I think this is fine and should work. Let me know if I missed something.

cc: @ramonpetgrave64 since you were working with me when we were designing the data model.

@ramonpetgrave64
Copy link
Collaborator

ramonpetgrave64 commented Nov 7, 2024

@achantavy in a way, every user is their own GitHub org, so I think we could attach every user node to their own org node. This also opens up the possibility of syncing the repos that may belong to a user.

Do we automatically cleanup these sub resource/tenancy nodes?

@achantavy
Copy link
Contributor

@ramonpetgrave64

in a way, every user is their own GitHub org

100%

Do we automatically cleanup these sub resource/tenancy nodes?

We do not, and that's fine since there are relatively few of them compared to the others in the graph.

I think we could attach every user node to their own org node. This also opens up the possibility of syncing the repos that may belong to a user.

Syncing the repos that belong to a user is a valid scenario. I'm not sure I like the idea of attaching every user node to their own "virtual tenancy" node though because that is an entity that doesn't actually exist.


Zooming out, I think we need to have the data model support automatic cleanup jobs where a sub_resource relationship is not defined. In this case, we would not delete stale nodes but we would delete stale relationships. I'll think on this a little bit. For now, I think @danbrauer can continue working on the design described here while I figure out how to handle the cleanup story for objects without sub_resource relationships.

@danbrauer
Copy link
Contributor Author

😅 I am cleaning up some stuff in our fork. Please ignore for now. I will re-open this PR (perhaps as a new PR but otherwise it will be identical here, while being cleaner on our end of things). I'll link for any new PR to here. Sorry for the noise.

@danbrauer
Copy link
Contributor Author

The new PR is here: #1378

Also thank you for the discussion above. I will consider what @achantavy said and if I have a comment I will make it in the new PR but link back to here.

achantavy pushed a commit that referenced this pull request Nov 19, 2024
### Preamble

This PR is a copy of #1373, identical in content to that PR as of the
time of this new PR's creation. (That PR will now appear empty because
of some cleanup I did on my fork... and some learning I got on how forks
work. 😄). That PR does have some relevant discussion, which I suppose we
will continue here. Apologies for the noise, it'll hopefully be avoided
in future PRs.

### Summary

This PR adds to the Github graph, marking users as [enterprise
owners](https://docs.github.com/en/enterprise-cloud@latest/admin/managing-accounts-and-repositories/managing-users-in-your-enterprise/roles-in-an-enterprise#enterprise-owners).

We think this is a valuable addition to the graph in general, because
these users are not all necessarily visible in the graph at the moment
but have broad access. Less generally (but still maybe relevant to
others) our analysts at Etsy need to review these users as part of our
UAR (User Access Review) process, which we hope Cartography will
eventually help to power.

We wanted to do this in a light-touch way, without breaking existing
relationships or removing properties. We also wanted to follow how
similar properties are graphed on the user node: org ownership, for
example, is noted by the 'user.role' property; similarly, the
'user.is_site_admin' property notes whether a user is a site admin). To
that end, we did the following:

1. add an 'is_enterprise_owner' property to all user nodes
2. add a new type of user-org relationship: 'UNAFFILIATED'. The
[terminology](https://docs.github.com/en/graphql/reference/enums#roleinorganization)
is Github's, and it is used for enterprise owners who are not also
members of the graphed organization.

Here is an illustration of before/after (I will also add some screencap
below but thought the high-level illustration might help):
![Cartography AMPS User Owns Enterprise
(1)](https://github.com/user-attachments/assets/dc943ab5-2a95-4f76-a39a-6b9f6262169b)

### Other notes on the PR

1. I refactored the integration tests, taking cues from how the testing
for Github teams was done by testing the 'sync' function as a whole
instead of just the 'load' function.
1. In general I tried to do things in keeping with the style I saw
around me. I am happy to change anything.
1. In our slack conversation, it was mentioned PRs should use the new
models. I’d already written this when I read that, but, when I looked I
saw there are no models for this. Is that okay? Should they be added
and, if so, could it be in a separate PR or must it be here?

### Related issues or links

None.

### Screencaps

_(I could get other screencaps... if anything would be helpful, please
let me know.)_

In this case was helpful that we had an enterprise owner who was also a
user in one of our orgs, but not another. I highlighted them
specifically in a query here, showing both the new property and
relationship type.

**Before**
![User Org
Before](https://github.com/user-attachments/assets/f21bb2a9-8d3e-45ed-bc7b-112b21bd304a)

**After**
![User Org
After](https://github.com/user-attachments/assets/637eb10d-20f5-4aa6-9c3d-626b480b3014)




### Checklist

Provide proof that this works (this makes reviews move faster). Please
perform one or more of the following:
- [X] Update/add unit or integration tests.
- [X] Include a screenshot showing what the graph looked like before and
after your changes.
- [ ] Include console log trace showing what happened before and after
your changes.

If you are changing a node or relationship:
- [X] Update the
[schema](https://github.com/lyft/cartography/tree/master/docs/root/modules)
and
[readme](https://github.com/lyft/cartography/blob/master/docs/schema/README.md).
**NOTE: I updated the schema but not the README, which seemed like it
was out of date, did not already include github, and suggested using a
javascript dependency to update it... please advise, if this needs
update.** 😄

If you are implementing a new intel module:
- **N/A** [ ] Use the NodeSchema [data
model](https://cartography-cncf.github.io/cartography/dev/writing-intel-modules.html#defining-a-node).

---------

Signed-off-by: Daniel Brauer <[email protected]>
chandanchowdhury pushed a commit to chandanchowdhury/cartography that referenced this pull request Nov 27, 2024
### Preamble

This PR is a copy of cartography-cncf#1373, identical in content to that PR as of the
time of this new PR's creation. (That PR will now appear empty because
of some cleanup I did on my fork... and some learning I got on how forks
work. 😄). That PR does have some relevant discussion, which I suppose we
will continue here. Apologies for the noise, it'll hopefully be avoided
in future PRs.

### Summary

This PR adds to the Github graph, marking users as [enterprise
owners](https://docs.github.com/en/enterprise-cloud@latest/admin/managing-accounts-and-repositories/managing-users-in-your-enterprise/roles-in-an-enterprise#enterprise-owners).

We think this is a valuable addition to the graph in general, because
these users are not all necessarily visible in the graph at the moment
but have broad access. Less generally (but still maybe relevant to
others) our analysts at Etsy need to review these users as part of our
UAR (User Access Review) process, which we hope Cartography will
eventually help to power.

We wanted to do this in a light-touch way, without breaking existing
relationships or removing properties. We also wanted to follow how
similar properties are graphed on the user node: org ownership, for
example, is noted by the 'user.role' property; similarly, the
'user.is_site_admin' property notes whether a user is a site admin). To
that end, we did the following:

1. add an 'is_enterprise_owner' property to all user nodes
2. add a new type of user-org relationship: 'UNAFFILIATED'. The
[terminology](https://docs.github.com/en/graphql/reference/enums#roleinorganization)
is Github's, and it is used for enterprise owners who are not also
members of the graphed organization.

Here is an illustration of before/after (I will also add some screencap
below but thought the high-level illustration might help):
![Cartography AMPS User Owns Enterprise
(1)](https://github.com/user-attachments/assets/dc943ab5-2a95-4f76-a39a-6b9f6262169b)

### Other notes on the PR

1. I refactored the integration tests, taking cues from how the testing
for Github teams was done by testing the 'sync' function as a whole
instead of just the 'load' function.
1. In general I tried to do things in keeping with the style I saw
around me. I am happy to change anything.
1. In our slack conversation, it was mentioned PRs should use the new
models. I’d already written this when I read that, but, when I looked I
saw there are no models for this. Is that okay? Should they be added
and, if so, could it be in a separate PR or must it be here?

### Related issues or links

None.

### Screencaps

_(I could get other screencaps... if anything would be helpful, please
let me know.)_

In this case was helpful that we had an enterprise owner who was also a
user in one of our orgs, but not another. I highlighted them
specifically in a query here, showing both the new property and
relationship type.

**Before**
![User Org
Before](https://github.com/user-attachments/assets/f21bb2a9-8d3e-45ed-bc7b-112b21bd304a)

**After**
![User Org
After](https://github.com/user-attachments/assets/637eb10d-20f5-4aa6-9c3d-626b480b3014)

### Checklist

Provide proof that this works (this makes reviews move faster). Please
perform one or more of the following:
- [X] Update/add unit or integration tests.
- [X] Include a screenshot showing what the graph looked like before and
after your changes.
- [ ] Include console log trace showing what happened before and after
your changes.

If you are changing a node or relationship:
- [X] Update the
[schema](https://github.com/lyft/cartography/tree/master/docs/root/modules)
and
[readme](https://github.com/lyft/cartography/blob/master/docs/schema/README.md).
**NOTE: I updated the schema but not the README, which seemed like it
was out of date, did not already include github, and suggested using a
javascript dependency to update it... please advise, if this needs
update.** 😄

If you are implementing a new intel module:
- **N/A** [ ] Use the NodeSchema [data
model](https://cartography-cncf.github.io/cartography/dev/writing-intel-modules.html#defining-a-node).

---------

Signed-off-by: Daniel Brauer <[email protected]>
Signed-off-by: chandanchowdhury <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants