Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

GitHub intel updates to error handling #1303

Merged
merged 8 commits into from
May 13, 2024
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
6 changes: 4 additions & 2 deletions cartography/intel/aws/ec2/snapshots.py
Original file line number Diff line number Diff line change
Expand Up @@ -42,8 +42,10 @@ def get_snapshots(boto3_session: boto3.session.Session, region: str, in_use_snap
snapshots.extend(page['Snapshots'])
except ClientError as e:
if e.response['Error']['Code'] == 'InvalidSnapshot.NotFound':
logger.warning(f"Failed to retrieve page of in-use, \
not owned snapshots. Continuing anyway. Error - {e}")
logger.warning(
f"Failed to retrieve page of in-use, \
not owned snapshots. Continuing anyway. Error - {e}",
)
else:
raise

Expand Down
9 changes: 6 additions & 3 deletions cartography/intel/github/teams.py
Original file line number Diff line number Diff line change
Expand Up @@ -57,10 +57,13 @@ def _get_team_repos_for_multiple_teams(

team_repos = _get_team_repos(org, api_url, token, team_name) if repo_count > 0 else None

# Shape = [(repo_url, 'WRITE'), ...]]
repo_urls = [t['url'] for t in team_repos.nodes] if team_repos else []
repo_permissions = [t['permission'] for t in team_repos.edges] if team_repos else []
repo_urls = []
repo_permissions = []
if team_repos:
repo_urls = [t['url'] for t in team_repos.nodes] if team_repos.nodes else []
repo_permissions = [t['permission'] for t in team_repos.edges] if team_repos.edges else []

# Shape = [(repo_url, 'WRITE'), ...]]
result[team_name] = list(zip(repo_urls, repo_permissions))
return result

Expand Down
34 changes: 26 additions & 8 deletions cartography/intel/github/util.py
Original file line number Diff line number Diff line change
Expand Up @@ -81,12 +81,12 @@ def call_github_api(query: str, variables: str, token: str, api_url: str) -> Dic


def fetch_page(
token: str,
api_url: str,
organization: str,
query: str,
cursor: Optional[str] = None,
**kwargs: Any,
token: str,
api_url: str,
organization: str,
query: str,
cursor: Optional[str] = None,
**kwargs: Any,
kledo-lyft marked this conversation as resolved.
Show resolved Hide resolved
) -> Dict[str, Any]:
"""
Return a single page of max size 100 elements from the Github api_url using the given `query` and `cursor` params.
Expand Down Expand Up @@ -139,6 +139,7 @@ def fetch_all(
"""
cursor = None
has_next_page = True
org_data: Dict[str, Any] = {}
data: PaginatedGraphqlData = PaginatedGraphqlData(nodes=[], edges=[])
retry = 0

Expand Down Expand Up @@ -170,6 +171,15 @@ def fetch_all(
time.sleep(2 ** retry)
continue

if 'data' not in resp:
logger.warning(
f'Got no "data" attribute in response: {resp}. '
f'Stopping requests for organization: {organization} and '
f'resource_type: {resource_type}',
)
has_next_page = False
continue
Comment on lines +174 to +181
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this a known issue that could happen? It's like a key in a dictionary that we expect but doesn't exist; and that can happen in many other places. Wonder if we know more about this particular case to suspect it might happen here.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

data is the root element from GH GraphQL API.
Unfortunately, there is no official documentation explicitly describing the response 🙃 .
But, in case it doesn't exists, the code cannot get any data to ingest.

Copy link
Contributor Author

@serge-wq serge-wq May 10, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As @heryxpc said, GraphQL response usually contain data as the root attribute in the response which contains the actual requested data. This is a convention I've seen across GraphQL, not only for GitHub. Another thing that I've seen across GraphQL APIs is that if there is no data available, the attributes are returned as undefined, null or not even present. That's why this check here ill help us know what happened.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, this issue is what has been breaking the GitHub crons in our infra. As it's unclear in which page it happens, I'm just breaking out the loop gracefully and making sure we at least got the org_data.


resource = resp['data']['organization'][resource_type]
if resource_inner_type:
resource = resp['data']['organization'][resource_type][resource_inner_type]
Expand All @@ -180,6 +190,14 @@ def fetch_all(

cursor = resource['pageInfo']['endCursor']
has_next_page = resource['pageInfo']['hasNextPage']

org_data = {'url': resp['data']['organization']['url'], 'login': resp['data']['organization']['login']}
Comment on lines -183 to -184
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assume the change here is that: in the past, this information is retrieved in the last.
We change this to collect this info on every page if not already collected.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

org_data is mandatory to perform the full ingestion of Github info to Cartography: https://github.com/lyft/cartography/blob/fa1be115c80e83cefe0e31be5d738b5cda9a0887/cartography/intel/github/users.py#L116
I believe below code ensures that it exists on any page, otherwise it breaks the full process before starting ingestion.

if not org_data:
org_data = {
'url': resp['data']['organization']['url'],
'login': resp['data']['organization']['login'],
}

if not org_data:
raise ValueError(
f"Didn't get any organization data for organization: {organization} and resource_type: {resource_type}",
)
return data, org_data
6 changes: 4 additions & 2 deletions tests/integration/cartography/test_util.py
Original file line number Diff line number Diff line change
Expand Up @@ -35,15 +35,17 @@ def test_merge_module_sync_metadata(mock_stat_incr, neo4j_session):
stat_handler=stat_handler,
)
# Assert
nodes = neo4j_session.run(f"""
nodes = neo4j_session.run(
f"""
MATCH (m:ModuleSyncMetadata{{id:'AWSAccount_{TEST_ACCOUNT_ID}_S3Bucket'}})
RETURN
m.id,
m.syncedtype,
m.grouptype,
m.groupid,
m.lastupdated
""")
""",
)
# Assert
actual_nodes = {
(
Expand Down
Loading