Firewall Policy Metrics, parallel writes, aligned timestamps #871

Merged
25 commits
cdf5355
support for project level VPC firewall metrics
maunope Oct 3, 2022
7f2399b
Added charts to dashboard, fixed a merge glitch, updated readme, remo…
maunope Oct 4, 2022
338ffe4
Merge branch 'GoogleCloudPlatform:master' into maunope/network-dashbo…
maunope Oct 4, 2022
bebe9ed
removed hardcoded debug value
maunope Oct 5, 2022
14da0ee
removed obsolete code - compute api firewalls list
maunope Oct 5, 2022
8d39ded
Merge branch 'master' into maunope/network-dashboards-updates
maunope Oct 5, 2022
c0573ce
fixed PR #856 comments
maunope Oct 7, 2022
2df2bbd
Merge branch 'maunope/network-dashboards-updates' of https://github.c…
maunope Oct 7, 2022
6e128ee
Merge branch 'master' into maunope/network-dashboards-updates
maunope Oct 7, 2022
1f3cfe8
fixed dashboard comments
maunope Oct 7, 2022
ac82a8c
solved pul comments and fixed grouping on firewalls utilization chart
maunope Oct 7, 2022
dffd5b4
Merge branch 'master' into maunope/network-dashboards-updates
maunope Oct 10, 2022
17ee413
Merge branch 'master' into maunope/network-dashboards-updates
aurelienlegrand Oct 10, 2022
82f3daf
Added Firewall Policies Monitoring, added buffered metric writes (CF …
maunope Oct 10, 2022
c52b623
Added Firewall Policies Monitoring
maunope Oct 10, 2022
3b259a9
Merge branch 'maunope/network-dashboards-updates' of https://github.c…
maunope Oct 10, 2022
3608b57
updated dashbaord and readme
maunope Oct 10, 2022
658b34c
Merge branch 'master' of https://github.com/GoogleCloudPlatform/cloud…
maunope Oct 14, 2022
801fe1c
fixes to dashboard
maunope Oct 14, 2022
2248639
fixed merge
maunope Oct 14, 2022
2380761
Merge branch 'master' into maunope/network-dashboards-updates
aurelienlegrand Oct 19, 2022
4d11251
Update README.md
aurelienlegrand Oct 19, 2022
d2e38fb
removed dependency
maunope Oct 19, 2022
9483178
Merge branch 'maunope/network-dashboards-updates' of https://github.c…
maunope Oct 19, 2022
ec26624
Merge branch 'master' of https://github.com/GoogleCloudPlatform/cloud…
maunope Oct 19, 2022
8 changes: 6 additions & 2 deletions blueprints/cloud-operations/network-dashboard/README.md
@@ -26,9 +26,12 @@ Clone this repository, then go through the following steps to create resources:
Once the resources are deployed, go to the following page to see the dashboard: https://console.cloud.google.com/monitoring/dashboards?project=<YOUR-MONITORING-PROJECT>.
A dashboard called "quotas-utilization" should be created.

The Cloud Function runs every 5 minutes by default so you should start getting some data points after a few minutes.
The Cloud Function runs every 10 minutes by default so you should start getting some data points after a few minutes.
You can use the metric explorer to view the data points for the different custom metrics created: https://console.cloud.google.com/monitoring/metrics-explorer?project=<YOUR-MONITORING-PROJECT>.
You can change this frequency by modifying the "schedule_cron" variable in variables.tf.

Note that some charts in the dashboard align values over 1h so you might need to wait 1h to see charts on the dashboard views.

Once done testing, you can clean up resources by running `terraform destroy`.

## Supported limits and quotas
@@ -46,6 +49,7 @@ The Cloud Function currently tracks usage, limit and utilization of:
- Dynamic routes per VPC peering group
- IP utilization per subnet (% of IP addresses used in a subnet)
- VPC firewall rules per project (VPC drill down is available for usage)
- Tuples per Firewall Policy

It writes these values to custom metrics in Cloud Monitoring and creates a dashboard to visualize the current utilization of these metrics in Cloud Monitoring.

@@ -58,4 +62,4 @@ In a future release, we could support:
- Static routes per VPC / per VPC peering group
- Google managed VPCs that are peered with PSA (such as Cloud SQL or Memorystore)

If you are interested in this and/or would like to contribute, please contact [email protected].
If you are interested in this and/or would like to contribute, please contact [email protected].
106 changes: 59 additions & 47 deletions blueprints/cloud-operations/network-dashboard/cloud-function/main.py
@@ -14,13 +14,14 @@
# limitations under the License.
# CFv2 define whether to use Cloud function 2nd generation or 1st generation

import re
from distutils.command.config import config
import os
import time
from google.cloud import monitoring_v3, asset_v1
from google.protobuf import field_mask_pb2
from googleapiclient import discovery
from metrics import ilb_fwrules, instances, networks, metrics, limits, peerings, routes, subnets, vpc_firewalls
from metrics import ilb_fwrules, firewall_policies, instances, networks, metrics, limits, peerings, routes, subnets, vpc_firewalls

CFv2 = False
if CFv2:
@@ -123,6 +124,7 @@ def monitoring_interval():
"asset_client": asset_v1.AssetServiceClient(),
"monitoring_client": monitoring_v3.MetricServiceClient()
},
"series_buffer": []
}


@@ -150,6 +152,7 @@ def main(event, context):
project_quotas_dict = limits.get_quota_project_limit(config)

firewalls_dict = vpc_firewalls.get_firewalls_dict(config)
firewall_policies_dict = firewall_policies.get_firewall_policies_dict(config)

# IP utilization subnet level metrics
subnets.get_subnets(config, metrics_dict)
@@ -160,51 +163,60 @@ def main(event, context):
l7_forwarding_rules_dict = ilb_fwrules.get_forwarding_rules_dict(config, "L7")
subnet_range_dict = networks.get_subnet_ranges_dict(config)

# Per Project metrics
vpc_firewalls.get_firewalls_data(config, metrics_dict, project_quotas_dict,
firewalls_dict)

# Per Network metrics
instances.get_gce_instances_data(config, metrics_dict, gce_instance_dict,
limits_dict['number_of_instances_limit'])
ilb_fwrules.get_forwarding_rules_data(
config, metrics_dict, l4_forwarding_rules_dict,
limits_dict['internal_forwarding_rules_l4_limit'], "L4")
ilb_fwrules.get_forwarding_rules_data(
config, metrics_dict, l7_forwarding_rules_dict,
limits_dict['internal_forwarding_rules_l7_limit'], "L7")
peerings.get_vpc_peering_data(config, metrics_dict,
limits_dict['number_of_vpc_peerings_limit'])
dynamic_routes_dict = routes.get_dynamic_routes(
config, metrics_dict, limits_dict['dynamic_routes_per_network_limit'])

# Per VPC peering group metrics
metrics.get_pgg_data(
config,
metrics_dict["metrics_per_peering_group"]["instance_per_peering_group"],
gce_instance_dict, config["limit_names"]["GCE_INSTANCES"],
limits_dict['number_of_instances_ppg_limit'])
metrics.get_pgg_data(
config, metrics_dict["metrics_per_peering_group"]
["l4_forwarding_rules_per_peering_group"], l4_forwarding_rules_dict,
config["limit_names"]["L4"],
limits_dict['internal_forwarding_rules_l4_ppg_limit'])
metrics.get_pgg_data(
config, metrics_dict["metrics_per_peering_group"]
["l7_forwarding_rules_per_peering_group"], l7_forwarding_rules_dict,
config["limit_names"]["L7"],
limits_dict['internal_forwarding_rules_l7_ppg_limit'])
metrics.get_pgg_data(
config, metrics_dict["metrics_per_peering_group"]
["subnet_ranges_per_peering_group"], subnet_range_dict,
config["limit_names"]["SUBNET_RANGES"],
limits_dict['number_of_subnet_IP_ranges_ppg_limit'])
routes.get_dynamic_routes_ppg(
config, metrics_dict["metrics_per_peering_group"]
["dynamic_routes_per_peering_group"], dynamic_routes_dict,
limits_dict['dynamic_routes_per_peering_group_limit'])

return 'Function executed successfully'
try:

# Per Project metrics
vpc_firewalls.get_firewalls_data(config, metrics_dict, project_quotas_dict,
firewalls_dict)
# Per Firewall Policy metrics
firewall_policies.get_firewal_policies_data(config, metrics_dict,
firewall_policies_dict)
# Per Network metrics
instances.get_gce_instances_data(config, metrics_dict, gce_instance_dict,
limits_dict['number_of_instances_limit'])
ilb_fwrules.get_forwarding_rules_data(
config, metrics_dict, l4_forwarding_rules_dict,
limits_dict['internal_forwarding_rules_l4_limit'], "L4")
ilb_fwrules.get_forwarding_rules_data(
config, metrics_dict, l7_forwarding_rules_dict,
limits_dict['internal_forwarding_rules_l7_limit'], "L7")
peerings.get_vpc_peering_data(config, metrics_dict,
limits_dict['number_of_vpc_peerings_limit'])
dynamic_routes_dict = routes.get_dynamic_routes(
config, metrics_dict, limits_dict['dynamic_routes_per_network_limit'])

# Per VPC peering group metrics
metrics.get_pgg_data(
config,
metrics_dict["metrics_per_peering_group"]["instance_per_peering_group"],
gce_instance_dict, config["limit_names"]["GCE_INSTANCES"],
limits_dict['number_of_instances_ppg_limit'])
metrics.get_pgg_data(
config, metrics_dict["metrics_per_peering_group"]
["l4_forwarding_rules_per_peering_group"], l4_forwarding_rules_dict,
config["limit_names"]["L4"],
limits_dict['internal_forwarding_rules_l4_ppg_limit'])
metrics.get_pgg_data(
config, metrics_dict["metrics_per_peering_group"]
["l7_forwarding_rules_per_peering_group"], l7_forwarding_rules_dict,
config["limit_names"]["L7"],
limits_dict['internal_forwarding_rules_l7_ppg_limit'])
metrics.get_pgg_data(
config, metrics_dict["metrics_per_peering_group"]
["subnet_ranges_per_peering_group"], subnet_range_dict,
config["limit_names"]["SUBNET_RANGES"],
limits_dict['number_of_subnet_IP_ranges_ppg_limit'])
routes.get_dynamic_routes_ppg(
config, metrics_dict["metrics_per_peering_group"]
["dynamic_routes_per_peering_group"], dynamic_routes_dict,
limits_dict['dynamic_routes_per_peering_group_limit'])
except Exception as e:
print("Error writing metrics")
print(e)
finally:
metrics.flush_series_buffer(config)

return 'Function execution completed'


if CFv2:
@@ -214,4 +226,4 @@ def main_http(request):
main(None, None)
else:
if __name__ == "__main__":
main(None, None)
main(None, None)
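
The hunks above replace direct metric writes with calls that append to `config["series_buffer"]`, and flush that buffer in a `finally` block. The `metrics.py` changes are not shown on this page, so the following is only a minimal sketch of the pattern under stated assumptions: `append_to_buffer`, `flush_buffer`, the `write_batch` callback and the batch sizes are hypothetical names and values chosen for illustration, not the blueprint's actual implementation.

```python
import time


def append_to_buffer(config, metric_name, value, labels, timestamp=None):
  """Queue one data point instead of writing it immediately (hypothetical helper)."""
  config["series_buffer"].append({
      "metric": metric_name,
      "value": value,
      "labels": dict(labels),
      # Callers pass one shared timestamp so related points line up in charts.
      "timestamp": time.time() if timestamp is None else timestamp,
  })


def flush_buffer(config, write_batch, batch_size=200):
  """Send buffered points in fixed-size batches, then empty the buffer.

  write_batch is an assumed callback standing in for the real API call.
  Returns the number of batches sent.
  """
  buffer = config["series_buffer"]
  batches = 0
  for start in range(0, len(buffer), batch_size):
    write_batch(buffer[start:start + batch_size])
    batches += 1
  buffer.clear()
  return batches


# Usage: usage, limit and utilization share one timestamp.
config = {"series_buffer": []}
sent = []
now = time.time()
labels = {"project": "example-project", "network_name": "example-vpc"}
append_to_buffer(config, "usage", 3, labels, timestamp=now)
append_to_buffer(config, "limit", 75, labels, timestamp=now)
append_to_buffer(config, "utilization", 3 / 75, labels, timestamp=now)
batch_count = flush_buffer(config, sent.append, batch_size=2)
```

Flushing in `finally`, as the diff does, means points buffered before an exception are still sent. Taking the timestamp once per collection pass keeps the three series of a metric aligned on the same instant.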
@@ -172,3 +172,17 @@ metrics_per_project:
    utilization:
      name: firewalls_per_project_utilization
      description: Number of VPC firewall rules in a project - utilization.
metrics_per_firewall_policy:
  firewall_policy_tuples:
    usage:
      name: firewall_policy_tuples_per_policy_usage
      description: Number of tuples in a firewall policy - usage.
    limit:
      # This limit is not visible through Google APIs, set default_value
      name: firewall_policy_tuples_per_policy_limit
      description: Number of tuples in a firewall policy - limit.
      values:
        default_value: 2000
    utilization:
      name: firewall_policy_tuples_per_policy_utilization
      description: Number of tuples in a firewall policy - utilization.
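
Because the tuple limit is not exposed through an API, utilization has to be derived from the configured `default_value`. A minimal sketch of that arithmetic, assuming the YAML has already been loaded into a nested dict; `tuple_utilization` and the sample count of 500 are illustrative names and values, not part of the blueprint.

```python
def tuple_utilization(metrics_dict, rule_tuple_count):
  """Return usage / limit, or None when no positive limit is configured."""
  try:
    limit = metrics_dict["metrics_per_firewall_policy"][
        "firewall_policy_tuples"]["limit"]["values"]["default_value"]
  except KeyError:
    return None
  if limit is None or limit <= 0:
    return None
  return rule_tuple_count / limit


# Hypothetical input mirroring the YAML structure above.
metrics_dict = {
    "metrics_per_firewall_policy": {
        "firewall_policy_tuples": {
            "limit": {"values": {"default_value": 2000}}
        }
    }
}
utilization = tuple_utilization(metrics_dict, 500)  # 500 / 2000 = 0.25
```

Returning `None` rather than raising lets a caller still report usage when the limit is missing or non-positive.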
@@ -0,0 +1,115 @@
#
# Copyright 2022 Google LLC
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#

import re
import time

from collections import defaultdict
from google.protobuf import field_mask_pb2
from . import metrics, networks, limits


def get_firewall_policies_dict(config: dict):
  '''
    Calls the Asset Inventory API to get all Firewall Policies under the GCP organization

      Parameters:
        config (dict): The dict containing config like clients and limits
      Returns:
        firewall_policies_dict (dictionary of dictionary): Keys are policy ids, subkeys are policy field values
  '''

  firewall_policies_dict = defaultdict(int)
  read_mask = field_mask_pb2.FieldMask()
  read_mask.FromJsonString('name,versionedResources')

  response = config["clients"]["asset_client"].search_all_resources(
      request={
          "scope": f"organizations/{config['organization']}",
          "asset_types": ["compute.googleapis.com/FirewallPolicy"],
          "read_mask": read_mask,
      })
  for resource in response:
    for versioned in resource.versioned_resources:
      # Copy the resource fields into a plain dict, keyed by policy id
      firewall_policy = dict()
      for field_name, field_value in versioned.resource.items():
        firewall_policy[field_name] = field_value
      firewall_policies_dict[firewall_policy['id']] = firewall_policy
  return firewall_policies_dict


def get_firewal_policies_data(config, metrics_dict, firewall_policies_dict):
  '''
    Gets the tuple count of each Firewall Policy and buffers usage, limit and utilization metrics.

      Parameters:
        config (dict): The dict containing config like clients and limits
        metrics_dict (dictionary of dictionary of string: string): metrics names and descriptions.
        firewall_policies_dict (dictionary of dictionary of string: string): Keys are policy ids, subkeys are policy field values
      Returns:
        None
  '''

  current_tuples_limit = None
  try:
    current_tuples_limit = metrics_dict["metrics_per_firewall_policy"][
        "firewall_policy_tuples"]["limit"]["values"]["default_value"]
  except Exception:
    print(
        "Could not determine number of tuples metric limit due to missing default value"
    )
  # Check for None first: comparing None with an int raises TypeError
  if current_tuples_limit is not None and current_tuples_limit <= 0:
    print(
        "Could not determine number of tuples metric limit as default value is <= 0"
    )

  timestamp = time.time()
  for firewall_policy_key in firewall_policies_dict:
    firewall_policy = firewall_policies_dict[firewall_policy_key]

    # The parent may be an organization, a folder, or a project.
    # Organizations and folders carry a "parent" field of the form
    # {organizations,folders}/<id>, which is split into type and id.
    # When "parent" is absent the policy is treated as project-level and
    # the project id is read from selfLink.
    parent = re.search(r"(\w+$)", firewall_policy["parent"]).group(
        1) if "parent" in firewall_policy else re.search(
            r"([\d,a-z,-]+)(\/[\d,a-z,-]+\/firewallPolicies/[\d,a-z,-]*$)",
            firewall_policy["selfLink"]).group(1)
    parent_type = re.search(r"(^\w+)", firewall_policy["parent"]).group(
        1) if "parent" in firewall_policy else "projects"

    metric_labels = {'parent': parent, 'parent_type': parent_type}

    metric_labels["name"] = firewall_policy[
        "displayName"] if "displayName" in firewall_policy else firewall_policy[
            "name"]

    metrics.append_data_to_series_buffer(
        config, metrics_dict["metrics_per_firewall_policy"]
        ["firewall_policy_tuples"]["usage"]["name"],
        firewall_policy['ruleTupleCount'], metric_labels, timestamp=timestamp)
    # Limit and utilization are only meaningful with a positive limit
    if current_tuples_limit is not None and current_tuples_limit > 0:
      metrics.append_data_to_series_buffer(
          config, metrics_dict["metrics_per_firewall_policy"]
          ["firewall_policy_tuples"]["limit"]["name"], current_tuples_limit,
          metric_labels, timestamp=timestamp)
      metrics.append_data_to_series_buffer(
          config, metrics_dict["metrics_per_firewall_policy"]
          ["firewall_policy_tuples"]["utilization"]["name"],
          firewall_policy['ruleTupleCount'] / current_tuples_limit,
          metric_labels, timestamp=timestamp)

  print("Buffered number tuples per Firewall Policy")
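
The parent handling above can be hard to follow in its nested conditional form. The sketch below shows the same idea on made-up inputs. Note that it uses `str.partition` and a simpler regex rather than the exact expressions in the diff, and that `split_parent`, the ids and the `selfLink` value are hypothetical and chosen only for illustration.

```python
import re


def split_parent(firewall_policy):
  """Return (parent, parent_type) for a policy resource dict (illustrative helper)."""
  if "parent" in firewall_policy:
    # e.g. "folders/123" -> type "folders", id "123"
    parent_type, _, parent = firewall_policy["parent"].partition("/")
    return parent, parent_type
  # No "parent" field: treat as project-level and read the id from selfLink.
  match = re.search(r"projects/([^/]+)/", firewall_policy["selfLink"])
  return match.group(1), "projects"


org_policy = {"parent": "organizations/1234567890"}
project_policy = {
    "selfLink":
        "https://www.googleapis.com/compute/v1/projects/example-project"
        "/global/firewallPolicies/example-policy"
}
org_result = split_parent(org_policy)
project_result = split_parent(project_policy)
```

Here `org_result` is `("1234567890", "organizations")` and `project_result` is `("example-project", "projects")`, which become the `parent` and `parent_type` metric labels.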
@@ -14,6 +14,8 @@
# limitations under the License.
#

import time

from collections import defaultdict
from google.protobuf import field_mask_pb2
from . import metrics, networks, limits
@@ -75,15 +77,17 @@ def get_forwarding_rules_data(config, metrics_dict, forwarding_rules_dict,
Returns:
None
'''
for project in config["monitored_projects"]:
network_dict = networks.get_networks(config, project)

timestamp = time.time()
for project_id in config["monitored_projects"]:
network_dict = networks.get_networks(config, project_id)

current_quota_limit = limits.get_quota_current_limit(
config, f"projects/{project}", config["limit_names"][layer])
config, f"projects/{project_id}", config["limit_names"][layer])

if current_quota_limit is None:
print(
f"Could not write {layer} forwarding rules to metric for projects/{project} due to missing quotas"
f"Could not determine {layer} forwarding rules to metric for projects/{project_id} due to missing quotas"
)
continue

@@ -95,20 +99,24 @@ def get_forwarding_rules_data(config, metrics_dict, forwarding_rules_dict,
usage = 0
if net['self_link'] in forwarding_rules_dict:
usage = forwarding_rules_dict[net['self_link']]
metrics.write_data_to_metric(
config, project, usage, metrics_dict["metrics_per_network"]

metric_labels = {
'project': project_id,
'network_name': net['network_name']
}
metrics.append_data_to_series_buffer(
config, metrics_dict["metrics_per_network"]
[f"{layer.lower()}_forwarding_rules_per_network"]["usage"]["name"],
net['network_name'])
metrics.write_data_to_metric(
config, project, net['limit'], metrics_dict["metrics_per_network"]
usage, metric_labels, timestamp=timestamp)
metrics.append_data_to_series_buffer(
config, metrics_dict["metrics_per_network"]
[f"{layer.lower()}_forwarding_rules_per_network"]["limit"]["name"],
net['network_name'])
metrics.write_data_to_metric(
config, project, usage / net['limit'],
metrics_dict["metrics_per_network"]
net['limit'], metric_labels, timestamp=timestamp)
metrics.append_data_to_series_buffer(
config, metrics_dict["metrics_per_network"]
[f"{layer.lower()}_forwarding_rules_per_network"]["utilization"]
["name"], net['network_name'])
["name"], usage / net['limit'], metric_labels, timestamp=timestamp)

print(
f"Wrote number of {layer} forwarding rules to metric for projects/{project}"
)
f"Buffered number of {layer} forwarding rules to metric for projects/{project_id}"
)