moat is a monitoring tool that scans clusters for basic health metrics and malicious activity with Prometheus and AWS CloudWatch. Users can view potential threats on a dashboard featuring Grafana panels to quickly pinpoint and remediate issues. We embarked on this effort to learn more about Docker, Kubernetes, AWS, Prometheus, and Grafana. Building moat deepened our understanding of these technologies and of the importance of applying security best practices to any Kubernetes cluster.
Our application is still in development. So far, we have built a Kubernetes cluster using Amazon EKS to simulate a real-world scenario, then deployed Prometheus to scan our cluster and log essential metrics. Our dashboard currently displays general health metrics from the Kubernetes cluster through Grafana panels. Because there are many potential entry points to a cluster, we decided to focus our first security feature on identifying failed AWS login attempts, which could indicate a brute-force attack.
The instructions below will guide you through many of the same steps we took to build out our test environment and monitoring system. They involve creating a Kubernetes cluster on AWS, deploying an application to it, and scanning it with Prometheus. CloudWatch alarms will allow you to identify when a user exceeds a specified login attempt threshold. You can use our front-end dashboard to display your Grafana panels and view the data that Prometheus scrapes from your cluster.
├── root
│ ├── moat-dashboard
│ ├── test-env-aws-eb-rds-deployment
│ ├── test-env-manual-deployment
- The `moat-dashboard` directory contains our front-end dashboard. This is where you will need to run `npm install` to install the dependencies required to view the dashboard in your browser.
- The `test-env-aws-eb-rds-deployment` directory contains the config files we used to launch our Kubernetes cluster in AWS using Elastic Beanstalk and RDS. Feel free to use them for your own deployments, services, etc. if you decide to take this route.
- The `test-env-manual-deployment` directory contains the config files we used to launch a Kubernetes cluster manually. Feel free to use them for your own deployments, services, etc. if you decide to take this route.
- Fork and clone this repo.
- To view the moat dashboard in your browser, navigate to the `moat-dashboard` directory and install the dependencies there:
npm install
- Launch moat from the command line:
npm run dev
The following instructions are for building a test environment on AWS. Keep in mind, this will cost money. If you want a free alternative, try building a test environment locally with minikube. You can leverage the files in our test environment directories to deploy a cluster with the AWS CLI and YAML files OR with AWS Elastic Beanstalk and RDS:
- Manual configuration: test-env-manual-deployment
- AWS Elastic Beanstalk/RDS configuration: test-env-aws-eb-rds-deployment
- Create an AWS account and set up your IAM roles.
  - While all actions can be done in the root user’s account, it is NOT recommended to use the root user for anything other than setting up IAM roles.
  - Set up a user group with the AdministratorAccess policy and set up any additional IAM user roles from there. Follow the principle of least privilege when granting permissions.
- Create a Kubernetes cluster on AWS using EKS. Once the cluster has been spun up, you will use the AWS console or the AWS CLI to add, edit, and stop deployments.
- Create a Docker image of your application (or use the example CodeForge app in our repo).
- Push your image to either ECR or Docker Hub. Remember the repo and organization name; they will be important when creating the .yaml files for deploying the EC2 instance to your k8s cluster.
- Deploy your application to an AWS EC2 instance using Elastic Beanstalk, or your preferred method.
- Set up your database with RDS, deploy your own database to the cluster, or use your preferred method.
- Deploy your EC2 instance and database to your Kubernetes cluster.
- Configure the nginx-ingress controller to serve the EC2 instance through an external URI. Make sure that the ports being exposed are the same ports being used by your application (CodeForge uses port 3000).
- After testing, REMEMBER to tear down all unused AWS resources. AWS WILL charge you for them.
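To illustrate the deployment and port-matching steps above, here is a minimal sketch that builds a Deployment and a Service for an app listening on port 3000 and prints them as JSON, which `kubectl apply -f` accepts alongside YAML. The image reference, resource names, and labels are placeholders chosen for illustration; they are not the values used in our config files.

```python
import json


def build_manifests(image, app_name="codeforge", port=3000, replicas=1):
    """Return a kubectl-compatible List holding a Deployment and a Service.

    Every name here is a placeholder; swap in your own image and labels.
    """
    labels = {"app": app_name}
    deployment = {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": f"{app_name}-deployment", "labels": labels},
        "spec": {
            "replicas": replicas,
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {
                    "containers": [
                        {
                            "name": app_name,
                            "image": image,
                            # Must match the port the application listens on.
                            "ports": [{"containerPort": port}],
                        }
                    ]
                },
            },
        },
    }
    service = {
        "apiVersion": "v1",
        "kind": "Service",
        "metadata": {"name": f"{app_name}-service"},
        "spec": {
            # Routes traffic to pods carrying the same labels as the Deployment.
            "selector": labels,
            "ports": [{"port": port, "targetPort": port}],
        },
    }
    return {"apiVersion": "v1", "kind": "List", "items": [deployment, service]}


if __name__ == "__main__":
    # "your-org/your-repo:latest" is a hypothetical image reference.
    print(json.dumps(build_manifests("your-org/your-repo:latest"), indent=2))
```

You could redirect the output to a file and apply it with `kubectl apply -f`, then point your ingress at the generated Service.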
- Install the AWS CLI and configure your AWS credentials.
  - After the AWS CLI is installed, go to AWS account > security credentials > access keys > create new access key.
  - Create an IAM role with administrator access and make sure all services and users have been assigned to use this role.
  - Run the command `aws configure` and enter your AWS credentials. (If you run into permissions issues, you can use the root user's credentials to configure AWS, but this is NOT recommended because it is not secure.)
- Install and set up kubectl (Kubernetes command-line tool).
- Install and set up Helm (Kubernetes package manager).
- Install and set up eksctl (CLI for Amazon EKS).
  - After installation, you can either create a new cluster or pull an existing one.
  - To pull an existing cluster, run the command `eksctl get cluster --name your-cluster-name --region your-region`. Make sure that everything on AWS is using the same region!
- Update your kubeconfig and connect to the EKS cluster.
  - Run the command `aws eks update-kubeconfig --name your-cluster-name --region your-region`.
  - You can verify this connection by running `kubectl get nodes`.
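If you would rather script the connection check than read the `kubectl get nodes` table by eye, the sketch below parses kubectl's JSON output and lists the nodes that report a Ready condition. It assumes kubectl is on your PATH and already pointed at your cluster; the function names are our own and purely illustrative.

```python
import json
import subprocess


def ready_node_names(nodes_json):
    """Return the names of nodes whose Ready condition has status "True"."""
    ready = []
    for node in nodes_json.get("items", []):
        conditions = node.get("status", {}).get("conditions", [])
        if any(c.get("type") == "Ready" and c.get("status") == "True" for c in conditions):
            ready.append(node["metadata"]["name"])
    return ready


def fetch_nodes():
    """Run kubectl against the current kubeconfig context and parse its JSON output."""
    result = subprocess.run(
        ["kubectl", "get", "nodes", "-o", "json"],
        check=True,
        capture_output=True,
        text=True,
    )
    return json.loads(result.stdout)


# Example usage once your kubeconfig is updated:
#   print(ready_node_names(fetch_nodes()))
```

An empty list means kubectl reached the cluster but no node is Ready yet; a `CalledProcessError` usually points to a credentials or region mismatch.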
The following commands install Prometheus and Grafana OSS (not Grafana Cloud) on your Kubernetes cluster.
- Add Helm Stable Charts for your local client:
helm repo add stable https://charts.helm.sh/stable
- Add Prometheus Helm repo:
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
- Create Prometheus namespace:
kubectl create namespace prometheus
- Install Prometheus:
helm install stable prometheus-community/kube-prometheus-stack -n prometheus
- Verify by running
kubectl get pods -n prometheus
- Edit Prometheus and Grafana service files to use LoadBalancer:
kubectl get svc -n prometheus
- Grafana will be installed along with Prometheus, so no need to install it separately
kubectl edit svc stable-kube-prometheus-sta-prometheus -n prometheus
- At the bottom of this service file, under spec, change type from ClusterIP to `type: LoadBalancer`. Under status, change to `loadBalancer: {}`. Save the file.
- Verify type and status have been changed:
kubectl get svc -n prometheus
- Now do the same for Grafana:
kubectl edit svc stable-grafana -n prometheus
- Change type and status from ClusterIP to LoadBalancer, then run `kubectl get svc -n prometheus`. This will provide a URL to access both the Prometheus and Grafana servers.
- Grafana default login credentials:
- Username: admin
- Password: prom-operator
- Access secrets by running `kubectl get secrets -n prometheus`
- Prometheus should already be configured as a data source in Grafana.
- On the top search bar click Import Dashboard.
- Under Import from Grafana.com, enter `15760`.
- Select Prometheus as the data source. This will load the pre-configured dashboard Kubernetes/Views/Pods.
- Create your own dashboard panels by querying Prometheus using PromQL, or use other pre-configured dashboards.
- Update Grafana configuration to allow embedding.
- Navigate to the grafana-configmap.yaml in the repo, and run the following command:
kubectl apply -f /path_to/grafana-configmap.yaml
- Verify changes are reflected in Grafana: Home > Administration > Settings > security > allow_embedding=true
- Remove time stamp from embedded URL.
- You can now embed dashboard panels on an external website!
Source for installing Prometheus on EKS
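Before building panels, it can help to run a PromQL query directly against the Prometheus server you exposed above. The following is a minimal sketch using the Prometheus HTTP API's instant-query endpoint (`/api/v1/query`); the base URL is a placeholder for your LoadBalancer address, and the helper names are our own.

```python
import json
import urllib.parse
import urllib.request


def build_query_url(base_url, promql):
    """Build an instant-query URL for the Prometheus HTTP API."""
    query_string = urllib.parse.urlencode({"query": promql})
    return f"{base_url.rstrip('/')}/api/v1/query?{query_string}"


def parse_vector(response):
    """Turn an instant-vector response into (labels, float value) pairs."""
    if response.get("status") != "success":
        raise ValueError(f"query failed: {response.get('error', 'unknown error')}")
    return [(s["metric"], float(s["value"][1])) for s in response["data"]["result"]]


def query(base_url, promql, timeout=10):
    """Send the query and return the parsed samples."""
    with urllib.request.urlopen(build_query_url(base_url, promql), timeout=timeout) as resp:
        return parse_vector(json.load(resp))


# Example usage (replace the placeholder host with your Prometheus URL):
#   for labels, value in query("http://your-prometheus-host:9090", "up"):
#       print(labels.get("job"), value)
```

The `up` metric is a convenient first query because Prometheus records it for every scrape target: 1 when the target was reachable, 0 when it was not.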
Once you have Grafana configured and your cluster data from Prometheus is being displayed in your dashboard, you should be able to embed iframes of key metrics into the moat dashboard.
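The "remove time stamp" step can be scripted. The sketch below assumes the time stamp in question is the fixed `from` and `to` query parameters that Grafana adds to a panel's embed URL; with those removed, the panel should fall back to the dashboard's default time range. The URL in the usage example is hypothetical.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def strip_time_range(embed_url):
    """Drop the fixed `from`/`to` query parameters from a Grafana embed URL."""
    parts = urlsplit(embed_url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key not in ("from", "to")
    ]
    return urlunsplit(parts._replace(query=urlencode(kept)))


if __name__ == "__main__":
    # Hypothetical embed URL copied from a panel's share dialog.
    url = (
        "http://grafana.example.com/d-solo/abc123/pods"
        "?orgId=1&from=1700000000000&to=1700003600000&panelId=2"
    )
    print(strip_time_range(url))
```

The cleaned URL can then be used as the `src` of an iframe in the moat dashboard.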
- Set up a CloudWatch alarm for sign-in failures in AWS.
- Create an SNS (Simple Notification Service) topic in AWS. This topic will be used to send notifications when alarms are triggered.
- In your CloudWatch alarm settings, configure actions to be taken when the alarm state changes to "ALARM." Add an action to publish a message to your SNS topic.
- Set up a Lambda function that receives CloudWatch alarm notifications and exposes them in Grafana. Here is the Python code we used:

import json

import boto3


def lambda_handler(event, context):
    # Extract information from the CloudWatch alarm event.
    # The keys below assume the EventBridge "CloudWatch Alarm State Change"
    # event shape; adjust them if your trigger delivers a different payload.
    try:
        detail = event['detail']
        # e.g. 'Console sign-in failures alarm'
        alarm_name = detail['alarmName']
        # e.g. 'Raises alarms if more than 3 console sign-in failures occur in 5 minutes'
        alarm_description = detail['configuration']['description']
        new_state = detail['state']['value']
    except KeyError as e:
        # Handle missing keys gracefully
        return {
            'statusCode': 400,
            'body': json.dumps(f'Error: Missing key {e} in event.')
        }

    # Check if the alarm has entered the ALARM state
    if new_state == 'ALARM':
        # Define the action to take when the alarm is triggered
        action_to_take = "Take action to address console sign-in failures."
        # You can add your custom logic or notifications here
        # For example, sending a notification using SNS
        sns_client = boto3.client('sns')
        topic_arn = 'YOUR-ARN'
        message = f"Alarm '{alarm_name}' triggered: {alarm_description}\nAction: {action_to_take}"
        sns_client.publish(
            TopicArn=topic_arn,
            Message=message,
            Subject=f"Alarm '{alarm_name}' Triggered"
        )
        # You can also perform other actions or integrations as needed
        # Return a response
        response = {
            'statusCode': 200,
            'body': json.dumps('Alarm triggered successfully.')
        }
    else:
        # Return a response indicating that the alarm is not in ALARM state
        response = {
            'statusCode': 200,
            'body': json.dumps('Alarm is not in ALARM state.')
        }
    return response
- Install the CloudWatch Data Source plugin for Grafana. This plugin allows Grafana to fetch data from CloudWatch.
- Go to the Grafana home page, click on the gear icon (⚙️) on the left sidebar to access the configuration.
- Choose "Data Sources" and then click "Add data source."
- Search for "CloudWatch" and select it.
- Configure the CloudWatch data source with your AWS credentials and settings.
- In Grafana, set up alerting rules for the panels on your dashboards. You can define alert conditions based on CloudWatch data. When an alert condition is met, Grafana can trigger actions, such as sending notifications or changing the state of a panel.
- Test the entire setup by triggering a CloudWatch alarm with excessive login attempts.
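For reference, the alarm from the first step could also be defined in code. This is a rough sketch that only builds the keyword arguments for boto3's `put_metric_alarm`; it assumes you already have a CloudTrail metric filter publishing a sign-in failure count, and the namespace and metric name below are placeholders for whatever your filter uses. The alarm name and description mirror the ones used in this section.

```python
def build_alarm_params(topic_arn, threshold=3, period_seconds=300):
    """Return keyword arguments for CloudWatch's put_metric_alarm call."""
    minutes = period_seconds // 60
    return {
        "AlarmName": "Console sign-in failures alarm",
        "AlarmDescription": (
            f"Raises alarms if more than {threshold} console sign-in failures "
            f"occur in {minutes} minutes"
        ),
        # Placeholder namespace and metric: use the names from your own metric filter.
        "Namespace": "CloudTrailMetrics",
        "MetricName": "ConsoleSigninFailureCount",
        "Statistic": "Sum",
        "Period": period_seconds,
        "EvaluationPeriods": 1,
        "Threshold": threshold,
        "ComparisonOperator": "GreaterThanThreshold",
        # Quiet periods with no data points should not raise the alarm.
        "TreatMissingData": "notBreaching",
        # Publish to the SNS topic when the alarm enters the ALARM state.
        "AlarmActions": [topic_arn],
    }


# Example usage (requires boto3 and configured AWS credentials):
#   import boto3
#   boto3.client("cloudwatch").put_metric_alarm(**build_alarm_params("YOUR-ARN"))
```

Keeping the parameters in a plain function makes it easy to adjust the threshold or period while testing without touching the AWS console.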
Name | GitHub | LinkedIn |
---|---|---|
Anil Kondaveeti | https://github.com/Akon530 | http://www.linkedin.com/in/anil-kondaveeti-23175320b |
Gayle Martin | https://github.com/gaylem | https://www.linkedin.com/in/gaylem/ |
Ivy Shmikler | https://github.com/ishmikler | http://www.linkedin.com/in/ivy-shmikler |
Max Weiner | https://github.com/maxweiner02 | https://www.linkedin.com/in/max-j-weiner/ |
Meredith Frazier Britt | https://github.com/mfrazb | https://www.linkedin.com/in/meredithfrazierbritt/ |
If you wish to contribute, or just learn from our progress, you are more than welcome! Please follow these guidelines:
- Fork and clone the repository.
- CREATE BRANCH with the format:
[!IMPORTANT] category/your-branch-name-here

Category | Description |
---|---|
hotfix | for quickly fixing critical issues, usually with a temporary solution |
bugfix | for fixing a bug |
feature | for adding, removing or modifying a feature |
test | for experimenting with something that is not an issue |

- Guidelines for commit messages:
  - Capitalize first word
  - Use active voice: “Create sidebar component”
  - Give why/how context when helpful to other developers
  - Commit early and often
  - Use multi-author commits if you paired with another developer on your contribution
- DID YOU ADD ANY SENSITIVE INFORMATION TO CODE? Before you commit, move your sensitive data to a .env file and add .env to your .gitignore file.
- COMMIT when you make a meaningful change and use the guidelines.
- When you are ready to push your code, pull down dev and merge your code BEFORE pushing.
- Submit a pull request to the dev branch and fill out the pull request template (feature or bug).
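If you want to check a branch name against the category/your-branch-name-here format before pushing, a small validator along these lines may help. The categories come from the table above; treating the name part as lowercase words separated by hyphens is our assumption based on the example format, not a rule enforced by the repo.

```python
import re

# Categories listed in the branch naming table.
CATEGORIES = ("hotfix", "bugfix", "feature", "test")

# Assumed shape: <category>/<lowercase-words-separated-by-hyphens>
BRANCH_PATTERN = re.compile(
    rf"(?:{'|'.join(CATEGORIES)})/[a-z0-9]+(?:-[a-z0-9]+)*"
)


def is_valid_branch(name):
    """Return True if the branch name follows category/your-branch-name-here."""
    return BRANCH_PATTERN.fullmatch(name) is not None


if __name__ == "__main__":
    for candidate in ("feature/add-login-alarm", "docs/readme", "bugfix/Fix_Typo"):
        print(candidate, is_valid_branch(candidate))
```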
Here's our wishlist of features and tasks, in case you wish to contribute to this project:
Task | Description |
---|---|
Front and back end testing | Add Cypress and Jest tests for user navigation (front end) and server connection (back end) |
Sign-up, login, and authentication for moat | Implement basic sign-up functionality, as well as login and authentication, for the moat dashboard |
Lockout mechanism for excessive login attempts | Write back end middleware to lock out user from test environment application (CodeForge) after 3 login attempts |
Simulate brute force login attempts | Write script that simulates attempts to gain access to our cluster via our test environment's login portal and our dashboard login portal |