This guide walks you through the setup and execution process for configuring SuperSet with SafeSquid Reporting. The setup.sh script automates the configuration of the components required for the setup, including Monit for log synchronization and monitoring.
Note: Ensure you are logged in as the root user before performing any of the following actions. Root access is required for configuring services, modifying system files, and setting up log synchronization.
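For example, if you are logged in as an unprivileged user with sudo rights, you can switch to a root shell with:
sudo -i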
First, clone the GitHub repository that contains the necessary files for setup:
git clone https://github.com/SafeSquid-Github/superset-reporting.git
cd superset-reporting
To start the setup process with the default configuration, execute the following command:
bash setup.sh
After the setup completes, verify that the services are running correctly:
systemctl status superset.service
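If you also want to confirm that the SuperSet web interface is responding, you can query its health endpoint. This assumes SuperSet is listening on its default port 8088 on the local host:
curl -i http://127.0.0.1:8088/health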
Once the services are running without any issues, you can configure your log server to pull logs from your proxy server.
To enable the aggregator to sync log files, follow these steps:
- Add SSH Key to SafeSquid Proxy Server
  The SSH key for the log aggregator server can be found in /opt/aggregator/setup_authorized_keys on the log server. Add this key to the /root/.ssh/authorized_keys file on each SafeSquid proxy server to allow secure access (a sample authorized_keys entry is shown after this list).
- Download and Set Up the rrsync Script
  Use curl to download the rrsync script on the SafeSquid proxy server:
  curl -o /usr/local/bin/rrsync https://raw.githubusercontent.com/SafeSquid-Github/superset-reporting/refs/heads/master/scripts/rrsync
  This saves the script to /usr/local/bin/rrsync.
- Set Execute Permissions
  Ensure that rrsync has the appropriate permissions by running:
  chmod 755 /usr/local/bin/rrsync
- Specify Proxy Server IPs
  After adding the authorized key, specify the IP addresses of the proxy servers from which the logs will be pulled. Open the file /opt/aggregator/servers.list on the log server and enter each proxy server's IP address on a new line. For example:
  192.168.1.10
  192.168.1.11
  192.168.1.12
  This file will allow the aggregator to pull logs from each specified proxy server.
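For reference, an authorized_keys entry that restricts the aggregator's key to rsync-only access through rrsync typically looks like the line below. This is only an illustration: the key material and the log directory (/var/log/safesquid here) are placeholders, and the key provided in /opt/aggregator/setup_authorized_keys may already carry a suitable command prefix.
command="/usr/local/bin/rrsync -ro /var/log/safesquid",no-pty,no-agent-forwarding,no-port-forwarding ssh-ed25519 AAAA...example... aggregator@logserver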
The setup.sh
script automatically configures Monit to monitor and maintain the log synchronization process. Here’s what Monit will do:
- Log File Monitoring: Monit checks the /var/log/sync.log file to ensure logs are synced:
  - If sync.log does not exist, Monit will create it.
  - If sync.log is older than an hour, Monit will trigger sync.sh to update the logs and insert.sh to insert the data into the databases.
- Server List Monitoring: Monit also monitors the /opt/aggregator/servers.list file:
  - If servers.list is modified (e.g., a new IP is added), Monit will execute sync.sh to pull updated logs.
This Monit setup helps ensure your logs stay up-to-date, providing accurate data for SuperSet reports without manual intervention.
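For reference, a Monit configuration implementing these checks might look like the sketch below. It is only illustrative, not the exact rules installed by setup.sh: the script paths under /opt/aggregator/bin/ are assumptions based on the directory used later in this guide, and the real configuration also invokes insert.sh, which is omitted here for brevity.
check file sync_log with path /var/log/sync.log
    if does not exist then exec "/usr/bin/touch /var/log/sync.log"
    if timestamp > 1 hour then exec "/opt/aggregator/bin/sync.sh"
check file servers_list with path /opt/aggregator/servers.list
    if changed timestamp then exec "/opt/aggregator/bin/sync.sh"
You can confirm that Monit has loaded its checks by running monit summary.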
With these steps complete, your log server is configured to securely pull logs from the SafeSquid proxy servers and generate up-to-date reports.
Once the setup script has completed, activate the virtual environment to ensure all Python dependencies are correctly managed:
source /opt/aggregator/safesquid_reporting/bin/activate
Navigate to the aggregator directory where the scripts are located:
cd /opt/aggregator/bin/
The main.py
script provides a command-line interface for managing the database and logs. Below are the available commands:
Next, you'll need to create the necessary databases for storing logs. The following commands will create databases based on the log type: extended or performance.
Note: The create-database command requires an argument specifying the log type (extended or performance). If you do not provide this argument, you will receive an error:
Usage: main.py create-database [OPTIONS] {extended|performance}
Try 'main.py create-database --help' for help.
Error: Missing argument '{extended|performance}'. Choose from:
extended,
performance
To create the database for extended logs:
Example:
python3 main.py create-database extended
To create the database for performance logs:
Example:
python3 main.py create-database performance
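If you want to confirm that the tables were created, you can list them with psql, assuming the default credentials and database name used by setup.sh:
psql -h 127.0.0.1 -p 5432 -U admin -d safesquid_logs -c '\dt'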
To insert logs into the database at any time, use the following command. Ensure you're in the aggregator directory before executing:
python main.py insert <log_type> <log_file_path>
Replace <log_type> with extended or performance, and <log_file_path> with the path to your log file.
Note: The insert command requires two arguments: the log type and the path to the log file. If either argument is missing, the command will fail with an error.
Example:
python main.py insert extended /var/log/aggregator/rsyslog/extended/192.168.2.10/20240603164101-extended.log
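Similarly, for performance logs (the file path below is only illustrative; use the actual path produced by your aggregator):
python main.py insert performance /var/log/aggregator/rsyslog/performance/192.168.2.10/20240603164101-performance.log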
If you need to customize the setup (e.g., changing default user credentials, host, or database name), you can use the following steps.
First, review the help menu to understand the available options:
bash setup.sh -h
To execute the setup with custom values, use the following command. Replace the placeholders with your desired values:
bash setup.sh -u admin -p password -H 127.0.0.1 -P 5432 -d safesquid_logs -a admin -w password -f admin -l admin -e [email protected] -D /opt/aggregator -v safesquid_reporting
- -u PGUSER: PostgreSQL username (default: admin)
- -p PGPASSWORD: PostgreSQL password (default: safesquid)
- -H PGHOST: PostgreSQL host (default: 127.0.0.1)
- -P PGPORT: PostgreSQL port (default: 5432)
- -d PGDATABASE: PostgreSQL database name (default: safesquid_logs)
- -a ADMIN_USERNAME: Admin username for SuperSet (default: admin)
- -w ADMIN_PASSWORD: Admin password for SuperSet (default: safesquid)
- -f ADMIN_FIRST_NAME: Admin first name (default: admin)
- -l ADMIN_LAST_NAME: Admin last name (default: admin)
- -e ADMIN_EMAIL: Admin email address (default: [email protected])
- -D DIRECTORY_NAME: Directory name for the project (default: /opt/aggregator)
- -v VENV_NAME: Virtual environment path (default: safesquid_reporting)
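You do not need to pass every option; any flag you omit should fall back to the default listed above. For example, to change only the PostgreSQL password and the SuperSet admin password (the values shown are placeholders):
bash setup.sh -p 'MyDbPassword' -w 'MyAdminPassword'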
The main.py
script offers several commands to manage and interact with your database. Below are some of the additional options available:
For further assistance and details on each command:
python3 main.py --help
To clear the database and drop all tables:
python3 main.py clear-database
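Because clear-database drops all tables, recreate them before inserting logs again, for example:
python3 main.py create-database extended
python3 main.py create-database performance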
To analyze the logs stored in the database and retrieve information about the database schema and row counts, use:
python3 main.py analyse-database
For example, analyzing the extended_logs
table may produce output similar to this:
- Table: extended_logs
  - Column: record_id, Type: text
  - Column: client_id, Type: integer
  - Column: request_id, Type: integer
  - Column: date_time, Type: timestamp without time zone
  - Column: elapsed_time, Type: integer
  - Column: status, Type: integer
  - Column: size, Type: integer
  - Column: upload, Type: integer
  - Column: download, Type: integer
  - Column: bypassed, Type: boolean
  - Column: client_ip, Type: text
  - Column: username, Type: text
  - Column: method, Type: text
  - Column: url, Type: text
  - Column: http_referer, Type: text
  - Column: useragent, Type: text
  - Column: mime, Type: text
  - Column: filter_name, Type: text
  - Column: filtering_reason, Type: text
  - Column: interface, Type: text
  - Column: cachecode, Type: text
  - Column: peercode, Type: text
  - Column: peer, Type: text
  - Column: request_host, Type: text
  - Column: request_tld, Type: text
  - Column: referer_host, Type: text
  - Column: referer_tld, Type: text
  - Column: range, Type: text
  - Column: time_profiles, Type: ARRAY
  - Column: user_groups, Type: ARRAY
  - Column: request_profiles, Type: ARRAY
  - Column: application_signatures, Type: ARRAY
  - Column: categories, Type: ARRAY
  - Column: response_profiles, Type: ARRAY
  - Column: upload_content_types, Type: ARRAY
  - Column: download_content_types, Type: ARRAY
  - Column: profiles, Type: ARRAY
  - Row count: 67225