From f893f9a699f8c342acf9e666eb6a37276844051b Mon Sep 17 00:00:00 2001 From: Seth Grover Date: Mon, 24 Apr 2023 08:17:37 -0600 Subject: [PATCH] work in progress for idaholab/Malcolm#173, documentation changes for kubernetes deployment. this commit takes care of the changes for moving stuff from docker-compose.yml into .env configuration files --- docs/README.md | 2 +- docs/asset-interaction-analysis.md | 10 +- docs/authsetup.md | 12 +- docs/contributing-dashboards.md | 2 +- docs/contributing-local-modifications.md | 288 ++++++++++++++++------- docs/contributing-logstash.md | 2 +- docs/contributing-zeek.md | 2 +- docs/file-scanning.md | 14 +- docs/host-config-windows.md | 2 +- docs/ics-best-guess.md | 2 +- docs/index-management.md | 4 +- docs/live-analysis.md | 10 +- docs/malcolm-config.md | 158 +++++++------ docs/malcolm-hedgehog-e2e-iso-install.md | 5 +- docs/malcolm-iso.md | 2 +- docs/malcolm-preparation.md | 2 +- docs/malcolm-upgrade.md | 6 +- docs/opensearch-instances.md | 6 +- docs/quickstart.md | 4 +- docs/running.md | 2 +- docs/severity.md | 10 +- docs/third-party-logs.md | 8 +- docs/ubuntu-install-example.md | 8 +- docs/upload.md | 4 +- malcolm-iso/build.sh | 1 + scripts/configure | 1 + scripts/install.py | 3 + scripts/malcolm_appliance_packager.sh | 1 + 28 files changed, 352 insertions(+), 219 deletions(-) create mode 120000 scripts/configure diff --git a/docs/README.md b/docs/README.md index 07468ed13..cdcc3f161 100644 --- a/docs/README.md +++ b/docs/README.md @@ -24,7 +24,7 @@ For smaller networks, use at home by network security enthusiasts, or in the fie * [Configuration](malcolm-preparation.md#Configuration) - [Recommended system requirements](system-requirements.md#SystemRequirements) - [Malcolm Configuration](malcolm-config.md#ConfigAndTuning) - + [`docker-compose.yml` parameters](malcolm-config.md#DockerComposeYml) + + [Environment Variable Files](malcolm-config.md#MalcolmConfigEnvVars) - [Configure authentication](authsetup.md#AuthSetup) + [Local account management](authsetup.md#AuthBasicAccountManagement) + [Lightweight Directory Access Protocol (LDAP) authentication](authsetup.md#AuthLDAP) diff --git a/docs/asset-interaction-analysis.md b/docs/asset-interaction-analysis.md index 2c2ce86a4..a72884e26 100644 --- a/docs/asset-interaction-analysis.md +++ b/docs/asset-interaction-analysis.md @@ -16,7 +16,7 @@ Please see the [NetBox page on GitHub](https://github.com/netbox-community/netbo ## Enriching network traffic metadata via NetBox lookups -As Zeek logs and Suricata alerts are parsed and enriched (if the `LOGSTASH_NETBOX_ENRICHMENT` [environment variable in `docker-compose.yml`](malcolm-config.md#DockerComposeYml) is set to `true`) the NetBox API will be queried for the associated hosts' information. If found, the information retrieved by NetBox will be used to enrich these logs through the creation of the following new fields. See [the NetBox API](https://demo.netbox.dev/api/docs/) documentation and [the NetBox documentation](https://demo.netbox.dev/static/docs/introduction/). +As Zeek logs and Suricata alerts are parsed and enriched (if the `LOGSTASH_NETBOX_ENRICHMENT` [environment variable in `./config/logstash.env`](malcolm-config.md#MalcolmConfigEnvVars) is set to `true`) the NetBox API will be queried for the associated hosts' information. If found, the information retrieved by NetBox will be used to enrich these logs through the creation of the following new fields. See [the NetBox API](https://demo.netbox.dev/api/docs/) documentation and [the NetBox documentation](https://demo.netbox.dev/static/docs/introduction/). * `destination.…` - `destination.device.cluster` (`/virtualization/clusters/`) (for [Virtual Machine](https://demo.netbox.dev/static/docs/coe-functionality/virtualization/) device types) @@ -28,13 +28,13 @@ As Zeek logs and Suricata alerts are parsed and enriched (if the `LOGSTASH_NETBO - [`destination.device.service`](https://demo.netbox.dev/static/docs/core-functionality/services/#service-templates) (`/ipam/services/`) - `destination.device.site` (`/dcim/sites/`) - `destination.device.url` (`/dcim/devices/`) - - `destination.device.details` (full JSON object, [only with `LOGSTASH_NETBOX_ENRICHMENT_VERBOSE: 'true'`](malcolm-config.md#DockerComposeYml)) + - `destination.device.details` (full JSON object, [only with `LOGSTASH_NETBOX_ENRICHMENT_VERBOSE: 'true'`](malcolm-config.md#MalcolmConfigEnvVars)) - `destination.segment.id` (`/ipam/vrfs/{id}`) - `destination.segment.name` (`/ipam/vrfs/`) - `destination.segment.site` (`/dcim/sites/`) - `destination.segment.tenant` (`/tenancy/tenants/`) - `destination.segment.url` (`/ipam/vrfs/`) - - `destination.segment.details` (full JSON object, [only with `LOGSTASH_NETBOX_ENRICHMENT_VERBOSE: 'true'`](malcolm-config.md#DockerComposeYml)) + - `destination.segment.details` (full JSON object, [only with `LOGSTASH_NETBOX_ENRICHMENT_VERBOSE: 'true'`](malcolm-config.md#MalcolmConfigEnvVars)) * `source.…` same as `destination.…` * collected as `related` fields (the [same approach](https://www.elastic.co/guide/en/ecs/current/ecs-related.html) used in ECS) - `related.device_type` @@ -47,7 +47,7 @@ As Zeek logs and Suricata alerts are parsed and enriched (if the `LOGSTASH_NETBO For Malcolm's purposes, both physical devices and virtualized hosts will be stored as described above: the `device_type` field can be used to distinguish between them. -NetBox has the concept of [sites](https://demo.netbox.dev/static/docs/core-functionality/sites-and-racks/). Sites can have overlapping IP address ranges, of course. The value of the `NETBOX_DEFAULT_SITE` variable in [environment variable in `docker-compose.yml`](malcolm-config.md#DockerComposeYml) will be used as a query parameter for these enrichment lookups. +NetBox has the concept of [sites](https://demo.netbox.dev/static/docs/core-functionality/sites-and-racks/). Sites can have overlapping IP address ranges, of course. The value of the `NETBOX_DEFAULT_SITE` variable in [environment variable in `netbox-common.env`](malcolm-config.md#MalcolmConfigEnvVars) will be used as a query parameter for these enrichment lookups. This feature was implemented as described in [idaholab/Malcolm#132](https://github.com/idaholab/Malcolm/issues/132). @@ -73,7 +73,7 @@ See [idaholab/Malcolm#134](https://github.com/idaholab/Malcolm/issues/134). ## Populate NetBox inventory via passively-gathered network traffic metadata -The purpose of an asset management system is to document the intended state of a network: were Malcolm to actively and agressively populate NetBox with the live network state, a network configuration fault could result in an incorrect documented configuration. The Malcolm development team is investigating what data, if any, should automatically flow to NetBox based on traffic observed (enabled via the `NETBOX_CRON` [environment variable in `docker-compose.yml`](malcolm-config.md#DockerComposeYml)), and what NetBox inventory data could be used, if any, to enrich Malcolm's network traffic metadata. Well-considered suggestions in this area are welcome. +The purpose of an asset management system is to document the intended state of a network: were Malcolm to actively and agressively populate NetBox with the live network state, a network configuration fault could result in an incorrect documented configuration. The Malcolm development team is investigating what data, if any, should automatically flow to NetBox based on traffic observed (enabled via the `NETBOX_CRON` [environment variable in `netbox-common.env`](malcolm-config.md#MalcolmConfigEnvVars)), and what NetBox inventory data could be used, if any, to enrich Malcolm's network traffic metadata. Well-considered suggestions in this area are welcome. See [idaholab/Malcolm#135](https://github.com/idaholab/Malcolm/issues/135). diff --git a/docs/authsetup.md b/docs/authsetup.md index e1d051edd..96b0a1ac1 100644 --- a/docs/authsetup.md +++ b/docs/authsetup.md @@ -12,7 +12,7 @@ With the local basic authentication method, user accounts are managed by Malcolm LDAP authentication are managed on a remote directory service, such as a [Microsoft Active Directory Domain Services](https://docs.microsoft.com/en-us/windows-server/identity/ad-ds/get-started/virtual-dc/active-directory-domain-services-overview) or [OpenLDAP](https://www.openldap.org/). -Malcolm's authentication method is defined in the `x-auth-variables` section near the top of the [`docker-compose.yml`](malcolm-config.md#DockerComposeYml) file with the `NGINX_BASIC_AUTH` environment variable: `true` for local TLS-encrypted HTTP basic authentication, `false` for LDAP authentication. +Malcolm's authentication method is defined in the [`auth-common.env` configuration file](malcolm-config.md#MalcolmConfigEnvVars) file with the `NGINX_BASIC_AUTH` environment variable: `true` for local TLS-encrypted HTTP basic authentication, `false` for LDAP authentication and `no_authentication` to disable authentication completely. In either case, you **must** run `./scripts/auth_setup` before starting Malcolm for the first time in order to: @@ -82,16 +82,16 @@ Authentication over LDAP can be done using one of three ways, [two of which](htt * **LDAPS** - a commonly used (though unofficial and considered deprecated) method in which SSL negotiation takes place before any commands are sent from the client to the server * **Unencrypted** (cleartext) (***not recommended***) -In addition to the `NGINX_BASIC_AUTH` environment variable being set to `false` in the `x-auth-variables` section near the top of the [`docker-compose.yml`](malcolm-config.md#DockerComposeYml) file, the `NGINX_LDAP_TLS_STUNNEL` and `NGINX_LDAP_TLS_STUNNEL` environment variables are used in conjunction with the values in `nginx/nginx_ldap.conf` to define the LDAP connection security level. Use the following combinations of values to achieve the connection security methods above, respectively: +In addition to the `NGINX_BASIC_AUTH` environment variable being set to `false` in the [`auth-common.env` configuration file](malcolm-config.md#MalcolmConfigEnvVars) file, the `NGINX_LDAP_TLS_STUNNEL` and `NGINX_LDAP_TLS_STUNNEL` environment variables are used in conjunction with the values in `nginx/nginx_ldap.conf` to define the LDAP connection security level. Use the following combinations of values to achieve the connection security methods above, respectively: * **StartTLS** - - `NGINX_LDAP_TLS_STUNNEL` set to `true` in [`docker-compose.yml`](malcolm-config.md#DockerComposeYml) + - `NGINX_LDAP_TLS_STUNNEL` set to `true` in [`auth-common.env`](malcolm-config.md#MalcolmConfigEnvVars) - `url` should begin with `ldap://` and its port should be either the default LDAP port (389) or the default Global Catalog port (3268) in `nginx/nginx_ldap.conf` * **LDAPS** - - `NGINX_LDAP_TLS_STUNNEL` set to `false` in [`docker-compose.yml`](malcolm-config.md#DockerComposeYml) + - `NGINX_LDAP_TLS_STUNNEL` set to `false` in [`auth-common.env`](malcolm-config.md#MalcolmConfigEnvVars) - `url` should begin with `ldaps://` and its port should be either the default LDAPS port (636) or the default LDAPS Global Catalog port (3269) in `nginx/nginx_ldap.conf` * **Unencrypted** (clear text) (***not recommended***) - - `NGINX_LDAP_TLS_STUNNEL` set to `false` in [`docker-compose.yml`](malcolm-config.md#DockerComposeYml) + - `NGINX_LDAP_TLS_STUNNEL` set to `false` in [`auth-common.env`](malcolm-config.md#MalcolmConfigEnvVars) - `url` should begin with `ldap://` and its port should be either the default LDAP port (389) or the default Global Catalog port (3268) in `nginx/nginx_ldap.conf` For encrypted connections (whether using **StartTLS** or **LDAPS**), Malcolm will require and verify certificates when one or more trusted CA certificate files are placed in the `nginx/ca-trust/` directory. Otherwise, any certificate presented by the domain server will be accepted. @@ -102,4 +102,4 @@ When you [set up authentication](#AuthSetup) for Malcolm a set of unique [self-s Another option is to generate your own certificates (or have them issued to you) and have them placed in the `nginx/certs/` directory. The certificate and key file should be named `cert.pem` and `key.pem`, respectively. -A third possibility is to use a third-party reverse proxy (e.g., [Traefik](https://doc.traefik.io/traefik/) or [Caddy](https://caddyserver.com/docs/quick-starts/reverse-proxy)) to handle the issuance of the certificates for you and to broker the connections between clients and Malcolm. Reverse proxies such as these often implement the [ACME](https://datatracker.ietf.org/doc/html/rfc8555) protocol for domain name authentication and can be used to request certificates from certificate authorities like [Let's Encrypt](https://letsencrypt.org/how-it-works/). In this configuration, the reverse proxy will be encrypting the connections instead of Malcolm, so you'll need to set the `NGINX_SSL` environment variable to `false` in [`docker-compose.yml`](malcolm-config.md#DockerComposeYml) (or answer `no` to the "Require encrypted HTTPS connections?" question posed by `install.py`). If you are setting `NGINX_SSL` to `false`, **make sure** you understand what you are doing and ensure that external connections cannot reach ports over which Malcolm will be communicating without encryption, including verifying your local firewall configuration. \ No newline at end of file +A third possibility is to use a third-party reverse proxy (e.g., [Traefik](https://doc.traefik.io/traefik/) or [Caddy](https://caddyserver.com/docs/quick-starts/reverse-proxy)) to handle the issuance of the certificates for you and to broker the connections between clients and Malcolm. Reverse proxies such as these often implement the [ACME](https://datatracker.ietf.org/doc/html/rfc8555) protocol for domain name authentication and can be used to request certificates from certificate authorities like [Let's Encrypt](https://letsencrypt.org/how-it-works/). In this configuration, the reverse proxy will be encrypting the connections instead of Malcolm, so you'll need to set the `NGINX_SSL` environment variable to `false` in [`nginx.env`](malcolm-config.md#MalcolmConfigEnvVars) (or answer `no` to the "Require encrypted HTTPS connections?" question posed by `./scripts/configure`). If you are setting `NGINX_SSL` to `false`, **make sure** you understand what you are doing and ensure that external connections cannot reach ports over which Malcolm will be communicating without encryption, including verifying your local firewall configuration. \ No newline at end of file diff --git a/docs/contributing-dashboards.md b/docs/contributing-dashboards.md index 6f53fb810..f819e7251 100644 --- a/docs/contributing-dashboards.md +++ b/docs/contributing-dashboards.md @@ -32,7 +32,7 @@ Visualizations and dashboards can be [easily created](dashboards.md#BuildDashboa } } ``` -1. Include the new dashboard either by using a [bind mount](contributing-local-modifications.md#Bind) for the `./dashboards./dashboards/` directory or by [rebuilding](development.md#Build) the `dashboards-helper` Docker image. Dashboards are imported the first time Malcolm starts up. +1. Include the new dashboard either by using a [bind mount](contributing-local-modifications.md#Bind) for the `./dashboards/dashboards/` directory or by [rebuilding](development.md#Build) the `dashboards-helper` Docker image. Dashboards are imported the first time Malcolm starts up. ## OpenSearch Dashboards plugins diff --git a/docs/contributing-local-modifications.md b/docs/contributing-local-modifications.md index e8cf2b563..afec0a841 100644 --- a/docs/contributing-local-modifications.md +++ b/docs/contributing-local-modifications.md @@ -1,6 +1,6 @@ # Local modifications -There are several ways to customize Malcolm's runtime behavior via local changes to configuration files. Many commonly-tweaked settings are discussed in the project [README](README.md) (see [`docker-compose.yml` parameters](malcolm-config.md#DockerComposeYml) and [Customizing event severity scoring](severity.md#SeverityConfig) for some examples). +There are several ways to customize Malcolm's runtime behavior via local changes to configuration files. Many commonly-tweaked settings are discussed in the project [README](README.md) (see [Environment Variable Files](malcolm-config.md#MalcolmConfigEnvVars) and [Customizing event severity scoring](severity.md#SeverityConfig) for some examples). ## Docker bind mounts @@ -8,93 +8,205 @@ Some configuration changes can be put in place by modifying local copies of conf ``` $ grep -P "^( - ./| [\w-]+:)" docker-compose-standalone.yml - opensearch: - - ./nginx/ca-trust:/var/local/ca-trust:ro - - ./.opensearch.primary.curlrc:/var/local/curlrc/.opensearch.primary.curlrc:ro - - ./.opensearch.secondary.curlrc:/var/local/curlrc/.opensearch.secondary.curlrc:ro - - ./opensearch/opensearch.keystore:/usr/share/opensearch/config/opensearch.keystore:rw - - ./opensearch:/usr/share/opensearch/data:delegated - - ./opensearch-backup:/opt/opensearch/backup:delegated - dashboards-helper: - - ./nginx/ca-trust:/var/local/ca-trust:ro - - ./.opensearch.primary.curlrc:/var/local/curlrc/.opensearch.primary.curlrc:ro - dashboards: - - ./nginx/ca-trust:/var/local/ca-trust:ro - - ./.opensearch.primary.curlrc:/var/local/curlrc/.opensearch.primary.curlrc:ro - logstash: - - ./nginx/ca-trust:/var/local/ca-trust:ro - - ./.opensearch.primary.curlrc:/var/local/curlrc/.opensearch.primary.curlrc:ro - - ./.opensearch.secondary.curlrc:/var/local/curlrc/.opensearch.secondary.curlrc:ro - - ./logstash/maps/malcolm_severity.yaml:/etc/malcolm_severity.yaml:ro - - ./logstash/certs/ca.crt:/certs/ca.crt:ro - - ./logstash/certs/server.crt:/certs/server.crt:ro - - ./logstash/certs/server.key:/certs/server.key:ro - filebeat: - - ./nginx/ca-trust:/var/local/ca-trust:ro - - ./.opensearch.primary.curlrc:/var/local/curlrc/.opensearch.primary.curlrc:ro - - ./zeek-logs:/zeek - - ./suricata-logs:/suricata - - ./filebeat/certs/ca.crt:/certs/ca.crt:ro - - ./filebeat/certs/client.crt:/certs/client.crt:ro - - ./filebeat/certs/client.key:/certs/client.key:ro - arkime: - - ./nginx/ca-trust:/var/local/ca-trust:ro - - ./.opensearch.primary.curlrc:/var/local/curlrc/.opensearch.primary.curlrc:ro - - ./pcap:/data/pcap - - ./arkime-logs:/opt/arkime/logs - - ./arkime-raw:/opt/arkime/raw - zeek: - - ./nginx/ca-trust:/var/local/ca-trust:ro - - ./pcap:/pcap - - ./zeek-logs/upload:/zeek/upload - - ./zeek-logs/extract_files:/zeek/extract_files - - ./zeek/intel:/opt/zeek/share/zeek/site/intel - zeek-live: - - ./nginx/ca-trust:/var/local/ca-trust:ro - - ./zeek-logs/live:/zeek/live - - ./zeek-logs/extract_files:/zeek/extract_files - - ./zeek/intel:/opt/zeek/share/zeek/site/intel - suricata: - - ./nginx/ca-trust:/var/local/ca-trust:ro - - ./suricata-logs:/var/log/suricata - - ./pcap:/data/pcap - - ./suricata/rules:/opt/suricata/rules:ro - suricata-live: - - ./nginx/ca-trust:/var/local/ca-trust:ro - - ./suricata-logs:/var/log/suricata - - ./suricata/rules:/opt/suricata/rules:ro - file-monitor: - - ./nginx/ca-trust:/var/local/ca-trust:ro - - ./zeek-logs/extract_files:/zeek/extract_files - - ./zeek-logs/current:/zeek/logs - - ./yara/rules:/yara-rules/custom:ro - pcap-capture: - - ./nginx/ca-trust:/var/local/ca-trust:ro - - ./pcap/upload:/pcap - pcap-monitor: - - ./nginx/ca-trust:/var/local/ca-trust:ro - - ./.opensearch.primary.curlrc:/var/local/curlrc/.opensearch.primary.curlrc:ro - - ./zeek-logs:/zeek - - ./pcap:/pcap - upload: - - ./nginx/ca-trust:/var/local/ca-trust:ro - - ./pcap/upload:/var/www/upload/server/php/chroot/files - htadmin: - - ./nginx/ca-trust:/var/local/ca-trust:ro - - ./htadmin/config.ini:/var/www/htadmin/config/config.ini:rw - - ./htadmin/metadata:/var/www/htadmin/config/metadata:rw - - ./nginx/htpasswd:/var/www/htadmin/auth/htpasswd:rw - freq: - - ./nginx/ca-trust:/var/local/ca-trust:ro - api: - - ./nginx/ca-trust:/var/local/ca-trust:ro - - ./.opensearch.primary.curlrc:/var/local/curlrc/.opensearch.primary.curlrc:ro - nginx-proxy: - - ./nginx/ca-trust:/var/local/ca-trust:ro - - ./nginx/nginx_ldap.conf:/etc/nginx/nginx_ldap.conf:ro - - ./nginx/htpasswd:/etc/nginx/htpasswd:ro - - ./nginx/certs:/etc/nginx/certs:ro - - ./nginx/certs/dhparam.pem:/etc/nginx/dhparam/dhparam.pem:ro +opensearch: + - ./config/process.env + - ./config/ssl.env + - ./config/opensearch.env + - ./nginx/ca-trust:/var/local/ca-trust:ro + - ./.opensearch.primary.curlrc:/var/local/curlrc/.opensearch.primary.curlrc:ro + - ./.opensearch.secondary.curlrc:/var/local/curlrc/.opensearch.secondary.curlrc:ro + - ./opensearch:/usr/share/opensearch/data:delegated + - ./opensearch-backup:/opt/opensearch/backup:delegated + - ./opensearch/opensearch.keystore:/usr/share/opensearch/config/persist/opensearch.keystore:rw +dashboards-helper: + - ./config/process.env + - ./config/ssl.env + - ./config/opensearch.env + - ./config/dashboards-helper.env + - ./nginx/ca-trust:/var/local/ca-trust:ro + - ./.opensearch.primary.curlrc:/var/local/curlrc/.opensearch.primary.curlrc:ro + - ./.opensearch.secondary.curlrc:/var/local/curlrc/.opensearch.secondary.curlrc:ro +dashboards: + - ./config/process.env + - ./config/ssl.env + - ./config/opensearch.env + - ./nginx/ca-trust:/var/local/ca-trust:ro + - ./.opensearch.primary.curlrc:/var/local/curlrc/.opensearch.primary.curlrc:ro +logstash: + - ./config/process.env + - ./config/ssl.env + - ./config/opensearch.env + - ./config/netbox-common.env + - ./config/netbox.env + - ./config/beats-common.env + - ./config/lookup-common.env + - ./config/logstash.env + - ./nginx/ca-trust:/var/local/ca-trust:ro + - ./.opensearch.primary.curlrc:/var/local/curlrc/.opensearch.primary.curlrc:ro + - ./.opensearch.secondary.curlrc:/var/local/curlrc/.opensearch.secondary.curlrc:ro + - ./logstash/maps/malcolm_severity.yaml:/etc/malcolm_severity.yaml:ro + - ./logstash/certs/ca.crt:/certs/ca.crt:ro + - ./logstash/certs/server.crt:/certs/server.crt:ro + - ./logstash/certs/server.key:/certs/server.key:ro +filebeat: + - ./config/process.env + - ./config/ssl.env + - ./config/opensearch.env + - ./config/upload-common.env + - ./config/nginx.env + - ./config/beats-common.env + - ./config/filebeat.env + - ./nginx/ca-trust:/var/local/ca-trust:ro + - ./.opensearch.primary.curlrc:/var/local/curlrc/.opensearch.primary.curlrc:ro + - ./zeek-logs:/zeek + - ./suricata-logs:/suricata + - ./filebeat/certs/ca.crt:/certs/ca.crt:ro + - ./filebeat/certs/client.crt:/certs/client.crt:ro + - ./filebeat/certs/client.key:/certs/client.key:ro +arkime: + - ./config/process.env + - ./config/ssl.env + - ./config/opensearch.env + - ./config/upload-common.env + - ./config/auth.env + - ./config/arkime.env + - ./nginx/ca-trust:/var/local/ca-trust:ro + - ./.opensearch.primary.curlrc:/var/local/curlrc/.opensearch.primary.curlrc:ro + - ./pcap:/data/pcap + - ./arkime-logs:/opt/arkime/logs + - ./arkime-raw:/opt/arkime/raw +zeek: + - ./config/process.env + - ./config/ssl.env + - ./config/upload-common.env + - ./config/zeek.env + - ./config/zeek-offline.env + - ./nginx/ca-trust:/var/local/ca-trust:ro + - ./pcap:/pcap + - ./zeek-logs/upload:/zeek/upload + - ./zeek-logs/extract_files:/zeek/extract_files + - ./zeek/intel:/opt/zeek/share/zeek/site/intel +zeek-live: + - ./config/process.env + - ./config/ssl.env + - ./config/upload-common.env + - ./config/pcap-capture.env + - ./config/zeek.env + - ./config/zeek-live.env + - ./nginx/ca-trust:/var/local/ca-trust:ro + - ./zeek-logs/live:/zeek/live + - ./zeek-logs/extract_files:/zeek/extract_files + - ./zeek/intel:/opt/zeek/share/zeek/site/intel +suricata: + - ./config/process.env + - ./config/ssl.env + - ./config/upload-common.env + - ./config/suricata.env + - ./config/suricata-offline.env + - ./nginx/ca-trust:/var/local/ca-trust:ro + - ./suricata-logs:/var/log/suricata + - ./pcap:/data/pcap + - ./suricata/rules:/opt/suricata/rules:ro +suricata-live: + - ./config/process.env + - ./config/ssl.env + - ./config/upload-common.env + - ./config/pcap-capture.env + - ./config/suricata.env + - ./config/suricata-live.env + - ./nginx/ca-trust:/var/local/ca-trust:ro + - ./suricata-logs:/var/log/suricata + - ./suricata/rules:/opt/suricata/rules:ro +file-monitor: + - ./config/process.env + - ./config/ssl.env + - ./config/zeek.env + - ./nginx/ca-trust:/var/local/ca-trust:ro + - ./zeek-logs/extract_files:/zeek/extract_files + - ./zeek-logs/current:/zeek/logs + - ./yara/rules:/yara-rules/custom:ro +pcap-capture: + - ./config/process.env + - ./config/ssl.env + - ./config/pcap-capture.env + - ./nginx/ca-trust:/var/local/ca-trust:ro + - ./pcap/upload:/pcap +pcap-monitor: + - ./config/process.env + - ./config/ssl.env + - ./config/opensearch.env + - ./config/upload-common.env + - ./nginx/ca-trust:/var/local/ca-trust:ro + - ./.opensearch.primary.curlrc:/var/local/curlrc/.opensearch.primary.curlrc:ro + - ./zeek-logs:/zeek + - ./pcap:/pcap +upload: + - ./config/process.env + - ./config/ssl.env + - ./config/auth.env + - ./config/upload.env + - ./nginx/ca-trust:/var/local/ca-trust:ro + - ./pcap/upload:/var/www/upload/server/php/chroot/files +htadmin: + - ./config/process.env + - ./config/ssl.env + - ./config/auth-common.env + - ./nginx/ca-trust:/var/local/ca-trust:ro + - ./htadmin/config.ini:/var/www/htadmin/config/config.ini:rw + - ./htadmin/metadata:/var/www/htadmin/config/metadata:rw + - ./nginx/htpasswd:/var/www/htadmin/auth/htpasswd:rw +freq: + - ./config/process.env + - ./config/ssl.env + - ./config/lookup-common.env + - ./nginx/ca-trust:/var/local/ca-trust:ro +netbox: + - ./config/process.env + - ./config/ssl.env + - ./config/netbox-common.env + - ./config/netbox.env + - ./nginx/ca-trust:/var/local/ca-trust:ro + - ./netbox/config/configuration:/etc/netbox/config:ro + - ./netbox/config/reports:/etc/netbox/reports:ro + - ./netbox/config/scripts:/etc/netbox/scripts:ro + - ./netbox/media:/opt/netbox/netbox/media:rw + - ./net-map.json:/usr/local/share/net-map.json:ro +netbox-postgres: + - ./config/process.env + - ./config/ssl.env + - ./config/netbox-common.env + - ./config/netbox-postgres.env + - ./nginx/ca-trust:/var/local/ca-trust:ro + - ./netbox/postgres:/var/lib/postgresql/data:rw +netbox-redis: + - ./config/process.env + - ./config/ssl.env + - ./config/netbox-common.env + - ./config/netbox-redis.env + - ./nginx/ca-trust:/var/local/ca-trust:ro + - ./netbox/redis:/data +netbox-redis-cache: + - ./config/process.env + - ./config/ssl.env + - ./config/netbox-common.env + - ./config/netbox-redis-cache.env + - ./nginx/ca-trust:/var/local/ca-trust:ro +api: + - ./config/process.env + - ./config/ssl.env + - ./config/opensearch.env + - ./nginx/ca-trust:/var/local/ca-trust:ro + - ./.opensearch.primary.curlrc:/var/local/curlrc/.opensearch.primary.curlrc:ro +nginx-proxy: + - ./config/process.env + - ./config/ssl.env + - ./config/auth-common.env + - ./config/nginx.env + - ./nginx/ca-trust:/var/local/ca-trust:ro + - ./nginx/nginx_ldap.conf:/etc/nginx/nginx_ldap.conf:ro + - ./nginx/htpasswd:/etc/nginx/auth/htpasswd:ro + - ./nginx/certs:/etc/nginx/certs:ro + - ./nginx/certs/dhparam.pem:/etc/nginx/dhparam/dhparam.pem:ro ``` So, for example, if you wanted to make a change to the `nginx-proxy` container's `nginx.conf` file, you could add the following line to the `volumes:` section of the `nginx-proxy` service in your `docker-compose.yml` file: diff --git a/docs/contributing-logstash.md b/docs/contributing-logstash.md index d13c56b34..a628210ac 100644 --- a/docs/contributing-logstash.md +++ b/docs/contributing-logstash.md @@ -28,7 +28,7 @@ Logstash can then be easily extended to add more [`logstash/pipelines`]({{ site. So, in order to add a new **parse pipeline** for `cooltool` after tweaking [`filebeat.yml`]({{ site.github.repository_url }}/blob/{{ site.github.build_revision }}/filebeat/filebeat.yml) as described above, create a `cooltool` directory under [`logstash/pipelines`]({{ site.github.repository_url }}/blob/{{ site.github.build_revision }}/logstash/pipelines) which follows the same pattern as the `zeek` parse pipeline. This directory will have an input file (tiny), a filter file (possibly large), and an output file (tiny). In your filter file, be sure to set the field [`event.hash`](https://www.elastic.co/guide/en/ecs/master/ecs-event.html#field-event-hash) to a unique value to identify indexed documents in OpenSearch; the [fingerprint filter](https://www.elastic.co/guide/en/logstash/current/plugins-filters-fingerprint.html) may be useful for this. -Finally, in your `docker-compose` files, set a new `LOGSTASH_PARSE_PIPELINE_ADDRESSES` environment variable under `logstash-variables` to `cooltool-parse,zeek-parse,suricata-parse,beats-parse` (assuming you named the pipeline address from the previous step `cooltool-parse`) so that logs sent from `filebeat` to `logstash` are forwarded to all parse pipelines. +Finally, in the [`./config/logstash.env` file](malcolm-config.md#MalcolmConfigEnvVars), set a new `LOGSTASH_PARSE_PIPELINE_ADDRESSES` environment variable to `cooltool-parse,zeek-parse,suricata-parse,beats-parse` (assuming you named the pipeline address from the previous step `cooltool-parse`) so that logs sent from `filebeat` to `logstash` are forwarded to all parse pipelines. ## Parsing new Zeek logs diff --git a/docs/contributing-zeek.md b/docs/contributing-zeek.md index db8bfc935..6846bbb46 100644 --- a/docs/contributing-zeek.md +++ b/docs/contributing-zeek.md @@ -2,7 +2,7 @@ ## `local.zeek` -Some Zeek behavior can be tweaked without having to manually edit configuration files through the use of environment variables: search for `ZEEK` in the [`docker-compose.yml` parameters](malcolm-config.md#DockerComposeYml) section of the documentation. +Some Zeek behavior can be tweaked through the use of [environment variables](malcolm-config.md#MalcolmConfigEnvVars) in the `.env` files beginning with `zeek…`. Other changes to Zeek's behavior could be made by modifying [local.zeek]({{ site.github.repository_url }}/blob/{{ site.github.build_revision }}/zeek/config/local.zeek) and either using a [bind mount](contributing-local-modifications.md#Bind) or [rebuilding](development.md#Build) the `zeek` Docker image with the modification. See the [Zeek documentation](https://docs.zeek.org/en/master/quickstart.html#local-site-customization) for more information on customizing a Zeek instance. Note that changing Zeek's behavior could result in changes to the format of the logs Zeek generates, which could break Malcolm's parsing of those logs, so exercise caution. diff --git a/docs/file-scanning.md b/docs/file-scanning.md index 6d6396a9e..7847bb144 100644 --- a/docs/file-scanning.md +++ b/docs/file-scanning.md @@ -1,6 +1,6 @@ # Automatic file extraction and scanning -Malcolm can leverage Zeek's knowledge of network protocols to automatically detect file transfers and extract those files from PCAPs as Zeek processes them. This behavior can be enabled globally by modifying the `ZEEK_EXTRACTOR_MODE` [environment variable in `docker-compose.yml`](malcolm-config.md#DockerComposeYml), or on a per-upload basis for PCAP files uploaded via the [browser-based upload form](upload.md#Upload) when **Analyze with Zeek** is selected. +Malcolm can leverage Zeek's knowledge of network protocols to automatically detect file transfers and extract those files from PCAPs as Zeek processes them. This behavior can be enabled globally by modifying the `ZEEK_EXTRACTOR_MODE` [variable in `zeek.env`](malcolm-config.md#MalcolmConfigEnvVars), or on a per-upload basis for PCAP files uploaded via the [browser-based upload form](upload.md#Upload) when **Analyze with Zeek** is selected. To specify which files should be extracted, the following values are acceptable in `ZEEK_EXTRACTOR_MODE`: @@ -12,17 +12,17 @@ To specify which files should be extracted, the following values are acceptable Extracted files can be examined through any of the following methods: -* submitting file hashes to [**VirusTotal**](https://www.virustotal.com/en/#search); to enable this method, specify the `VTOT_API2_KEY` [environment variable in `docker-compose.yml`](malcolm-config.md#DockerComposeYml) -* scanning files with [**ClamAV**](https://www.clamav.net/); to enable this method, set the `EXTRACTED_FILE_ENABLE_CLAMAV` [environment variable in `docker-compose.yml`](malcolm-config.md#DockerComposeYml) to `true` -* scanning files with [**Yara**](https://github.com/VirusTotal/yara); to enable this method, set the `EXTRACTED_FILE_ENABLE_YARA` [environment variable in `docker-compose.yml`](malcolm-config.md#DockerComposeYml) to `true` -* scanning PE (portable executable) files with [**Capa**](https://github.com/fireeye/capa); to enable this method, set the `EXTRACTED_FILE_ENABLE_CAPA` [environment variable in `docker-compose.yml`](malcolm-config.md#DockerComposeYml) to `true` +* submitting file hashes to [**VirusTotal**](https://www.virustotal.com/en/#search); to enable this method, specify the `VTOT_API2_KEY` [environment variable in `zeek.env`](malcolm-config.md#MalcolmConfigEnvVars) +* scanning files with [**ClamAV**](https://www.clamav.net/); to enable this method, set the `EXTRACTED_FILE_ENABLE_CLAMAV` [environment variable in `zeek.env`](malcolm-config.md#MalcolmConfigEnvVars) to `true` +* scanning files with [**Yara**](https://github.com/VirusTotal/yara); to enable this method, set the `EXTRACTED_FILE_ENABLE_YARA` [environment variable in `zeek.env`](malcolm-config.md#MalcolmConfigEnvVars) to `true` +* scanning PE (portable executable) files with [**Capa**](https://github.com/fireeye/capa); to enable this method, set the `EXTRACTED_FILE_ENABLE_CAPA` [environment variable in `zeek.env`](malcolm-config.md#MalcolmConfigEnvVars) to `true` Files which are flagged via any of these methods will be logged as Zeek `signatures.log` entries, and can be viewed in the **Signatures** dashboard in OpenSearch Dashboards. -The `EXTRACTED_FILE_PRESERVATION` [environment variable in `docker-compose.yml`](malcolm-config.md#DockerComposeYml) determines the behavior for preservation of Zeek-extracted files: +The `EXTRACTED_FILE_PRESERVATION` [environment variable in `zeek.env`](malcolm-config.md#MalcolmConfigEnvVars) determines the behavior for preservation of Zeek-extracted files: * `quarantined`: preserve only flagged files in `./zeek-logs/extract_files/quarantine` * `all`: preserve flagged files in `./zeek-logs/extract_files/quarantine` and all other extracted files in `./zeek-logs/extract_files/preserved` * `none`: preserve no extracted files -The `EXTRACTED_FILE_HTTP_SERVER_…` [environment variables in `docker-compose.yml`](malcolm-config.md#DockerComposeYml) configure access to the Zeek-extracted files path through the means of a simple HTTPS directory server. Beware that Zeek-extracted files may contain malware. As such, the files may be optionally encrypted upon download. \ No newline at end of file +The `EXTRACTED_FILE_HTTP_SERVER_…` [environment variables in `zeek.env`](malcolm-config.md#MalcolmConfigEnvVars) configure access to the Zeek-extracted files path through the means of a simple HTTPS directory server. Beware that Zeek-extracted files may contain malware. As such, the files may be optionally encrypted upon download (and decrypted using `openssl`, e.g., `openssl enc -aes-256-cbc -d -in example.exe.encrypted -out example.exe`) diff --git a/docs/host-config-windows.md b/docs/host-config-windows.md index c024bf79d..9545bea36 100644 --- a/docs/host-config-windows.md +++ b/docs/host-config-windows.md @@ -13,4 +13,4 @@ Installing and configuring [Docker to run under Windows](https://docs.docker.com ## Finish Malcolm's configuration -Once Docker is installed, configured and running as described in the previous section, run [`./scripts/install.py --configure`](malcolm-config.md#ConfigAndTuning) to finish configuration of the local Malcolm installation. Malcolm will be controlled and run from within your WSL distribution's terminal environment. \ No newline at end of file +Once Docker is installed, configured and running as described in the previous section, run [`./scripts/configure`](malcolm-config.md#ConfigAndTuning) to finish configuration of the local Malcolm installation. Malcolm will be controlled and run from within your WSL distribution's terminal environment. \ No newline at end of file diff --git a/docs/ics-best-guess.md b/docs/ics-best-guess.md index 08f5bb0c9..4e7ee6a6e 100644 --- a/docs/ics-best-guess.md +++ b/docs/ics-best-guess.md @@ -8,4 +8,4 @@ Naturally, these lookups could produce false positives, so these connections are ![](./images/screenshots/dashboards_bestguess.png) -This feature is disabled by default, but it can be enabled by clearing (setting to `''`) the value of the `ZEEK_DISABLE_BEST_GUESS_ICS` environment variable in [`docker-compose.yml`](malcolm-config.md#DockerComposeYml). \ No newline at end of file +This feature is disabled by default, but it can be enabled by clearing (setting to `''`) the value of the `ZEEK_DISABLE_BEST_GUESS_ICS` [environment variable in `zeek.env`](malcolm-config.md#MalcolmConfigEnvVars). \ No newline at end of file diff --git a/docs/index-management.md b/docs/index-management.md index 2277fbd07..19441fb4f 100644 --- a/docs/index-management.md +++ b/docs/index-management.md @@ -2,6 +2,6 @@ Malcolm releases prior to v6.2.0 used environment variables to configure OpenSearch [Index State Management](https://opensearch.org/docs/latest/im-plugin/ism/index/) [policies](https://opensearch.org/docs/latest/im-plugin/ism/policies/). -Since then, OpenSearch Dashboards has developed and released plugins with UIs for [Index State Management](https://opensearch.org/docs/latest/im-plugin/ism/index/) and [Snapshot Management](https://opensearch.org/docs/latest/opensearch/snapshots/sm-dashboards/). Because these plugins provide a more comprehensive and user-friendly interfaces for these features, the old environment variable-based configuration code has been removed from Malcolm, with the exception of the code that uses `OPENSEARCH_INDEX_SIZE_PRUNE_LIMIT` and `OPENSEARCH_INDEX_SIZE_PRUNE_NAME_SORT` which deals with deleting the oldest network session metadata indices when the database exceeds a certain size. +Since then, OpenSearch Dashboards has developed and released plugins with UIs for [Index State Management](https://opensearch.org/docs/latest/im-plugin/ism/index/) and [Snapshot Management](https://opensearch.org/docs/latest/opensearch/snapshots/sm-dashboards/). Because these plugins provide a more comprehensive and user-friendly interfaces for these features, the old environment variable-based configuration code has been removed from Malcolm, with the exception of the code that uses the `OPENSEARCH_INDEX_SIZE_PRUNE_LIMIT` and `OPENSEARCH_INDEX_SIZE_PRUNE_NAME_SORT` [variables in `dashboards-helper.env`](malcolm-config.md#MalcolmConfigEnvVars) and which deals with deleting the oldest network session metadata indices when the database exceeds a certain size. -Note that OpenSearch index state management and snapshot management only deals with disk space consumed by OpenSearch indices: it does not have anything to do with PCAP file storage. The `MANAGE_PCAP_FILES` environment variable in the [`docker-compose.yml`](malcolm-config.md#DockerComposeYml) file can be used to allow Arkime to prune old PCAP files based on available disk space. \ No newline at end of file +Note that OpenSearch index state management and snapshot management only deals with disk space consumed by OpenSearch indices: it does not have anything to do with PCAP file storage. The `MANAGE_PCAP_FILES` environment variable in the [`arkime.env` file](malcolm-config.md#MalcolmConfigEnvVars) can be used to allow Arkime to prune old PCAP files based on available disk space. \ No newline at end of file diff --git a/docs/live-analysis.md b/docs/live-analysis.md index a43e161f4..d08702b5b 100644 --- a/docs/live-analysis.md +++ b/docs/live-analysis.md @@ -18,19 +18,19 @@ Please see the [Hedgehog Linux README](hedgehog.md) for more information. ## Monitoring local network interfaces -Malcolm's `pcap-capture`, `suricata-live` and `zeek-live` containers can monitor one or more local network interfaces, specified by the `PCAP_IFACE` environment variable in [`docker-compose.yml`](malcolm-config.md#DockerComposeYml). These containers are started with additional privileges (`IPC_LOCK`, `NET_ADMIN`, `NET_RAW`, and `SYS_ADMIN`) to allow opening network interfaces in promiscuous mode for capture. +Malcolm's `pcap-capture`, `suricata-live` and `zeek-live` containers can monitor one or more local network interfaces, specified by the `PCAP_IFACE` environment variable in [`pcap-capture.env`](malcolm-config.md#MalcolmConfigEnvVars). These containers are started with additional privileges (`IPC_LOCK`, `NET_ADMIN`, `NET_RAW`, and `SYS_ADMIN`) to allow opening network interfaces in promiscuous mode for capture. -The instances of Zeek and Suricata (in the `suricata-live` and `zeek-live` containers when the `SURICATA_LIVE_CAPTURE` and `ZEEK_LIVE_CAPTURE` environment variables in [`docker-compose.yml`](malcolm-config.md#DockerComposeYml) are set to `true`, respectively) analyze traffic on-the-fly and generate log files containing network session metadata. These log files are in turn scanned by Filebeat and forwarded to Logstash for enrichment and indexing into the OpenSearch document store. +The instances of Zeek and Suricata (in the `suricata-live` and `zeek-live` containers when the `SURICATA_LIVE_CAPTURE` and `ZEEK_LIVE_CAPTURE` [environment variables](malcolm-config.md#MalcolmConfigEnvVars) are set to `true`, respectively) analyze traffic on-the-fly and generate log files containing network session metadata. These log files are in turn scanned by Filebeat and forwarded to Logstash for enrichment and indexing into the OpenSearch document store. -In contrast, the `pcap-capture` container buffers traffic to PCAP files and periodically rotates these files for processing (by Arkime's `capture` utlity in the `arkime` container) according to the thresholds defined by the `PCAP_ROTATE_MEGABYTES` and `PCAP_ROTATE_MINUTES` environment variables in [`docker-compose.yml`](malcolm-config.md#DockerComposeYml). If for some reason (e.g., a low resources environment) you also want Zeek and Suricata to process these intermediate PCAP files rather than monitoring the network interfaces directly, you can set `SURICATA_ROTATED_PCAP`/`ZEEK_ROTATED_PCAP` to `true` and `SURICATA_LIVE_CAPTURE`/`ZEEK_LIVE_CAPTURE` to false. +In contrast, the `pcap-capture` container buffers traffic to PCAP files and periodically rotates these files for processing (by Arkime's `capture` utlity in the `arkime` container) according to the thresholds defined by the `PCAP_ROTATE_MEGABYTES` and `PCAP_ROTATE_MINUTES` environment variables in [`pcap-capture.env`](malcolm-config.md#MalcolmConfigEnvVars). If for some reason (e.g., a low resources environment) you also want Zeek and Suricata to process these intermediate PCAP files rather than monitoring the network interfaces directly, you can set `SURICATA_ROTATED_PCAP`/`ZEEK_ROTATED_PCAP` to `true` and `SURICATA_LIVE_CAPTURE`/`ZEEK_LIVE_CAPTURE` to false. -These various options for monitoring traffic on local network interfaces can also be configured by running [`./scripts/install.py --configure`](malcolm-config.md#ConfigAndTuning). +These various options for monitoring traffic on local network interfaces can also be configured by running [`./scripts/configure`](malcolm-config.md#ConfigAndTuning). Note that currently Microsoft Windows and Apple macOS platforms run Docker inside of a virtualized environment. Live traffic capture and analysis on those platforms would require additional configuration of virtual interfaces and port forwarding in Docker which is outside of the scope of this document. ## Manually forwarding logs from an external source -Malcolm's Logstash instance can also be configured to accept logs from a [remote forwarder](https://www.elastic.co/products/beats/filebeat) by running [`./scripts/install.py --configure`](malcolm-config.md#ConfigAndTuning) and answering "yes" to "`Expose Logstash port to external hosts?`." Enabling encrypted transport of these logs files is discussed in [Configure authentication](authsetup.md#AuthSetup) and the description of the `BEATS_SSL` environment variable in the [`docker-compose.yml`](malcolm-config.md#DockerComposeYml) file. +Malcolm's Logstash instance can also be configured to accept logs from a [remote forwarder](https://www.elastic.co/products/beats/filebeat) by running [`./scripts/configure`](malcolm-config.md#ConfigAndTuning) and answering "yes" to "`Expose Logstash port to external hosts?`." Enabling encrypted transport of these logs files is discussed in [Configure authentication](authsetup.md#AuthSetup) and the description of the `BEATS_SSL` environment variable in [`beats-common.env`](malcolm-config.md#MalcolmConfigEnvVars). Configuring Filebeat to forward Zeek logs to Malcolm might look something like this example [`filebeat.yml`](https://www.elastic.co/guide/en/beats/filebeat/current/filebeat-reference-yml.html): ``` diff --git a/docs/malcolm-config.md b/docs/malcolm-config.md index 7817a1419..0fa5d4789 100644 --- a/docs/malcolm-config.md +++ b/docs/malcolm-config.md @@ -1,78 +1,92 @@ # Malcolm Configuration -If you already have Docker and Docker Compose installed, the `install.py` script can still help you tune system configuration and `docker-compose.yml` parameters for Malcolm. To run it in "configuration only" mode, bypassing the steps to install Docker and Docker Compose, run it like this: -``` -./scripts/install.py --configure -``` +Malcolm's runtime settings are stored (with a few exceptions) as environment variables in configuration files ending with a `.env` suffix in the `./config` directory. The `./scripts/configure` script can help you configure and tune these settings. -Although `install.py` will attempt to automate many of the following configuration and tuning parameters, they are nonetheless listed in the following sections for reference: +Run `./scripts/configure` and answer the questions to configure Malcolm. For an in-depth treatment of these configuration questions, see the **Configuration** section in **[End-to-end Malcolm and Hedgehog Linux ISO Installation](malcolm-hedgehog-e2e-iso-install.md#MalcolmConfig)**. -## `docker-compose.yml` parameters +## Environment Variable Files -Edit `docker-compose.yml` and search for the `OPENSEARCH_JAVA_OPTS` key. Edit the `-Xms4g -Xmx4g` values, replacing `4g` with a number that is half of your total system memory, or just under 32 gigabytes, whichever is less. So, for example, if I had 64 gigabytes of memory I would edit those values to be `-Xms31g -Xmx31g`. This indicates how much memory can be allocated to the OpenSearch heaps. For a pleasant experience, I would suggest not using a value under 10 gigabytes. Similar values can be modified for Logstash with `LS_JAVA_OPTS`, where using 3 or 4 gigabytes is recommended. +Although the configuration script automates many of the following configuration and tuning parameters, some environment variables of particular interest are listed here for reference. -Various other environment variables inside of `docker-compose.yml` can be tweaked to control aspects of how Malcolm behaves, particularly with regards to processing PCAP files and Zeek logs. The environment variables of particular interest are located near the top of that file under **Commonly tweaked configuration options**, which include: - -* `ARKIME_ANALYZE_PCAP_THREADS` – the number of threads available to Arkime for analyzing PCAP files (default `1`) -* `AUTO_TAG` – if set to `true`, Malcolm will automatically create Arkime sessions and Zeek logs with tags based on the filename, as described in [Tagging](upload.md#Tagging) (default `true`) -* `BEATS_SSL` – if set to `true`, Logstash will use require encrypted communications for any external [Beats](https://www.elastic.co/guide/en/logstash/current/plugins-inputs-beats.html)-based forwarders from which it will accept logs (default `true`) -* `CONNECTION_SECONDS_SEVERITY_THRESHOLD` - when [severity scoring](severity.md#Severity) is enabled, this variable indicates the duration threshold (in seconds) for assigning severity to long connections (default `3600`) -* `DASHBOARDS_DARKMODE` – if set to `true`, [OpenSearch Dashboards](dashboards.md#DashboardsVisualizations) will be set to dark mode upon initialization (default `true`) -* `EXTRACTED_FILE_CAPA_VERBOSE` – if set to `true`, all Capa rule hits will be logged; otherwise (`false`) only [MITRE ATT&CK® technique](https://attack.mitre.org/techniques) classifications will be logged -* `EXTRACTED_FILE_ENABLE_CAPA` – if set to `true`, [Zeek-extracted files](file-scanning.md#ZeekFileExtraction) that are determined to be PE (portable executable) files will be scanned with [Capa](https://github.com/fireeye/capa) -* `EXTRACTED_FILE_ENABLE_CLAMAV` – if set to `true`, [Zeek-extracted files](file-scanning.md#ZeekFileExtraction) will be scanned with [ClamAV](https://www.clamav.net/) -* `EXTRACTED_FILE_ENABLE_YARA` – if set to `true`, [Zeek-extracted files](file-scanning.md#ZeekFileExtraction) will be scanned with [Yara](https://github.com/VirusTotal/yara) -* `EXTRACTED_FILE_HTTP_SERVER_ENABLE` – if set to `true`, the directory containing [Zeek-extracted files](file-scanning.md#ZeekFileExtraction) will be served over HTTP at `./extracted-files/` (e.g., [https://localhost/extracted-files/](https://localhost/extracted-files/) if you are connecting locally) -* `EXTRACTED_FILE_HTTP_SERVER_ENCRYPT` – if set to `true`, those Zeek-extracted files will be AES-256-CBC-encrypted in an `openssl enc`-compatible format (e.g., `openssl enc -aes-256-cbc -d -in example.exe.encrypted -out example.exe`) -* `EXTRACTED_FILE_HTTP_SERVER_KEY` – specifies the AES-256-CBC decryption password for encrypted Zeek-extracted files; used in conjunction with `EXTRACTED_FILE_HTTP_SERVER_ENCRYPT` -* `EXTRACTED_FILE_IGNORE_EXISTING` – if set to `true`, files extant in `./zeek-logs/extract_files/` directory will be ignored on startup rather than scanned -* `EXTRACTED_FILE_PRESERVATION` – determines behavior for preservation of [Zeek-extracted files](file-scanning.md#ZeekFileExtraction) -* `EXTRACTED_FILE_UPDATE_RULES` – if set to `true`, file scanner engines (e.g., ClamAV, Capa, Yara) will periodically update their rule definitions (default `false`) -* `EXTRACTED_FILE_YARA_CUSTOM_ONLY` – if set to `true`, Malcolm will bypass the default Yara rulesets ([Neo23x0/signature-base](https://github.com/Neo23x0/signature-base) and [bartblaze/Yara-rules](https://github.com/bartblaze/Yara-rules)) and use only user-defined rules in `./yara/rules` -* `FREQ_LOOKUP` - if set to `true`, domain names (from DNS queries and SSL server names) will be assigned entropy scores as calculated by [`freq`](https://github.com/MarkBaggett/freq) (default `false`) -* `FREQ_SEVERITY_THRESHOLD` - when [severity scoring](severity.md#Severity) is enabled, this variable indicates the entropy threshold for assigning severity to events with entropy scores calculated by [`freq`](https://github.com/MarkBaggett/freq); a lower value will only assign severity scores to fewer domain names with higher entropy (e.g., `2.0` for `NQZHTFHRMYMTVBQJE.COM`), while a higher value will assign severity scores to more domain names with lower entropy (e.g., `7.5` for `naturallanguagedomain.example.org`) (default `2.0`) -* `LOGSTASH_OUI_LOOKUP` – if set to `true`, Logstash will map MAC addresses to vendors for all source and destination MAC addresses when analyzing Zeek logs (default `true`) -* `LOGSTASH_REVERSE_DNS` – if set to `true`, Logstash will perform a reverse DNS lookup for all external source and destination IP address values when analyzing Zeek logs (default `false`) -* `LOGSTASH_SEVERITY_SCORING` - if set to `true`, Logstash will perform [severity scoring](severity.md#Severity) when analyzing Zeek logs (default `true`) -* `LOGSTASH_NETBOX_ENRICHMENT` - if set to `true`, Logstash will enrich network traffic metadata via NetBox API calls -* `MANAGE_PCAP_FILES` – if set to `true`, all PCAP files imported into Malcolm will be marked as available for deletion by Arkime if available storage space becomes too low (default `false`) -* `MAXMIND_GEOIP_DB_LICENSE_KEY` - Malcolm uses MaxMind's free GeoLite2 databases for GeoIP lookups. As of December 30, 2019, these databases are [no longer available](https://blog.maxmind.com/2019/12/18/significant-changes-to-accessing-and-using-geolite2-databases/) for download via a public URL. Instead, they must be downloaded using a MaxMind license key (available without charge [from MaxMind](https://www.maxmind.com/en/geolite2/signup)). The license key can be specified here for GeoIP database downloads during build- and run-time. -* `OPENSEARCH_LOCAL` - if set to `true`, Malcolm will use its own internal [OpenSearch instance](opensearch-instances.md#OpenSearchInstance) (default `true`) -* `OPENSEARCH_URL` - when using Malcolm's internal OpenSearch instance (i.e., `OPENSEARCH_LOCAL` is `true`) this should be `http://opensearch:9200`, otherwise this value specifies the primary remote instance URL in the format `protocol://host:port` (default `http://opensearch:9200`) -* `OPENSEARCH_SSL_CERTIFICATE_VERIFICATION` - if set to `true`, connections to the primary remote OpenSearch instance will require full TLS certificate validation (this may fail if using self-signed certificates) (default `false`) -* `OPENSEARCH_SECONDARY` - if set to `true`, Malcolm will forward logs to a secondary remote OpenSearch instance in addition to the primary (local or remote) OpenSearch instance (default `false`) -* `OPENSEARCH_SECONDARY_URL` - when forwarding to a secondary remote OpenSearch instance (i.e., `OPENSEARCH_SECONDARY` is `true`) this value specifies the secondary remote instance URL in the format `protocol://host:port` -* `OPENSEARCH_SECONDARY_SSL_CERTIFICATE_VERIFICATION` - if set to `true`, connections to the secondary remote OpenSearch instance will require full TLS certificate validation (this may fail if using self-signed certificates) (default `false`) -* `NETBOX_DISABLED` - if set to `true`, Malcolm will **not** start and manage a [NetBox](asset-interaction-analysis.md#AssetInteractionAnalysis) instance (default `true`) -* `NGINX_BASIC_AUTH` - if set to `true`, use [TLS-encrypted HTTP basic](authsetup.md#AuthBasicAccountManagement) authentication (default); if set to `false`, use [Lightweight Directory Access Protocol (LDAP)](authsetup.md#AuthLDAP) authentication -* `NGINX_LOG_ACCESS_AND_ERRORS` - if set to `true`, all access to Malcolm via its [web interfaces](quickstart.md#UserInterfaceURLs) will be logged to OpenSearch (default `false`) -* `NGINX_SSL` - if set to `true`, require HTTPS connections to Malcolm's `nginx-proxy` container (default); if set to `false`, use unencrypted HTTP connections (using unsecured HTTP connections is **NOT** recommended unless you are running Malcolm behind another reverse proxy like Traefik, Caddy, etc.) -* `PCAP_ENABLE_NETSNIFF` – if set to `true`, Malcolm will capture network traffic on the local network interface(s) indicated in `PCAP_IFACE` using [netsniff-ng](http://netsniff-ng.org/) -* `PCAP_ENABLE_TCPDUMP` – if set to `true`, Malcolm will capture network traffic on the local network interface(s) indicated in `PCAP_IFACE` using [tcpdump](https://www.tcpdump.org/); there is no reason to enable *both* `PCAP_ENABLE_NETSNIFF` and `PCAP_ENABLE_TCPDUMP` -* `PCAP_FILTER` – specifies a tcpdump-style filter expression for local packet capture; leave blank to capture all traffic -* `PCAP_IFACE` – used to specify the network interface(s) for local packet capture if `PCAP_ENABLE_NETSNIFF`, `PCAP_ENABLE_TCPDUMP`, `ZEEK_LIVE_CAPTURE` or `SURICATA_LIVE_CAPTURE` are enabled; for multiple interfaces, separate the interface names with a comma (e.g., `'enp0s25'` or `'enp10s0,enp11s0'`) -* `PCAP_IFACE_TWEAK` - if set to `true`, Malcolm will [use `ethtool`]({{ site.github.repository_url }}/blob/{{ site.github.build_revision }}/shared/bin/nic-capture-setup.sh) to disable NIC hardware offloading features and adjust ring buffer sizes for capture interface(s); this should be `true` if the interface(s) are being used for capture only, `false` if they are being used for management/communication -* `PCAP_ROTATE_MEGABYTES` – used to specify how large a locally-captured PCAP file can become (in megabytes) before it is closed for processing and a new PCAP file created -* `PCAP_ROTATE_MINUTES` – used to specify a time interval (in minutes) after which a locally-captured PCAP file will be closed for processing and a new PCAP file created -* `pipeline.workers`, `pipeline.batch.size` and `pipeline.batch.delay` - these settings are used to tune the performance and resource utilization of the the `logstash` container; see [Tuning and Profiling Logstash Performance](https://www.elastic.co/guide/en/logstash/current/tuning-logstash.html), [`logstash.yml`](https://www.elastic.co/guide/en/logstash/current/logstash-settings-file.html) and [Multiple Pipelines](https://www.elastic.co/guide/en/logstash/current/multiple-pipelines.html) -* `PUID` and `PGID` - Docker runs all of its containers as the privileged `root` user by default. For better security, Malcolm immediately drops to non-privileged user accounts for executing internal processes wherever possible. The `PUID` (**p**rocess **u**ser **ID**) and `PGID` (**p**rocess **g**roup **ID**) environment variables allow Malcolm to map internal non-privileged user accounts to a corresponding [user account](https://en.wikipedia.org/wiki/User_identifier) on the host. Note that a few containers (including the `logstash` and `netbox` containers) may take a few extra minutes during startup if `PUID` and `PGID` are set to values other than the default `1000`. This is expected and should not affect operation after the initial startup. -* `SENSITIVE_COUNTRY_CODES` - when [severity scoring](severity.md#Severity) is enabled, this variable defines a comma-separated list of sensitive countries (using [ISO 3166-1 alpha-2 codes](https://en.wikipedia.org/wiki/ISO_3166-1_alpha-2#Current_codes)) (default `'AM,AZ,BY,CN,CU,DZ,GE,HK,IL,IN,IQ,IR,KG,KP,KZ,LY,MD,MO,PK,RU,SD,SS,SY,TJ,TM,TW,UA,UZ'`, taken from the U.S. Department of Energy Sensitive Country List) -* `SURICATA_AUTO_ANALYZE_PCAP_FILES` – if set to `true`, all PCAP files imported into Malcolm will automatically be analyzed by Suricata, and the resulting logs will also be imported (default `false`) -* `SURICATA_AUTO_ANALYZE_PCAP_THREADS` – the number of threads available to Malcolm for analyzing Suricata logs (default `1`) -* `SURICATA_CUSTOM_RULES_ONLY` – if set to `true`, Malcolm will bypass the default [Suricata ruleset](https://github.com/OISF/suricata/tree/master/rules) and use only user-defined rules (`./suricata/rules/*.rules`). -* `SURICATA_UPDATE_RULES` – if set to `true`, Suricata signatures will periodically be updated (default `false`) -* `SURICATA_LIVE_CAPTURE` - if set to `true`, Suricata will monitor live traffic on the local interface(s) defined by `PCAP_FILTER` -* `SURICATA_ROTATED_PCAP` - if set to `true`, Suricata can analyze captured PCAP files captured by `netsniff-ng` or `tcpdump` (see `PCAP_ENABLE_NETSNIFF` and `PCAP_ENABLE_TCPDUMP`, as well as `SURICATA_AUTO_ANALYZE_PCAP_FILES`); if `SURICATA_LIVE_CAPTURE` is `true`, this should be false, otherwise Suricata will see duplicate traffic -* `SURICATA_…` - the [`suricata` container entrypoint script]({{ site.github.repository_url }}/blob/{{ site.github.build_revision }}/shared/bin/suricata_config_populate.py) can use **many** more environment variables to tweak [suricata.yaml](https://github.com/OISF/suricata/blob/master/suricata.yaml.in); in that script, `DEFAULT_VARS` defines those variables (albeit without the `SURICATA_` prefix you must add to each for use) -* `TOTAL_MEGABYTES_SEVERITY_THRESHOLD` - when [severity scoring](severity.md#Severity) is enabled, this variable indicates the size threshold (in megabytes) for assigning severity to large connections or file transfers (default `1000`) -* `VTOT_API2_KEY` – used to specify a [VirusTotal Public API v.20](https://www.virustotal.com/en/documentation/public-api/) key, which, if specified, will be used to submit hashes of [Zeek-extracted files](file-scanning.md#ZeekFileExtraction) to VirusTotal -* `ZEEK_AUTO_ANALYZE_PCAP_FILES` – if set to `true`, all PCAP files imported into Malcolm will automatically be analyzed by Zeek, and the resulting logs will also be imported (default `false`) -* `ZEEK_AUTO_ANALYZE_PCAP_THREADS` – the number of threads available to Malcolm for analyzing Zeek logs (default `1`) -* `ZEEK_DISABLE_…` - if set to any non-blank value, each of these variables can be used to disable a certain Zeek function when it analyzes PCAP files (for example, setting `ZEEK_DISABLE_LOG_PASSWORDS` to `true` to disable logging of cleartext passwords) -* `ZEEK_DISABLE_BEST_GUESS_ICS` - see ["Best Guess" Fingerprinting for ICS Protocols](ics-best-guess.md#ICSBestGuess) -* `ZEEK_EXTRACTOR_MODE` – determines the file extraction behavior for file transfers detected by Zeek; see [Automatic file extraction and scanning](file-scanning.md#ZeekFileExtraction) for more details -* `ZEEK_INTEL_FEED_SINCE` - when querying a [TAXII](zeek-intel.md#ZeekIntelSTIX) or [MISP](zeek-intel.md#ZeekIntelMISP) feed, only process threat indicators that have been created or modified since the time represented by this value; it may be either a fixed date/time (`01/01/2021`) or relative interval (`30 days ago`) -* `ZEEK_INTEL_ITEM_EXPIRATION` - specifies the value for Zeek's [`Intel::item_expiration`](https://docs.zeek.org/en/current/scripts/base/frameworks/intel/main.zeek.html#id-Intel::item_expiration) timeout as used by the [Zeek Intelligence Framework](zeek-intel.md#ZeekIntel) (default `-1min`, which disables item expiration) -* `ZEEK_INTEL_REFRESH_CRON_EXPRESSION` - specifies a [cron expression](https://en.wikipedia.org/wiki/Cron#CRON_expression) indicating the refresh interval for generating the [Zeek Intelligence Framework](zeek-intel.md#ZeekIntel) files (defaults to empty, which disables automatic refresh) -* `ZEEK_LIVE_CAPTURE` - if set to `true`, Zeek will monitor live traffic on the local interface(s) defined by `PCAP_FILTER` -* `ZEEK_ROTATED_PCAP` - if set to `true`, Zeek can analyze captured PCAP files captured by `netsniff-ng` or `tcpdump` (see `PCAP_ENABLE_NETSNIFF` and `PCAP_ENABLE_TCPDUMP`, as well as `ZEEK_AUTO_ANALYZE_PCAP_FILES`); if `ZEEK_LIVE_CAPTURE` is `true`, this should be false, otherwise Zeek will see duplicate traffic \ No newline at end of file +* **`arkime.env`** - settings for [Arkime](https://arkime.com/) + - `ARKIME_ANALYZE_PCAP_THREADS` – the number of threads available to Arkime for analyzing PCAP files (default `1`) + - `MANAGE_PCAP_FILES` – if set to `true`, all PCAP files imported into Malcolm will be marked as available for deletion by Arkime if available storage space becomes too low (default `false`) + - `MAXMIND_GEOIP_DB_LICENSE_KEY` - Malcolm uses MaxMind's free GeoLite2 databases for GeoIP lookups. As of December 30, 2019, these databases are [no longer available](https://blog.maxmind.com/2019/12/18/significant-changes-to-accessing-and-using-geolite2-databases/) for download via a public URL. Instead, they must be downloaded using a MaxMind license key (available without charge [from MaxMind](https://www.maxmind.com/en/geolite2/signup)). The license key can be specified here for GeoIP database downloads during build- and run-time. +* **`auth-common.env`** - [authentication](#MalcolmAuthSetup)-related settings + - `NGINX_BASIC_AUTH` - if set to `true`, use [TLS-encrypted HTTP basic](authsetup.md#AuthBasicAccountManagement) authentication (default); if set to `false`, use [Lightweight Directory Access Protocol (LDAP)](authsetup.md#AuthLDAP) authentication +* **`auth.env`** - stores the Malcolm administrator's username and password hash for its nginx reverse proxy +* **`beats-common.env`** - settings for interactions between [Logstash](https://www.elastic.co/products/logstash) and [Filebeat](https://www.elastic.co/products/beats/filebeat) + - `BEATS_SSL` – if set to `true`, Logstash will use require encrypted communications for any external [Beats](https://www.elastic.co/guide/en/logstash/current/plugins-inputs-beats.html)-based forwarders from which it will accept logs (default `true`) +* **`dashboards-helper.env`** - settings for the container that helps configure and maintain [OpenSearch](https://opensearch.org/) and [OpenSearch Dashboards](https://opensearch.org/docs/latest/dashboards/index/) + - `DASHBOARDS_DARKMODE` – if set to `true`, [OpenSearch Dashboards](dashboards.md#DashboardsVisualizations) will be set to dark mode upon initialization (default `true`) +* **`filebeat.env`** - settings specific to [Filebeat](https://www.elastic.co/products/beats/filebeat), particularly for how Filebeat watches for new log files to parse and how it receives and stores [third-Party logs](third-party-logs.md#ThirdPartyLogs) +* **`logstash.env`** - settings specific to [Logstash](https://www.elastic.co/products/logstash) + - `LOGSTASH_OUI_LOOKUP` – if set to `true`, Logstash will map MAC addresses to vendors for all source and destination MAC addresses when analyzing Zeek logs (default `true`) + - `LOGSTASH_REVERSE_DNS` – if set to `true`, Logstash will perform a reverse DNS lookup for all external source and destination IP address values when analyzing Zeek logs (default `false`) + - `LOGSTASH_SEVERITY_SCORING` - if set to `true`, Logstash will perform [severity scoring](severity.md#Severity) when analyzing Zeek logs (default `true`) + - `LOGSTASH_NETBOX_ENRICHMENT` - if set to `true`, Logstash will enrich network traffic metadata via NetBox API calls + - `LS_JAVA_OPTS` - part of LogStash's [JVM settings](https://www.elastic.co/guide/en/logstash/current/jvm-settings.html), the `-Xms` and `-Xmx` values set the size of LogStash's Java heap (we recommend somewhere between `1500m` and `4g`) + * `pipeline.workers`, `pipeline.batch.size` and `pipeline.batch.delay` - these settings are used to tune the performance and resource utilization of the the `logstash` container; see [Tuning and Profiling Logstash Performance](https://www.elastic.co/guide/en/logstash/current/tuning-logstash.html), [`logstash.yml`](https://www.elastic.co/guide/en/logstash/current/logstash-settings-file.html) and [Multiple Pipelines](https://www.elastic.co/guide/en/logstash/current/multiple-pipelines.html) +* **`lookup-common.env`** - settings for enrichment lookups, including those used for [customizing event severity scoring](severity.md#SeverityConfig) + - `CONNECTION_SECONDS_SEVERITY_THRESHOLD` - when [severity scoring](severity.md#Severity) is enabled, this variable indicates the duration threshold (in seconds) for assigning severity to long connections (default `3600`) + - `FREQ_LOOKUP` - if set to `true`, domain names (from DNS queries and SSL server names) will be assigned entropy scores as calculated by [`freq`](https://github.com/MarkBaggett/freq) (default `false`) + - `FREQ_SEVERITY_THRESHOLD` - when [severity scoring](severity.md#Severity) is enabled, this variable indicates the entropy threshold for assigning severity to events with entropy scores calculated by [`freq`](https://github.com/MarkBaggett/freq); a lower value will only assign severity scores to fewer domain names with higher entropy (e.g., `2.0` for `NQZHTFHRMYMTVBQJE.COM`), while a higher value will assign severity scores to more domain names with lower entropy (e.g., `7.5` for `naturallanguagedomain.example.org`) (default `2.0`) + - `SENSITIVE_COUNTRY_CODES` - when [severity scoring](severity.md#Severity) is enabled, this variable defines a comma-separated list of sensitive countries (using [ISO 3166-1 alpha-2 codes](https://en.wikipedia.org/wiki/ISO_3166-1_alpha-2#Current_codes)) (default `'AM,AZ,BY,CN,CU,DZ,GE,HK,IL,IN,IQ,IR,KG,KP,KZ,LY,MD,MO,PK,RU,SD,SS,SY,TJ,TM,TW,UA,UZ'`, taken from the U.S. Department of Energy Sensitive Country List) + - `TOTAL_MEGABYTES_SEVERITY_THRESHOLD` - when [severity scoring](severity.md#Severity) is enabled, this variable indicates the size threshold (in megabytes) for assigning severity to large connections or file transfers (default `1000`) +* **`netbox-common.env`**, `netbox.env`, `netbox-postgres.env`, `netbox-redis-cache.env` and `netbox-redis.env` - settings related to [NetBox](https://netbox.dev/) and [Asset Interaction Analysis](asset-interaction-analysis.md#AssetInteractionAnalysis) + - `NETBOX_DISABLED` - if set to `true`, Malcolm will **not** start and manage a [NetBox](asset-interaction-analysis.md#AssetInteractionAnalysis) instance (default `true`) +* **`nginx.env`** - settings specific to Malcolm's nginx reverse proxy + - `NGINX_LOG_ACCESS_AND_ERRORS` - if set to `true`, all access to Malcolm via its [web interfaces](quickstart.md#UserInterfaceURLs) will be logged to OpenSearch (default `false`) + - `NGINX_SSL` - if set to `true`, require HTTPS connections to Malcolm's `nginx-proxy` container (default); if set to `false`, use unencrypted HTTP connections (using unsecured HTTP connections is **NOT** recommended unless you are running Malcolm behind another reverse proxy like Traefik, Caddy, etc.) +* **`opensearch.env`** - settings specific to [OpenSearch](https://opensearch.org/) + - `OPENSEARCH_JAVA_OPTS` - one of OpenSearch's most [important settings](https://opensearch.org/docs/latest/install-and-configure/install-opensearch/index/#important-settings), the `-Xms` and `-Xmx` values set the size of OpenSearch's Java heap (we recommend setting this value to half of system RAM, up to 32 gigabytes) + - `OPENSEARCH_LOCAL` - if set to `true`, Malcolm will use its own internal [OpenSearch instance](opensearch-instances.md#OpenSearchInstance) (default `true`) + - `OPENSEARCH_URL` - when using Malcolm's internal OpenSearch instance (i.e., `OPENSEARCH_LOCAL` is `true`) this should be `http://opensearch:9200`, otherwise this value specifies the primary remote instance URL in the format `protocol://host:port` (default `http://opensearch:9200`) + - `OPENSEARCH_SSL_CERTIFICATE_VERIFICATION` - if set to `true`, connections to the primary remote OpenSearch instance will require full TLS certificate validation (this may fail if using self-signed certificates) (default `false`) + - `OPENSEARCH_SECONDARY` - if set to `true`, Malcolm will forward logs to a secondary remote OpenSearch instance in addition to the primary (local or remote) OpenSearch instance (default `false`) + - `OPENSEARCH_SECONDARY_URL` - when forwarding to a secondary remote OpenSearch instance (i.e., `OPENSEARCH_SECONDARY` is `true`) this value specifies the secondary remote instance URL in the format `protocol://host:port` + - `OPENSEARCH_SECONDARY_SSL_CERTIFICATE_VERIFICATION` - if set to `true`, connections to the secondary remote OpenSearch instance will require full TLS certificate validation (this may fail if using self-signed certificates) (default `false`) +* **`pcap-capture.env`** - settings specific to capturing traffic for [live traffic analysis](live-analysis.md#LocalPCAP) + - `PCAP_ENABLE_NETSNIFF` – if set to `true`, Malcolm will capture network traffic on the local network interface(s) indicated in `PCAP_IFACE` using [netsniff-ng](http://netsniff-ng.org/) + - `PCAP_ENABLE_TCPDUMP` – if set to `true`, Malcolm will capture network traffic on the local network interface(s) indicated in `PCAP_IFACE` using [tcpdump](https://www.tcpdump.org/); there is no reason to enable *both* `PCAP_ENABLE_NETSNIFF` and `PCAP_ENABLE_TCPDUMP` + - `PCAP_FILTER` – specifies a tcpdump-style filter expression for local packet capture; leave blank to capture all traffic + - `PCAP_IFACE` – used to specify the network interface(s) for local packet capture if `PCAP_ENABLE_NETSNIFF`, `PCAP_ENABLE_TCPDUMP`, `ZEEK_LIVE_CAPTURE` or `SURICATA_LIVE_CAPTURE` are enabled; for multiple interfaces, separate the interface names with a comma (e.g., `'enp0s25'` or `'enp10s0,enp11s0'`) + - `PCAP_IFACE_TWEAK` - if set to `true`, Malcolm will [use `ethtool`]({{ site.github.repository_url }}/blob/{{ site.github.build_revision }}/shared/bin/nic-capture-setup.sh) to disable NIC hardware offloading features and adjust ring buffer sizes for capture interface(s); this should be `true` if the interface(s) are being used for capture only, `false` if they are being used for management/communication + - `PCAP_ROTATE_MEGABYTES` – used to specify how large a locally-captured PCAP file can become (in megabytes) before it is closed for processing and a new PCAP file created + - `PCAP_ROTATE_MINUTES` – used to specify a time interval (in minutes) after which a locally-captured PCAP file will be closed for processing and a new PCAP file created +* **`process.env`** - settings for how the processes running inside Malcolm containers are executed + - `PUID` and `PGID` - Docker runs all of its containers as the privileged `root` user by default. For better security, Malcolm immediately drops to non-privileged user accounts for executing internal processes wherever possible. The `PUID` (**p**rocess **u**ser **ID**) and `PGID` (**p**rocess **g**roup **ID**) environment variables allow Malcolm to map internal non-privileged user accounts to a corresponding [user account](https://en.wikipedia.org/wiki/User_identifier) on the host. Note that a few containers (including the `logstash` and `netbox` containers) may take a few extra minutes during startup if `PUID` and `PGID` are set to values other than the default `1000`. This is expected and should not affect operation after the initial startup. +* **`ssl.env`** - TLS-related settings used by many containers +* **`suricata.env`**, **`suricata-live.env`** and **`suricata-offline.env`** - settings for [Suricata](https://suricata.io/) + - `SURICATA_AUTO_ANALYZE_PCAP_FILES` – if set to `true`, all PCAP files imported into Malcolm will automatically be analyzed by Suricata, and the resulting logs will also be imported (default `false`) + - `SURICATA_AUTO_ANALYZE_PCAP_THREADS` – the number of threads available to Malcolm for analyzing Suricata logs (default `1`) + - `SURICATA_CUSTOM_RULES_ONLY` – if set to `true`, Malcolm will bypass the default [Suricata ruleset](https://github.com/OISF/suricata/tree/master/rules) and use only user-defined rules (`./suricata/rules/*.rules`). + - `SURICATA_UPDATE_RULES` – if set to `true`, Suricata signatures will periodically be updated (default `false`) + - `SURICATA_LIVE_CAPTURE` - if set to `true`, Suricata will monitor live traffic on the local interface(s) defined by `PCAP_FILTER` + - `SURICATA_ROTATED_PCAP` - if set to `true`, Suricata can analyze captured PCAP files captured by `netsniff-ng` or `tcpdump` (see `PCAP_ENABLE_NETSNIFF` and `PCAP_ENABLE_TCPDUMP`, as well as `SURICATA_AUTO_ANALYZE_PCAP_FILES`); if `SURICATA_LIVE_CAPTURE` is `true`, this should be false, otherwise Suricata will see duplicate traffic + - `SURICATA_…` - the [`suricata` container entrypoint script]({{ site.github.repository_url }}/blob/{{ site.github.build_revision }}/shared/bin/suricata_config_populate.py) can use **many** more environment variables to tweak [suricata.yaml](https://github.com/OISF/suricata/blob/master/suricata.yaml.in); in that script, `DEFAULT_VARS` defines those variables (albeit without the `SURICATA_` prefix you must add to each for use) +* **`upload-common.env`** and **`upload.env`** - settings for dealing with PCAP files [uploaded](upload.md#Upload) to Malcolm for analysis + - `AUTO_TAG` – if set to `true`, Malcolm will automatically create Arkime sessions and Zeek logs with tags based on the filename, as described in [Tagging](upload.md#Tagging) (default `true`) +* **`zeek.env`**, **`zeek-live.env`** and **`zeek-offline.env`** - settings for [Zeek](https://www.zeek.org/index.html) and for scanning [extracted files](file-scanning.md#ZeekFileExtraction) Zeek observes in network traffic + - `EXTRACTED_FILE_CAPA_VERBOSE` – if set to `true`, all Capa rule hits will be logged; otherwise (`false`) only [MITRE ATT&CK® technique](https://attack.mitre.org/techniques) classifications will be logged + - `EXTRACTED_FILE_ENABLE_CAPA` – if set to `true`, [Zeek-extracted files](file-scanning.md#ZeekFileExtraction) that are determined to be PE (portable executable) files will be scanned with [Capa](https://github.com/fireeye/capa) + - `EXTRACTED_FILE_ENABLE_CLAMAV` – if set to `true`, [Zeek-extracted files](file-scanning.md#ZeekFileExtraction) will be scanned with [ClamAV](https://www.clamav.net/) + - `EXTRACTED_FILE_ENABLE_YARA` – if set to `true`, [Zeek-extracted files](file-scanning.md#ZeekFileExtraction) will be scanned with [Yara](https://github.com/VirusTotal/yara) + - `EXTRACTED_FILE_HTTP_SERVER_ENABLE` – if set to `true`, the directory containing [Zeek-extracted files](file-scanning.md#ZeekFileExtraction) will be served over HTTP at `./extracted-files/` (e.g., [https://localhost/extracted-files/](https://localhost/extracted-files/) if you are connecting locally) + - `EXTRACTED_FILE_HTTP_SERVER_ENCRYPT` – if set to `true`, those Zeek-extracted files will be AES-256-CBC-encrypted in an `openssl enc`-compatible format (e.g., `openssl enc -aes-256-cbc -d -in example.exe.encrypted -out example.exe`) + - `EXTRACTED_FILE_HTTP_SERVER_KEY` – specifies the AES-256-CBC decryption password for encrypted Zeek-extracted files; used in conjunction with `EXTRACTED_FILE_HTTP_SERVER_ENCRYPT` + - `EXTRACTED_FILE_IGNORE_EXISTING` – if set to `true`, files extant in `./zeek-logs/extract_files/` directory will be ignored on startup rather than scanned + - `EXTRACTED_FILE_PRESERVATION` – determines behavior for preservation of [Zeek-extracted files](file-scanning.md#ZeekFileExtraction) + - `EXTRACTED_FILE_UPDATE_RULES` – if set to `true`, file scanner engines (e.g., ClamAV, Capa, Yara) will periodically update their rule definitions (default `false`) + - `EXTRACTED_FILE_YARA_CUSTOM_ONLY` – if set to `true`, Malcolm will bypass the default Yara rulesets ([Neo23x0/signature-base](https://github.com/Neo23x0/signature-base) and [bartblaze/Yara-rules](https://github.com/bartblaze/Yara-rules)) and use only user-defined rules in `./yara/rules` + - `VTOT_API2_KEY` – used to specify a [VirusTotal Public API v.20](https://www.virustotal.com/en/documentation/public-api/) key, which, if specified, will be used to submit hashes of [Zeek-extracted files](file-scanning.md#ZeekFileExtraction) to VirusTotal + - `ZEEK_AUTO_ANALYZE_PCAP_FILES` – if set to `true`, all PCAP files imported into Malcolm will automatically be analyzed by Zeek, and the resulting logs will also be imported (default `false`) + - `ZEEK_AUTO_ANALYZE_PCAP_THREADS` – the number of threads available to Malcolm for analyzing Zeek logs (default `1`) + - `ZEEK_DISABLE_…` - if set to any non-blank value, each of these variables can be used to disable a certain Zeek function when it analyzes PCAP files (for example, setting `ZEEK_DISABLE_LOG_PASSWORDS` to `true` to disable logging of cleartext passwords) + - `ZEEK_DISABLE_BEST_GUESS_ICS` - see ["Best Guess" Fingerprinting for ICS Protocols](ics-best-guess.md#ICSBestGuess) + - `ZEEK_EXTRACTOR_MODE` – determines the file extraction behavior for file transfers detected by Zeek; see [Automatic file extraction and scanning](file-scanning.md#ZeekFileExtraction) for more details + - `ZEEK_INTEL_FEED_SINCE` - when querying a [TAXII](zeek-intel.md#ZeekIntelSTIX) or [MISP](zeek-intel.md#ZeekIntelMISP) feed, only process threat indicators that have been created or modified since the time represented by this value; it may be either a fixed date/time (`01/01/2021`) or relative interval (`30 days ago`) + - `ZEEK_INTEL_ITEM_EXPIRATION` - specifies the value for Zeek's [`Intel::item_expiration`](https://docs.zeek.org/en/current/scripts/base/frameworks/intel/main.zeek.html#id-Intel::item_expiration) timeout as used by the [Zeek Intelligence Framework](zeek-intel.md#ZeekIntel) (default `-1min`, which disables item expiration) + - `ZEEK_INTEL_REFRESH_CRON_EXPRESSION` - specifies a [cron expression](https://en.wikipedia.org/wiki/Cron#CRON_expression) indicating the refresh interval for generating the [Zeek Intelligence Framework](zeek-intel.md#ZeekIntel) files (defaults to empty, which disables automatic refresh) + - `ZEEK_LIVE_CAPTURE` - if set to `true`, Zeek will monitor live traffic on the local interface(s) defined by `PCAP_FILTER` + - `ZEEK_ROTATED_PCAP` - if set to `true`, Zeek can analyze captured PCAP files captured by `netsniff-ng` or `tcpdump` (see `PCAP_ENABLE_NETSNIFF` and `PCAP_ENABLE_TCPDUMP`, as well as `ZEEK_AUTO_ANALYZE_PCAP_FILES`); if `ZEEK_LIVE_CAPTURE` is `true`, this should be false, otherwise Zeek will see duplicate traffic \ No newline at end of file diff --git a/docs/malcolm-hedgehog-e2e-iso-install.md b/docs/malcolm-hedgehog-e2e-iso-install.md index 7303d9ad4..c43df9d9b 100644 --- a/docs/malcolm-hedgehog-e2e-iso-install.md +++ b/docs/malcolm-hedgehog-e2e-iso-install.md @@ -131,11 +131,12 @@ The panel bordering the top of the Malcolm desktop is home to a number of useful ### Configuration -The first time the Malcolm base operating system boots the **Malcolm Configuration** wizard will start automatically. This same configuration script can be run again later by running [`./scripts/install.py --configure`](malcolm-config.md#ConfigAndTuning) from the Malcolm installation directory, or clicking the **Configure Malcolm** 🔳 icon in the top panel. +The first time the Malcolm base operating system boots the **Malcolm Configuration** wizard will start automatically. This same configuration script can be run again later by running [`./scripts/configure`](malcolm-config.md#ConfigAndTuning) from the Malcolm installation directory, or clicking the **Configure Malcolm** 🔳 icon in the top panel. ![Malcolm Configuration on first boot](./images/screenshots/malcolm_first_boot_config.png) -The [configuration and tuning](malcolm-config.md#ConfigAndTuning) wizard's questions proceed as follows. Note that you may not necessarily see every question listed here depending on how you answered earlier questions. Usually the default selection is what you'll want to select unless otherwise indicated below. +The [configuration and tuning](malcolm-config.md#ConfigAndTuning) wizard's questions proceed as follows. Note that you may not necessarily see every question listed here depending on how you answered earlier questions. Usually the default selection is what you'll want to select unless otherwise indicated below. The configuration values resulting from these questions are stored in [environment variable files](malcolm-config.md#MalcolmConfigEnvVars) in the `./config` directory. + * Malcolm processes will run as UID 1000 and GID 1000. Is this OK? - Docker runs all of its containers as the privileged `root` user by default. For better security, Malcolm immediately drops to non-privileged user accounts for executing internal processes wherever possible. The `PUID` (**p**rocess **u**ser **ID**) and `PGID` (**p**rocess **g**roup **ID**) environment variables allow Malcolm to map internal non-privileged user accounts to a corresponding [user account](https://en.wikipedia.org/wiki/User_identifier) on the host. diff --git a/docs/malcolm-iso.md b/docs/malcolm-iso.md index d0b1530ff..89fcf1f23 100644 --- a/docs/malcolm-iso.md +++ b/docs/malcolm-iso.md @@ -82,6 +82,6 @@ Following these prompts, the installer will reboot and the Malcolm base operatin When the system boots for the first time, the Malcolm Docker images will load if the installer was built with pre-packaged installation files as described above. Wait for this operation to continue (the progress dialog will disappear when they have finished loading) before continuing the setup. -Open a terminal (click the red terminal 🗔 icon next to the Debian swirl logo 🍥 menu button in the menu bar). At this point, setup is similar to the steps described in the [Quick start](quickstart.md#QuickStart) section. Navigate to the Malcolm directory (`cd ~/Malcolm`) and run [`auth_setup`](authsetup.md#AuthSetup) to configure authentication. If the ISO didn't have pre-packaged Malcolm images, or if you'd like to retrieve the latest updates, run `docker-compose pull`. Finalize your configuration by running `scripts/install.py --configure` and follow the prompts as illustrated in the [installation example](ubuntu-install-example.md#InstallationExample). +Open a terminal (click the red terminal 🗔 icon next to the Debian swirl logo 🍥 menu button in the menu bar). At this point, setup is similar to the steps described in the [Quick start](quickstart.md#QuickStart) section. Navigate to the Malcolm directory (`cd ~/Malcolm`) and run [`auth_setup`](authsetup.md#AuthSetup) to configure authentication. If the ISO didn't have pre-packaged Malcolm images, or if you'd like to retrieve the latest updates, run `docker-compose pull`. Finalize your configuration by running `scripts/configure` and follow the prompts as illustrated in the [installation example](malcolm-hedgehog-e2e-iso-install.md#MalcolmConfig). Once Malcolm is configured, you can [start Malcolm](running.md#Starting) via the command line or by clicking the circular yellow Malcolm icon in the menu bar. \ No newline at end of file diff --git a/docs/malcolm-preparation.md b/docs/malcolm-preparation.md index 6e0861c60..468138b53 100644 --- a/docs/malcolm-preparation.md +++ b/docs/malcolm-preparation.md @@ -3,7 +3,7 @@ * [Configuration](#Configuration) - [Recommended system requirements](system-requirements.md#SystemRequirements) - [Malcolm Configuration](malcolm-config.md#ConfigAndTuning) - + [`docker-compose.yml` parameters](malcolm-config.md#DockerComposeYml) + + [Environment Variable Files](malcolm-config.md#MalcolmConfigEnvVars) - [Configure authentication](authsetup.md#AuthSetup) + [Local account management](authsetup.md#AuthBasicAccountManagement) + [Lightweight Directory Access Protocol (LDAP) authentication](authsetup.md#AuthLDAP) diff --git a/docs/malcolm-upgrade.md b/docs/malcolm-upgrade.md index 75e898ce1..8fc5ea6a0 100644 --- a/docs/malcolm-upgrade.md +++ b/docs/malcolm-upgrade.md @@ -23,7 +23,7 @@ If you checked out a working copy of the Malcolm repository from GitHub with a ` 5. apply saved configuration change stashed earlier * `git stash pop` 6. if you see `Merge conflict` messages, resolve the [conflicts](https://git-scm.com/book/en/v2/Git-Branching-Basic-Branching-and-Merging#_basic_merge_conflicts) with your favorite text editor -7. you may wish to re-run `install.py --configure` as described in [System configuration and tuning](malcolm-config.md#ConfigAndTuning) in case there are any new `docker-compose.yml` parameters for Malcolm that need to be set up +7. you may wish to re-run `./scripts/configure` as described in [Malcolm Configuration](malcolm-config.md#ConfigAndTuning) in case there are any new configuration parameters for Malcolm that need to be set up 8. start Malcolm * `./scripts/start` 9. you may be prompted to [configure authentication](authsetup.md#AuthSetup) if there are new authentication-related files that need to be generated @@ -45,8 +45,8 @@ If you installed Malcolm from [pre-packaged installation files]({{ site.github.r * `cp -r ./malcolm_YYYYMMDD_HHNNSS_xxxxxxx/scripts ./malcolm_YYYYMMDD_HHNNSS_xxxxxxx/README.md ./` 4. replace (overwrite) `docker-compose.yml` file with new version * `cp ./malcolm_YYYYMMDD_HHNNSS_xxxxxxx/docker-compose.yml ./docker-compose.yml` -5. re-run `./scripts/install.py --configure` as described in [System configuration and tuning](malcolm-config.md#ConfigAndTuning) -6. using a file comparison tool (e.g., `diff`, `meld`, `Beyond Compare`, etc.), compare `docker-compose.yml` and the `docker-compare.yml` file you backed up in step 3, and manually migrate over any customizations you wish to preserve from that file (e.g., `PCAP_FILTER`, `MAXMIND_GEOIP_DB_LICENSE_KEY`, `MANAGE_PCAP_FILES`; [anything else](malcolm-config.md#DockerComposeYml) you may have edited by hand in `docker-compose.yml` that's not prompted for in `install.py --configure`) +5. re-run `./scripts/configure` as described in [Malcolm Configuration](malcolm-config.md#ConfigAndTuning) +6. using a file comparison tool (e.g., `diff`, `meld`, `Beyond Compare`, etc.), compare `docker-compose.yml` and the `docker-compare.yml` file you backed up in step 3, and manually migrate over any customizations you wish to preserve from that file (e.g., `PCAP_FILTER`, `MAXMIND_GEOIP_DB_LICENSE_KEY`, `MANAGE_PCAP_FILES`; [anything else](malcolm-config.md#MalcolmConfigEnvVars) you may have edited by hand in `docker-compose.yml` that's not prompted for in `configure`) 7. pull the new docker images (this will take a while) * `docker-compose pull` to pull them from [GitHub](https://github.com/orgs/idaholab/packages?repo_name=Malcolm) or `docker-compose load -i malcolm_YYYYMMDD_HHNNSS_xxxxxxx_images.tar.gz` if you have an offline tarball of the Malcolm docker images 8. start Malcolm diff --git a/docs/opensearch-instances.md b/docs/opensearch-instances.md index 3c981d5c2..b68633051 100644 --- a/docs/opensearch-instances.md +++ b/docs/opensearch-instances.md @@ -7,7 +7,7 @@ Malcolm's default standalone configuration is to use a local [OpenSearch](https: As the permutations of OpenSearch cluster configurations are numerous, it is beyond Malcolm's scope to set up multi-node clusters. However, Malcolm can be configured to use a remote OpenSearch cluster rather than its own internal instance. -The `OPENSEARCH_…` [environment variables in `docker-compose.yml`](malcolm-config.md#DockerComposeYml) control whether Malcolm uses its own local OpenSearch instance or a remote OpenSearch instance as its primary data store. The configuration portion of Malcolm install script ([`./scripts/install.py --configure`](malcolm-config.md#ConfigAndTuning)) can help you configure these options. +The `OPENSEARCH_…` [environment variables in `opensearch.env`](malcolm-config.md#MalcolmConfigEnvVars) control whether Malcolm uses its own local OpenSearch instance or a remote OpenSearch instance as its primary data store. The configuration portion of Malcolm install script ([`./scripts/configure`](malcolm-config.md#ConfigAndTuning)) can help you configure these options. For example, to use the default standalone configuration, answer `Y` when prompted `Should Malcolm use and maintain its own OpenSearch instance?`. @@ -25,7 +25,7 @@ You must run auth_setup after install.py to store OpenSearch connection credenti … ``` -Whether the primary OpenSearch instance is a locally maintained single-node instance or is a remote cluster, Malcolm can be configured additionally forward logs to a secondary remote OpenSearch instance. The `OPENSEARCH_SECONDARY_…` [environment variables in `docker-compose.yml`](malcolm-config.md#DockerComposeYml) control this behavior. Configuration of a remote secondary OpenSearch instance is similar to that of a remote primary OpenSearch instance: +Whether the primary OpenSearch instance is a locally maintained single-node instance or is a remote cluster, Malcolm can be configured additionally forward logs to a secondary remote OpenSearch instance. The `OPENSEARCH_SECONDARY_…` [environment variables in `opensearch.env`](malcolm-config.md#MalcolmConfigEnvVars) control this behavior. Configuration of a remote secondary OpenSearch instance is similar to that of a remote primary OpenSearch instance: ``` @@ -42,7 +42,7 @@ You must run auth_setup after install.py to store OpenSearch connection credenti ## Authentication and authorization for remote OpenSearch clusters -In addition to setting the environment variables in [`docker-compose.yml`](malcolm-config.md#DockerComposeYml) as described above, you must provide Malcolm with credentials for it to be able to communicate with remote OpenSearch instances. These credentials are stored in the Malcolm installation directory as `.opensearch.primary.curlrc` and `.opensearch.secondary.curlrc` for the primary and secondary OpenSearch connections, respectively, and are bind mounted into the Docker containers which need to communicate with OpenSearch. These [cURL-formatted](https://everything.curl.dev/cmdline/configfile) config files can be generated for you by the [`auth_setup`](authsetup.md#AuthSetup) script as illustrated: +In addition to setting the environment variables in [`opensearch.env`](malcolm-config.md#MalcolmConfigEnvVars) as described above, you must provide Malcolm with credentials for it to be able to communicate with remote OpenSearch instances. These credentials are stored in the Malcolm installation directory as `.opensearch.primary.curlrc` and `.opensearch.secondary.curlrc` for the primary and secondary OpenSearch connections, respectively, and are bind mounted into the Docker containers which need to communicate with OpenSearch. These [cURL-formatted](https://everything.curl.dev/cmdline/configfile) config files can be generated for you by the [`auth_setup`](authsetup.md#AuthSetup) script as illustrated: ``` $ ./scripts/auth_setup diff --git a/docs/quickstart.md b/docs/quickstart.md index 1f1b23933..ebf741e77 100644 --- a/docs/quickstart.md +++ b/docs/quickstart.md @@ -20,9 +20,9 @@ The `build.sh` script can build Malcolm's Docker images from scratch. See [Build ### Initial configuration -The scripts to control Malcolm require Python 3. The [`install.py`](malcolm-config.md#ConfigAndTuning) script requires the [requests](https://docs.python-requests.org/en/latest/) module for Python 3, and will make use of the [pythondialog](https://pythondialog.sourceforge.io/) module for user interaction (on Linux) if it is available. +The scripts to control Malcolm require Python 3. The [`install.py`](malcolm-config.md#ConfigAndTuning) script requires the [dotenv](https://github.com/theskumar/python-dotenv), [requests](https://docs.python-requests.org/en/latest/) and [PyYAML](https://pyyaml.org/) modules for Python 3, and will make use of the [pythondialog](https://pythondialog.sourceforge.io/) module for user interaction (on Linux) if it is available. -You must run [`auth_setup`](authsetup.md#AuthSetup) prior to pulling Malcolm's Docker images. You should also ensure your system configuration and `docker-compose.yml` settings are tuned by running `./scripts/install.py` or `./scripts/install.py --configure` (see [System configuration and tuning](malcolm-config.md#ConfigAndTuning)). +You must run [`auth_setup`](authsetup.md#AuthSetup) prior to pulling Malcolm's Docker images. You should also ensure your system configuration and Malcolm settings are tuned by running `./scripts/install.py` and `./scripts/configure` (see [Malcolm Configuration](malcolm-config.md#ConfigAndTuning)). ### Pull Malcolm's Docker images diff --git a/docs/running.md b/docs/running.md index 72dd39bbf..3f02502be 100644 --- a/docs/running.md +++ b/docs/running.md @@ -24,7 +24,7 @@ You can also use `docker stats` to monitor the resource utilization of running c You can run `./scripts/stop` to stop the docker containers and remove their virtual network. Alternatively, `./scripts/restart` will restart an instance of Malcolm. Because the data on disk is stored on the host in docker volumes, doing these operations will not result in loss of data. -Malcolm can be configured to be automatically restarted when the Docker system daemon restart (for example, on system reboot). This behavior depends on the [value](https://docs.docker.com/config/containers/start-containers-automatically/) of the [`restart:`](https://docs.docker.com/compose/compose-file/#restart) setting for each service in the `docker-compose.yml` file. This value can be set by running [`./scripts/install.py --configure`](malcolm-config.md#ConfigAndTuning) and answering "yes" to "`Restart Malcolm upon system or Docker daemon restart?`." +Malcolm can be configured to be automatically restarted when the Docker system daemon restart (for example, on system reboot). This behavior depends on the [value](https://docs.docker.com/config/containers/start-containers-automatically/) of the [`restart:`](https://docs.docker.com/compose/compose-file/#restart) setting for each service in the `docker-compose.yml` file. This value can be set by running [`./scripts/configure`](malcolm-config.md#ConfigAndTuning) and answering "yes" to "`Restart Malcolm upon system or Docker daemon restart?`." ## Clearing Malcolm's data diff --git a/docs/severity.md b/docs/severity.md index 45bd31d03..88f3b230d 100644 --- a/docs/severity.md +++ b/docs/severity.md @@ -8,9 +8,9 @@ As Zeek logs are parsed and enriched prior to indexing, a severity score up to ` * cross-segment network traffic (if [network subnets were defined](asset-interaction-analysis.md#AssetInteractionAnalysis)) * connection origination and destination (e.g., inbound, outbound, external, internal) * traffic to or from sensitive countries - - The comma-separated list of countries (by [ISO 3166-1 alpha-2 code](https://en.wikipedia.org/wiki/ISO_3166-1_alpha-2#Current_codes)) can be customized by setting the `SENSITIVE_COUNTRY_CODES` environment variable in [`docker-compose.yml`](malcolm-config.md#DockerComposeYml). + - The comma-separated list of countries (by [ISO 3166-1 alpha-2 code](https://en.wikipedia.org/wiki/ISO_3166-1_alpha-2#Current_codes)) can be customized by setting the `SENSITIVE_COUNTRY_CODES` environment variable in [`lookup-common.env`](malcolm-config.md#MalcolmConfigEnvVars). * domain names (from DNS queries and SSL server names) with high entropy as calculated by [freq](https://github.com/MarkBaggett/freq) - - The entropy threshold for this condition to trigger can be adjusted by setting the `FREQ_SEVERITY_THRESHOLD` environment variable in [`docker-compose.yml`](malcolm-config.md#DockerComposeYml). A lower value will only assign severity scores to fewer domain names with higher entropy (e.g., `2.0` for `NQZHTFHRMYMTVBQJE.COM`), while a higher value will assign severity scores to more domain names with lower entropy (e.g., `7.5` for `naturallanguagedomain.example.org`). + - The entropy threshold for this condition to trigger can be adjusted by setting the `FREQ_SEVERITY_THRESHOLD` environment variable in [`lookup-common.env`](malcolm-config.md#MalcolmConfigEnvVars). A lower value will only assign severity scores to fewer domain names with higher entropy (e.g., `2.0` for `NQZHTFHRMYMTVBQJE.COM`), while a higher value will assign severity scores to more domain names with lower entropy (e.g., `7.5` for `naturallanguagedomain.example.org`). * file transfers (categorized by mime type) * `notice.log`, [`intel.log`](zeek-intel.md#ZeekIntel) and `weird.log` entries, including those generated by Zeek plugins detecting vulnerabilities (see the list of Zeek plugins under [Components](components.md#Components)) * detection of cleartext passwords @@ -20,9 +20,9 @@ As Zeek logs are parsed and enriched prior to indexing, a severity score up to ` * common network services communicating over non-standard ports * file scanning engine hits on [extracted files](file-scanning.md#ZeekFileExtraction) * large connection or file transfer - - The size (in megabytes) threshold for this condition to trigger can be adjusted by setting the `TOTAL_MEGABYTES_SEVERITY_THRESHOLD` environment variable in [`docker-compose.yml`](malcolm-config.md#DockerComposeYml). + - The size (in megabytes) threshold for this condition to trigger can be adjusted by setting the `TOTAL_MEGABYTES_SEVERITY_THRESHOLD` environment variable in [`lookup-common.env`](malcolm-config.md#MalcolmConfigEnvVars). * long connection duration - - The duration (in seconds) threshold for this condition to trigger can be adjusted by setting the `CONNECTION_SECONDS_SEVERITY_THRESHOLD` environment variable in [`docker-compose.yml`](malcolm-config.md#DockerComposeYml). + - The duration (in seconds) threshold for this condition to trigger can be adjusted by setting the `CONNECTION_SECONDS_SEVERITY_THRESHOLD` environment variable in [`lookup-common.env`](malcolm-config.md#MalcolmConfigEnvVars). As this [feature]({{ site.github.repository_url }}/issues/19) is improved it's expected that additional categories will be identified and implemented for severity scoring. @@ -44,4 +44,4 @@ These categories' severity scores can be customized by editing `logstash/maps/ma Restart Logstash after modifying `malcolm_severity.yaml` for the changes to take effect. -Severity scoring can be disabled globally by setting the `LOGSTASH_SEVERITY_SCORING` environment variable to `false` in the [`docker-compose.yml`](malcolm-config.md#DockerComposeYml) file and [restarting Malcolm](running.md#StopAndRestart). \ No newline at end of file +Severity scoring can be disabled globally by setting the `LOGSTASH_SEVERITY_SCORING` environment variable to `false` in the [`logstash.env`](malcolm-config.md#MalcolmConfigEnvVars) file and [restarting Malcolm](running.md#StopAndRestart). \ No newline at end of file diff --git a/docs/third-party-logs.md b/docs/third-party-logs.md index 54090ff6b..a3fb8aaff 100644 --- a/docs/third-party-logs.md +++ b/docs/third-party-logs.md @@ -26,7 +26,7 @@ The types of third-party logs and metrics discussed in this document are *not* t ## Configuring Malcolm -The environment variables in [`docker-compose.yml`](malcolm-config.md#DockerComposeYml) for configuring how Malcolm accepts external logs are prefixed with `FILEBEAT_TCP_…`. These values can be specified during Malcolm configuration (i.e., when running [`./scripts/install.py --configure`](malcolm-config.md#ConfigAndTuning)), as can be seen from the following excerpt from the [Installation example](ubuntu-install-example.md#InstallationExample): +The environment variables in [`filebeat.env`](malcolm-config.md#MalcolmConfigEnvVars) for configuring how Malcolm accepts external logs are prefixed with `FILEBEAT_TCP_…`. These values can be specified during Malcolm configuration (i.e., when running [`./scripts/configure`](malcolm-config.md#ConfigAndTuning)), as can be seen from the following excerpt from the [Installation example](ubuntu-install-example.md#InstallationExample): ``` … @@ -47,7 +47,7 @@ Tag to apply to messages sent to Filebeat TCP listener (_malcolm_beats): _malcol … ``` -The variables corresponding to these questions can be found in the `filebeat-variables` section of`docker-compose.yml`: +The variables corresponding to these questions can be found in [`filebeat.env`](malcolm-config.md#MalcolmConfigEnvVars): * `FILEBEAT_TCP_LISTEN` - whether or not to expose a [Filebeat TCP input listener](https://www.elastic.co/guide/en/beats/filebeat/current/filebeat-input-tcp.html) to which logs may be sent (the default TCP port is `5045`: you may need to adjust your firewall accordingly) * `FILEBEAT_TCP_LOG_FORMAT` - log format expected for logs sent to the Filebeat TCP input listener (`json` or `raw`) @@ -60,7 +60,7 @@ These variables' values will depend on your forwarder and the format of the data ### Secure communication -In order to maintain the integrity and confidentiality of your data, Malcolm's default (set via the `BEATS_SSL` environment variable in `docker-compose.yml`) is to require connections from external forwarders to be encrypted using TLS. When [`./scripts/auth_setup`](authsetup.md#AuthSetup) is run, self-signed certificates are generated which may be used by remote log forwarders. Located in the `filebeat/certs/` directory, the certificate authority and client certificate and key files should be copied to the host on which your forwarder is running and used when defining its settings for connecting to Malcolm. +In order to maintain the integrity and confidentiality of your data, Malcolm's default (set via the `BEATS_SSL` environment variable in [`beats-common.env`](malcolm-config.md#MalcolmConfigEnvVars)) is to require connections from external forwarders to be encrypted using TLS. When [`./scripts/auth_setup`](authsetup.md#AuthSetup) is run, self-signed certificates are generated which may be used by remote log forwarders. Located in the `filebeat/certs/` directory, the certificate authority and client certificate and key files should be copied to the host on which your forwarder is running and used when defining its settings for connecting to Malcolm. ## Fluent Bit @@ -276,7 +276,7 @@ Running fluentbit_winev... fluentbit_winevtlog Elastic [Beats](https://www.elastic.co/beats/) can also be used to forward data to Malcolm's Filebeat TCP listener. Follow the [Get started with Beats](https://www.elastic.co/guide/en/beats/libbeat/current/getting-started.html) documentation for configuring Beats on your system. -In contrast to Fluent Bit, Beats forwarders write to Malcolm's Logstash input over TCP port 5044 (rather than its Filebeat TCP input). Answer `Y` when prompted `Expose Logstash port to external hosts?` during Malcolm configuration (i.e., when running [`./scripts/install.py --configure`](malcolm-config.md#ConfigAndTuning)) to allow external remote Beats forwarders to send logs to Logstash. +In contrast to Fluent Bit, Beats forwarders write to Malcolm's Logstash input over TCP port 5044 (rather than its Filebeat TCP input). Answer `Y` when prompted `Expose Logstash port to external hosts?` during Malcolm configuration (i.e., when running [`./scripts/configure`](malcolm-config.md#ConfigAndTuning)) to allow external remote Beats forwarders to send logs to Logstash. Your Beat's [configuration YML file](https://www.elastic.co/guide/en/beats/libbeat/current/config-file-format.html) file might look something like this sample [filebeat.yml](https://www.elastic.co/guide/en/beats/filebeat/current/configuring-howto-filebeat.html) file: diff --git a/docs/ubuntu-install-example.md b/docs/ubuntu-install-example.md index a3c0ea965..fed210823 100644 --- a/docs/ubuntu-install-example.md +++ b/docs/ubuntu-install-example.md @@ -8,7 +8,7 @@ The commands in this example should be executed as a non-root user. You can use `git` to clone Malcolm into a local working copy, or you can download and extract the artifacts from the [latest release]({{ site.github.repository_url }}/releases). -To install Malcolm from the latest Malcolm release, browse to the [Malcolm releases page on GitHub]({{ site.github.repository_url }}/releases) and download at a minimum `install.py` and the `malcolm_YYYYMMDD_HHNNSS_xxxxxxx.tar.gz` file, then navigate to your downloads directory: +To install Malcolm from the latest Malcolm release, browse to the [Malcolm releases page on GitHub]({{ site.github.repository_url }}/releases) and download at a minimum the files ending in `.py` and the `malcolm_YYYYMMDD_HHNNSS_xxxxxxx.tar.gz` file, then navigate to your downloads directory: ``` user@host:~$ cd Downloads/ user@host:~/Downloads$ ls @@ -83,12 +83,12 @@ vm.dirty_ratio= appears to be missing from /etc/sysctl.conf, append it? (Y/n): y /etc/security/limits.d/limits.conf does not exist, create it? (Y/n): y ``` -If you are configuring Malcolm from within a git working copy, `install.py` will now exit. Run `install.py` again like you did at the beginning of the example, only remove the `sudo` and add `--configure` to run `install.py` in "configuration only" mode. +If you are configuring Malcolm from within a git working copy, `install.py` will now exit. Run `./scripts/configure` to continue with configuration: ``` -user@host:~/Malcolm$ ./scripts/install.py --configure +user@host:~/Malcolm$ ./scripts/configure ``` -Alternately, if you are configuring Malcolm from the release tarball you will be asked if you would like to extract the contents of the tarball and to specify the installation directory and `install.py` will continue: +Alternately, if you are configuring Malcolm from the release tarball you will be asked if you would like to extract the contents of the tarball and to specify the installation directory and Malcolm configuration will continue: ``` Extract Malcolm runtime files from /home/user/Downloads/malcolm_20190611_095410_ce2d8de.tar.gz (Y/n): y diff --git a/docs/upload.md b/docs/upload.md index a0a4e5fe1..c48ee3938 100644 --- a/docs/upload.md +++ b/docs/upload.md @@ -21,10 +21,10 @@ Files uploaded via these methods are monitored and moved automatically to other ## Tagging -In addition to be processed for uploading, Malcolm events will be tagged according to the components of the filenames of the PCAP files or Zeek log archives files from which the events were parsed. For example, records created from a PCAP file named `ACME_Scada_VLAN10.pcap` would be tagged with `ACME`, `Scada`, and `VLAN10`. Tags are extracted from filenames by splitting on the characters `,` (comma), `-` (dash), and `_` (underscore). These tags are viewable and searchable (via the `tags` field) in Arkime and OpenSearch Dashboards. This behavior can be changed by modifying the `AUTO_TAG` [environment variable in `docker-compose.yml`](malcolm-config.md#DockerComposeYml). +In addition to be processed for uploading, Malcolm events will be tagged according to the components of the filenames of the PCAP files or Zeek log archives files from which the events were parsed. For example, records created from a PCAP file named `ACME_Scada_VLAN10.pcap` would be tagged with `ACME`, `Scada`, and `VLAN10`. Tags are extracted from filenames by splitting on the characters `,` (comma), `-` (dash), and `_` (underscore). These tags are viewable and searchable (via the `tags` field) in Arkime and OpenSearch Dashboards. This behavior can be changed by modifying the `AUTO_TAG` [environment variable in `upload-common.env`](malcolm-config.md#MalcolmConfigEnvVars). Tags may also be specified manually with the [browser-based upload form](#Upload). ## Processing uploaded PCAPs with Zeek and Suricata -The **Analyze with Zeek** and **Analyze with Suricata** checkboxes may be used when uploading PCAP files to cause them to be analyzed by Zeek and Suricata, respectively. This is functionally equivalent to the `ZEEK_AUTO_ANALYZE_PCAP_FILES` and `SURICATA_AUTO_ANALYZE_PCAP_FILES` environment variables [described above](malcolm-config.md#DockerComposeYml), only on a per-upload basis. Zeek can also automatically carve out files from file transfers; see [Automatic file extraction and scanning](file-scanning.md#ZeekFileExtraction) for more details. +The **Analyze with Zeek** and **Analyze with Suricata** checkboxes may be used when uploading PCAP files to cause them to be analyzed by Zeek and Suricata, respectively. This is functionally equivalent to the `ZEEK_AUTO_ANALYZE_PCAP_FILES` and `SURICATA_AUTO_ANALYZE_PCAP_FILES` environment variables [described above](malcolm-config.md#MalcolmConfigEnvVars), only on a per-upload basis. Zeek can also automatically carve out files from file transfers; see [Automatic file extraction and scanning](file-scanning.md#ZeekFileExtraction) for more details. diff --git a/malcolm-iso/build.sh b/malcolm-iso/build.sh index 1d2e0445b..38721dd79 100755 --- a/malcolm-iso/build.sh +++ b/malcolm-iso/build.sh @@ -138,6 +138,7 @@ if [ -d "$WORKDIR" ]; then ln -s ./control.py status ln -s ./control.py stop ln -s ./control.py wipe + ln -s ./install.py configure popd >/dev/null 2>&1 cp ./scripts/malcolm_common.py "$MALCOLM_DEST_DIR/scripts/" cp ./scripts/malcolm_kubernetes.py "$MALCOLM_DEST_DIR/scripts/" diff --git a/scripts/configure b/scripts/configure new file mode 120000 index 000000000..7f4fe4b08 --- /dev/null +++ b/scripts/configure @@ -0,0 +1 @@ +install.py \ No newline at end of file diff --git a/scripts/install.py b/scripts/install.py index 895dbb27c..e5e9ab45a 100755 --- a/scripts/install.py +++ b/scripts/install.py @@ -2680,6 +2680,9 @@ def main(): parser.print_help() exit(2) + if os.path.islink(os.path.join(ScriptPath, ScriptName)) and ScriptName.startswith('configure'): + args.configOnly = True + if args.debug: eprint(os.path.join(ScriptPath, ScriptName)) eprint(f"Arguments: {sys.argv[1:]}") diff --git a/scripts/malcolm_appliance_packager.sh b/scripts/malcolm_appliance_packager.sh index 8855e03ef..27a9621d3 100755 --- a/scripts/malcolm_appliance_packager.sh +++ b/scripts/malcolm_appliance_packager.sh @@ -116,6 +116,7 @@ if mkdir "$DESTDIR"; then ln -s ./control.py status ln -s ./control.py stop ln -s ./control.py wipe + ln -s ./install.py configure popd >/dev/null 2>&1 pushd .. >/dev/null 2>&1 DESTNAME="$RUN_PATH/$(basename $DESTDIR).tar.gz"