diff --git a/docs/main.md b/README.md similarity index 70% rename from docs/main.md rename to README.md index 9460bf79b..8db7ab20e 100644 --- a/docs/main.md +++ b/README.md @@ -22,93 +22,93 @@ You can help steer Malcolm's development by sharing your ideas and feedback. Ple ## Table of Contents * [Automated Build Workflows Status](#BuildBadges) -* [Quick start](quickstart.md#QuickStart) - * [Getting Malcolm](#GetMalcolm) - * [User interface](#UserInterfaceURLs) +* [Quick start](docs/quickstart.md#QuickStart) + * [Getting Malcolm](docs/quickstart.md#GetMalcolm) + * [User interface](docs/quickstart.md#UserInterfaceURLs) * [Overview](#Overview) -* [Components](#Components) -* [Supported Protocols](#Protocols) -* [Development](#Development) - * [Building from source](#Build) -* [Pre-Packaged installation files](#Packager) +* [Components](docs/components.md#Components) +* [Supported Protocols](docs/protocols.md#Protocols) +* [Development](docs/development.md#Development) + * [Building from source](docs/development.md#Build) + * [Pre-Packaged installation files](docs/development.md#Packager) * [Preparing your system](#Preparing) * [Recommended system requirements](#SystemRequirements) - * [System configuration and tuning](#ConfigAndTuning) - * [`docker-compose.yml` parameters](#DockerComposeYml) + * [System configuration and tuning](docs/malcolm-config.md#ConfigAndTuning) + * [`docker-compose.yml` parameters](docs/malcolm-config.md#DockerComposeYml) * [Linux host system configuration](#HostSystemConfigLinux) * [macOS host system configuration](#HostSystemConfigMac) * [Windows host system configuration](#HostSystemConfigWindows) * [Running Malcolm](#Running) - * [OpenSearch instances](#OpenSearchInstance) + * [OpenSearch instances](docs/opensearch-instances.md#OpenSearchInstance) * [Authentication and authorization for remote OpenSearch clusters](#OpenSearchAuth) - * [Configure authentication](#AuthSetup) - * [Local account management](#AuthBasicAccountManagement) - * 
[Lightweight Directory Access Protocol (LDAP) authentication](#AuthLDAP) + * [Configure authentication](docs/authsetup.md#AuthSetup) + * [Local account management](docs/authsetup.md#AuthBasicAccountManagement) + * [Lightweight Directory Access Protocol (LDAP) authentication](docs/authsetup.md#AuthLDAP)
 - [LDAP connection security](#AuthLDAPSecurity) - * [TLS certificates](#TLSCerts) - * [Starting Malcolm](#Starting) - * [Stopping and restarting Malcolm](#StopAndRestart) - * [Clearing Malcolm's data](#Wipe) + * [TLS certificates](docs/authsetup.md#TLSCerts) + * [Starting Malcolm](docs/running.md#Starting) + * [Stopping and restarting Malcolm](docs/running.md#StopAndRestart) + * [Clearing Malcolm's data](docs/running.md#Wipe) * [Temporary read-only interface](#ReadOnlyUI) -* [Capture file and log archive upload](#Upload) - - [Tagging](#Tagging) - - [Processing uploaded PCAPs with Zeek and Suricata](#UploadPCAPProcessors) +* [Capture file and log archive upload](docs/upload.md#Upload) + - [Tagging](docs/upload.md#Tagging) + - [Processing uploaded PCAPs with Zeek and Suricata](docs/upload.md#UploadPCAPProcessors) * [Live analysis](#LiveAnalysis) - * [Using a network sensor appliance](#Hedgehog) + * [Using a network sensor appliance](docs/live-analysis.md#Hedgehog) * [Monitoring local network interfaces](#LocalPCAP) * [Manually forwarding logs from an external source](#ExternalForward) * [Arkime](#Arkime) - * [Zeek log integration](#ArkimeZeek) - - [Correlating Zeek logs and Arkime sessions](#ZeekArkimeFlowCorrelation) + * [Zeek log integration](docs/arkime.md#ArkimeZeek) + - [Correlating Zeek logs and Arkime sessions](docs/arkime.md#ZeekArkimeFlowCorrelation) * [Help](#ArkimeHelp) - * [Sessions](#ArkimeSessions) + * [Sessions](docs/arkime.md#ArkimeSessions) * [PCAP Export](#ArkimePCAPExport) * [SPIView](#ArkimeSPIView) - * [SPIGraph](#ArkimeSPIGraph) + * [SPIGraph](docs/arkime.md#ArkimeSPIGraph) * [Connections](#ArkimeConnections) - * [Hunt](#ArkimeHunt) + * 
[Hunt](docs/arkime.md#ArkimeHunt) * [Statistics](#ArkimeStats) * [Settings](#ArkimeSettings) -* [OpenSearch Dashboards](#Dashboards) +* [OpenSearch Dashboards](docs/dashboards.md#Dashboards) * [Discover](#Discover) - [Screenshots](#DiscoverGallery) - * [Visualizations and dashboards](#DashboardsVisualizations) + * [Visualizations and dashboards](docs/dashboards.md#DashboardsVisualizations) - [Prebuilt visualizations and dashboards](#PrebuiltVisualizations) - [Screenshots](#PrebuiltVisualizationsGallery) - - [Building your own visualizations and dashboards](#BuildDashboard) + - [Building your own visualizations and dashboards](docs/dashboards.md#BuildDashboard) + [Screenshots](#NewVisualizationsGallery) -* [Search Queries in Arkime and OpenSearch](#SearchCheatSheet) +* [Search Queries in Arkime and OpenSearch](docs/queries-cheat-sheet.md#SearchCheatSheet) * [Other Malcolm features](#MalcolmFeatures) - - [Automatic file extraction and scanning](#ZeekFileExtraction) - - [Automatic host and subnet name assignment](#HostAndSubnetNaming) - + [IP/MAC address to hostname mapping via `host-map.txt`](#HostNaming) - + [CIDR subnet to network segment name mapping via `cidr-map.txt`](#SegmentNaming) - + [Defining hostname and CIDR subnet names interface](#NameMapUI) + - [Automatic file extraction and scanning](docs/file-scanning.md#ZeekFileExtraction) + - [Automatic host and subnet name assignment](docs/host-and-subnet-mapping.md#HostAndSubnetNaming) + + [IP/MAC address to hostname mapping via `host-map.txt`](docs/host-and-subnet-mapping.md#HostNaming) + + [CIDR subnet to network segment name mapping via `cidr-map.txt`](docs/host-and-subnet-mapping.md#SegmentNaming) + + [Defining hostname and CIDR subnet names interface](docs/host-and-subnet-mapping.md#NameMapUI) + [Applying mapping changes](#ApplyMapping) - - [OpenSearch index management](#IndexManagement) - - [Event severity scoring](#Severity) + - [OpenSearch index management](docs/index-management.md#IndexManagement) + - 
[Event severity scoring](docs/severity.md#Severity) + [Customizing event severity scoring](#SeverityConfig) - - [Zeek Intelligence Framework](#ZeekIntel) - + [STIX™ and TAXII™](#ZeekIntelSTIX) - + [MISP](#ZeekIntelMISP) + - [Zeek Intelligence Framework](docs/zeek-intel.md#ZeekIntel) + + [STIX™ and TAXII™](docs/zeek-intel.md#ZeekIntelSTIX) + + [MISP](docs/zeek-intel.md#ZeekIntelMISP) - [Anomaly Detection](#AnomalyDetection) - - [Alerting](#Alerting) + - [Alerting](docs/alerting.md#Alerting) + [Email Sender Accounts](#AlertingEmail) - - ["Best Guess" Fingerprinting for ICS Protocols](#ICSBestGuess) - - [Asset Management with NetBox](#NetBox) + - ["Best Guess" Fingerprinting for ICS Protocols](docs/ics-best-guess.md#ICSBestGuess) + - [Asset Management with NetBox](docs/netbox.md#NetBox) - [CyberChef](#CyberChef) - - [API](#API) - + [Examples](#APIExamples) + - [API](docs/api.md#API) + + [Examples](docs/api-examples.md#APIExamples) * [Ingesting Third-party Logs](#ThirdPartyLogs) -* [Malcolm installer ISO](#ISO) - * [Installation](#ISOInstallation) +* [Malcolm installer ISO](docs/malcolm-iso.md#ISO) + * [Installation](docs/malcolm-iso.md#ISOInstallation) * [Generating the ISO](#ISOBuild) * [Setup](#ISOSetup) * [Time synchronization](#ConfigTime) - * [Hardening](#Hardening) + * [Hardening](docs/hardening.md#Hardening) * [Compliance Exceptions](#ComplianceExceptions) -* [Installation example using Ubuntu 22.04 LTS](#InstallationExample) -* [Upgrading Malcolm](#UpgradePlan) +* [Installation example using Ubuntu 22.04 LTS](docs/ubuntu-install-example.md#InstallationExample) +* [Upgrading Malcolm](docs/malcolm-upgrade.md#UpgradePlan) * [Modifying or Contributing to Malcolm](#Contributing) * [Forks](#Forks) * [Copyright](#Footer) @@ -116,7 +116,7 @@ You can help steer Malcolm's development by sharing your ideas and feedback. 
Ple ## Automated Builds Status -See [**Building from source**](#Build) to read how you can use GitHub [workflow files](./.github/workflows/) to build Malcolm. +See [**Building from source**](docs/development.md#Build) to read how you can use GitHub [workflow files](./.github/workflows/) to build Malcolm. ![api-build-and-push-ghcr](https://github.com/mmguero-dev/Malcolm/workflows/api-build-and-push-ghcr/badge.svg) ![arkime-build-and-push-ghcr](https://github.com/mmguero-dev/Malcolm/workflows/arkime-build-and-push-ghcr/badge.svg) @@ -142,9 +142,9 @@ See [**Building from source**](#Build) to read how you can use GitHub [workflow ![Malcolm Network Diagram](./images/malcolm_network_diagram.png) -Malcolm processes network traffic data in the form of packet capture (PCAP) files or Zeek logs. A [sensor](#Hedgehog) (packet capture appliance) monitors network traffic mirrored to it over a SPAN port on a network switch or router, or using a network TAP device. [Zeek](https://www.zeek.org/index.html) logs and [Arkime](https://molo.ch/) sessions are generated containing important session metadata from the traffic observed, which are then securely forwarded to a Malcolm instance. Full PCAP files are optionally stored locally on the sensor device for examination later. +Malcolm processes network traffic data in the form of packet capture (PCAP) files or Zeek logs. A [sensor](docs/live-analysis.md#Hedgehog) (packet capture appliance) monitors network traffic mirrored to it over a SPAN port on a network switch or router, or using a network TAP device. [Zeek](https://www.zeek.org/index.html) logs and [Arkime](https://molo.ch/) sessions are generated containing important session metadata from the traffic observed, which are then securely forwarded to a Malcolm instance. Full PCAP files are optionally stored locally on the sensor device for examination later. 
-Malcolm parses the network session data and enriches it with additional lookups and mappings including GeoIP mapping, hardware manufacturer lookups from [organizationally unique identifiers (OUI)](http://standards-oui.ieee.org/oui/oui.txt) in MAC addresses, assigning names to [network segments](#SegmentNaming) and [hosts](#HostNaming) based on user-defined IP address and MAC mappings, performing [TLS fingerprinting](#https://engineering.salesforce.com/tls-fingerprinting-with-ja3-and-ja3s-247362855967), and many others. +Malcolm parses the network session data and enriches it with additional lookups and mappings including GeoIP mapping, hardware manufacturer lookups from [organizationally unique identifiers (OUI)](http://standards-oui.ieee.org/oui/oui.txt) in MAC addresses, assigning names to [network segments](docs/host-and-subnet-mapping.md#SegmentNaming) and [hosts](docs/host-and-subnet-mapping.md#HostNaming) based on user-defined IP address and MAC mappings, performing [TLS fingerprinting](https://engineering.salesforce.com/tls-fingerprinting-with-ja3-and-ja3s-247362855967), and many others. The enriched data is stored in an [OpenSearch](https://opensearch.org/) document store in a format suitable for analysis through two intuitive interfaces: OpenSearch Dashboards, a flexible data visualization plugin with dozens of prebuilt dashboards providing an at-a-glance overview of network protocols; and Arkime, a powerful tool for finding and identifying the network sessions comprising suspected security incidents. These tools can be accessed through a web browser from analyst workstations or for display in a security operations center (SOC). Logs can also optionally be forwarded on to another instance of Malcolm. 
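The JA3 TLS fingerprinting linked above can be illustrated with a short sketch: JA3 joins the decimal values of five TLS ClientHello fields (version, cipher suites, extensions, elliptic curves, point formats) with commas, joins list members with dashes, and MD5-hashes the resulting string. The field values below are illustrative only, not taken from real traffic.

```shell
# JA3 = MD5 over "TLSVersion,Ciphers,Extensions,EllipticCurves,PointFormats",
# list members joined by "-"; these example values are made up
printf '771,4865-4866-4867,0-11-10-35,29-23-24,0' | md5sum | cut -d' ' -f1
```

The output is a 32-character hex digest; identical ClientHello parameters always produce the same fingerprint, which is what makes JA3 useful for grouping TLS clients.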
diff --git a/docs/alerting.md b/docs/alerting.md index 0daa2bffe..a95f193fd 100644 --- a/docs/alerting.md +++ b/docs/alerting.md @@ -2,11 +2,11 @@ Malcolm uses the Alerting plugins for [OpenSearch](https://github.com/opensearch-project/alerting) and [OpenSearch Dashboards](https://github.com/opensearch-project/alerting-dashboards-plugin). See [Alerting](https://opensearch.org/docs/latest/monitoring-plugins/alerting/index/) in the OpenSearch documentation for usage instructions. -A fresh installation of Malcolm configures an example [custom webhook destination](https://opensearch.org/docs/latest/monitoring-plugins/alerting/monitors/#create-destinations) named **Malcolm API Loopback Webhook** that directs the triggered alerts back into the [Malcolm API](#API) to be reindexed as a session record with `event.dataset` set to `alerting`. The corresponding monitor **Malcolm API Loopback Monitor** is disabled by default, as you'll likely want to configure the trigger conditions to suit your needs. These examples are provided to illustrate how triggers and monitors can interact with a custom webhook to process alerts. +A fresh installation of Malcolm configures an example [custom webhook destination](https://opensearch.org/docs/latest/monitoring-plugins/alerting/monitors/#create-destinations) named **Malcolm API Loopback Webhook** that directs the triggered alerts back into the [Malcolm API](api.md#API) to be reindexed as a session record with `event.dataset` set to `alerting`. The corresponding monitor **Malcolm API Loopback Monitor** is disabled by default, as you'll likely want to configure the trigger conditions to suit your needs. These examples are provided to illustrate how triggers and monitors can interact with a custom webhook to process alerts. 
#### Email Sender Accounts -When using an email account to send alerts, you must [authenticate each sender account](https://opensearch.org/docs/latest/monitoring-plugins/alerting/monitors/#authenticate-sender-account) before you can send an email. The [`auth_setup`](#AuthSetup) script can be used to securely store the email account credentials: +When using an email account to send alerts, you must [authenticate each sender account](https://opensearch.org/docs/latest/monitoring-plugins/alerting/monitors/#authenticate-sender-account) before you can send an email. The [`auth_setup`](authsetup.md#AuthSetup) script can be used to securely store the email account credentials: ``` ./scripts/auth_setup @@ -31,4 +31,4 @@ Email alert sender account variables stored: opensearch.alerting.destination.ema (Re)generate internal passwords for NetBox (Y/n): n ``` -This action should only be performed while Malcolm is [stopped](#StopAndRestart): otherwise the credentials will not be stored correctly. \ No newline at end of file +This action should only be performed while Malcolm is [stopped](running.md#StopAndRestart): otherwise the credentials will not be stored correctly. \ No newline at end of file diff --git a/docs/anomaly-detection.md b/docs/anomaly-detection.md index 5991eafb0..d6739be3b 100644 --- a/docs/anomaly-detection.md +++ b/docs/anomaly-detection.md @@ -1,6 +1,6 @@ ### Anomaly Detection -Malcolm uses the Anomaly Detection plugins for [OpenSearch](https://github.com/opensearch-project/anomaly-detection) and [OpenSearch Dashboards](https://github.com/opensearch-project/anomaly-detection-dashboards-plugin) to identify anomalous log data in near real-time using the [Random Cut Forest](https://api.semanticscholar.org/CorpusID:927435) (RCF) algorithm. This can be paired with [Alerting](#Alerting) to automatically notify when anomalies are found. 
See [Anomaly detection](https://opensearch.org/docs/latest/monitoring-plugins/ad/index/) in the OpenSearch documentation for usage instructions on how to create detectors for any of the many fields Malcolm supports. +Malcolm uses the Anomaly Detection plugins for [OpenSearch](https://github.com/opensearch-project/anomaly-detection) and [OpenSearch Dashboards](https://github.com/opensearch-project/anomaly-detection-dashboards-plugin) to identify anomalous log data in near real-time using the [Random Cut Forest](https://api.semanticscholar.org/CorpusID:927435) (RCF) algorithm. This can be paired with [Alerting](alerting.md#Alerting) to automatically notify when anomalies are found. See [Anomaly detection](https://opensearch.org/docs/latest/monitoring-plugins/ad/index/) in the OpenSearch documentation for usage instructions on how to create detectors for any of the many fields Malcolm supports. A fresh installation of Malcolm configures [several detectors](dashboards/anomaly_detectors) for detecting anomalous network traffic: diff --git a/docs/api-aggregations.md b/docs/api-aggregations.md index a03a0c25b..b65f8b70b 100644 --- a/docs/api-aggregations.md +++ b/docs/api-aggregations.md @@ -21,4 +21,4 @@ Examples of `filter` parameter: * `{"event.provider":"zeek","event.dataset":["conn","dns"]}` - "`event.provider` is `zeek` and `event.dataset` is either `conn` or `dns`" * `{"!event.dataset":null}` - "`event.dataset` is set (is not `null`)" -See [Examples](#APIExamples) for more examples of `filter` and corresponding output. \ No newline at end of file +See [Examples](api-examples.md#APIExamples) for more examples of `filter` and corresponding output. 
\ No newline at end of file diff --git a/docs/api-event-logging.md b/docs/api-event-logging.md index 2a8220d21..2c92a9a96 100644 --- a/docs/api-event-logging.md +++ b/docs/api-event-logging.md @@ -2,7 +2,7 @@ `POST` - /mapi/event -A webhook that accepts alert data to be reindexed into OpenSearch as session records for viewing in Malcolm's [dashboards](#Dashboards). See [Alerting](#Alerting) for more details and an example of how this API is used. +A webhook that accepts alert data to be reindexed into OpenSearch as session records for viewing in Malcolm's [dashboards](dashboards.md#Dashboards). See [Alerting](alerting.md#Alerting) for more details and an example of how this API is used.
Example input: diff --git a/docs/arkime.md b/docs/arkime.md index 2dcf5cc04..f1e9743c0 100644 --- a/docs/arkime.md +++ b/docs/arkime.md @@ -20,7 +20,7 @@ The values of records created from Zeek logs can be expanded and viewed like any #### Correlating Zeek logs and Arkime sessions -The Arkime interface displays both Zeek logs and Arkime sessions alongside each other. Using fields common to both data sources, one can [craft queries](#SearchCheatSheet) to filter results matching desired criteria. +The Arkime interface displays both Zeek logs and Arkime sessions alongside each other. Using fields common to both data sources, one can [craft queries](queries-cheat-sheet.md#SearchCheatSheet) to filter results matching desired criteria. A few fields of particular mention that help limit returned results to those Zeek logs and Arkime session records generated from the same network connection are [Community ID](https://github.com/corelight/community-id-spec) (`network.community_id`) and Zeek's [connection UID](https://docs.zeek.org/en/stable/examples/logs/#using-uids) (`zeek.uid`), which Malcolm maps to both Arkime's `rootId` field and the [ECS](https://www.elastic.co/guide/en/ecs/current/ecs-event.html#field-event-id) `event.id` field. @@ -38,7 +38,7 @@ Click the icon of the owl 🦉 in the upper-left hand corner of to access the Ar ### Sessions -The **Sessions** view provides low-level details of the sessions being investigated, whether they be Arkime sessions created from PCAP files or [Zeek logs mapped](#ArkimeZeek) to the Arkime session database schema. +The **Sessions** view provides low-level details of the sessions being investigated, whether they be Arkime sessions created from PCAP files or [Zeek logs mapped](arkime.md#ArkimeZeek) to the Arkime session database schema. ![Arkime's Sessions view](./images/screenshots/arkime_sessions.png) @@ -61,7 +61,7 @@ The sessions table is displayed below the filter controls. 
This table contains t To the left of the column headers are two buttons. The **Toggle visible columns** button, indicated by a grid **⊞** icon, allows toggling which columns are displayed in the sessions table. The **Save or load custom column configuration** button, indicated by a columns **◫** icon, allows saving the current displayed columns or loading previously-saved configurations. This is useful for customizing which columns are displayed when investigating different types of traffic. Column headers can also be clicked to sort the results in the table, and column widths may be adjusted by dragging the separators between column headers. -Details for individual sessions/logs can be expanded by clicking the plus **➕** icon on the left of each row. Each row may contain multiple sections and controls, depending on whether the row represents a Arkime session or a [Zeek log](#ArkimeZeek). Clicking the field names and values in the details sections allows additional filters to be specified or summary lists of unique values to be exported. +Details for individual sessions/logs can be expanded by clicking the plus **➕** icon on the left of each row. Each row may contain multiple sections and controls, depending on whether the row represents an Arkime session or a [Zeek log](arkime.md#ArkimeZeek). Clicking the field names and values in the details sections allows additional filters to be specified or summary lists of unique values to be exported. When viewing Arkime session details (ie., a session generated from a PCAP file), an additional packets section will be visible underneath the metadata sections. When the details of a session of this type are expanded, Arkime will read the packet(s) comprising the session for display here. 
Various controls can be used to adjust how the packet is displayed (enabling **natural** decoding and enabling **Show Images & Files** may produce visually pleasing results), and other options (including PCAP download, carving images and files, applying decoding filters, and examining payloads in [CyberChef](https://github.com/gchq/CyberChef)) are available. @@ -81,7 +81,7 @@ Arkime's **SPI** (**S**ession **P**rofile **I**nformation) **View** provides a q ![Arkime's SPIView](./images/screenshots/arkime_spiview.png) -Click the the plus **➕** icon to the right of a category to expand it. The values for specific fields are displayed by clicking the field description in the field list underneath the category name. The list of field names can be filtered by typing part of the field name in the *Search for fields to display in this category* text input. The **Load All** and **Unload All** buttons can be used to toggle display of all of the fields belonging to that category. Once displayed, a field's name or one of its values may be clicked to provide further actions for filtering or displaying that field or its values. Of particular interest may be the **Open [fieldname] SPI Graph** option when clicking on a field's name. This will open a new tab with the SPI Graph ([see below](#ArkimeSPIGraph)) populated with the field's top values. +Click the plus **➕** icon to the right of a category to expand it. The values for specific fields are displayed by clicking the field description in the field list underneath the category name. The list of field names can be filtered by typing part of the field name in the *Search for fields to display in this category* text input. The **Load All** and **Unload All** buttons can be used to toggle display of all of the fields belonging to that category. Once displayed, a field's name or one of its values may be clicked to provide further actions for filtering or displaying that field or its values. 
Of particular interest may be the **Open [fieldname] SPI Graph** option when clicking on a field's name. This will open a new tab with the SPI Graph ([see below](arkime.md#ArkimeSPIGraph)) populated with the field's top values. Note that because the SPIView page can potentially run many queries, SPIView limits the search domain to seven days (in other words, seven indices, as each index represents one day's worth of data). When using SPIView, you will have best results if you limit your search time frame to less than or equal to seven days. This limit can be adjusted by editing the `spiDataMaxIndices` setting in [config.ini](./etc/arkime/config.ini) and rebuilding the `malcolmnetsec/arkime` docker container. @@ -109,7 +109,7 @@ While the default source and destination fields are *Src IP* and *Dst IP:Dst Por * *Src OUI* and *Dst OUI* (hardware manufacturers) * *Src IP* and *Protocols* -* *Originating Network Segment* and *Responding Network Segment* (see [CIDR subnet to network segment name mapping](#SegmentNaming)) +* *Originating Network Segment* and *Responding Network Segment* (see [CIDR subnet to network segment name mapping](host-and-subnet-mapping.md#SegmentNaming)) * *Originating GeoIP City* and *Responding GeoIP City* or any other combination of these or other fields. @@ -118,7 +118,7 @@ See also Arkime's usage documentation for more information on the [Connections g ### Hunt -Arkime's **Hunt** feature allows an analyst to search within the packets themselves (including payload data) rather than simply searching the session metadata. The search string may be specified using ASCII (with or without case sensitivity), hex codes, or regular expressions. Once a hunt job is complete, matching sessions can be viewed in the [Sessions](#ArkimeSessions) view. +Arkime's **Hunt** feature allows an analyst to search within the packets themselves (including payload data) rather than simply searching the session metadata. 
The search string may be specified using ASCII (with or without case sensitivity), hex codes, or regular expressions. Once a hunt job is complete, matching sessions can be viewed in the [Sessions](arkime.md#ArkimeSessions) view. Clicking the **Create a packet search job** on the Hunt page will allow you to specify the following parameters for a new hunt job: @@ -136,7 +136,7 @@ Once a hunt job is submitted, it will be assigned a unique hunt ID (a long uniqu ![Hunt completed](./images/screenshots/arkime_hunt_finished.png) -Once the hunt job is complete (and a minute or so has passed, as the `huntId` must be added to the matching session records in the database), click the folder **📂** icon on the right side of the hunt job row to open a new [Sessions](#ArkimeSessions) tab with the search bar prepopulated to filter to sessions with packets matching the search criteria. +Once the hunt job is complete (and a minute or so has passed, as the `huntId` must be added to the matching session records in the database), click the folder **📂** icon on the right side of the hunt job row to open a new [Sessions](arkime.md#ArkimeSessions) tab with the search bar prepopulated to filter to sessions with packets matching the search criteria. ![Hunt result sessions](./images/screenshots/arkime_hunt_sessions.png) diff --git a/docs/authsetup.md b/docs/authsetup.md index eb5075304..889394353 100644 --- a/docs/authsetup.md +++ b/docs/authsetup.md @@ -1,12 +1,12 @@ ### Configure authentication -Malcolm requires authentication to access the [user interface](#UserInterfaceURLs). [Nginx](https://nginx.org/) can authenticate users with either local TLS-encrypted HTTP basic authentication or using a remote Lightweight Directory Access Protocol (LDAP) authentication server. +Malcolm requires authentication to access the [user interface](quickstart.md#UserInterfaceURLs). 
[Nginx](https://nginx.org/) can authenticate users with either local TLS-encrypted HTTP basic authentication or using a remote Lightweight Directory Access Protocol (LDAP) authentication server. With the local basic authentication method, user accounts are managed by Malcolm and can be created, modified, and deleted using a [user management web interface](#AccountManagement). This method is suitable in instances where accounts and credentials do not need to be synced across many Malcolm installations. LDAP authentication are managed on a remote directory service, such as a [Microsoft Active Directory Domain Services](https://docs.microsoft.com/en-us/windows-server/identity/ad-ds/get-started/virtual-dc/active-directory-domain-services-overview) or [OpenLDAP](https://www.openldap.org/). -Malcolm's authentication method is defined in the `x-auth-variables` section near the top of the [`docker-compose.yml`](#DockerComposeYml) file with the `NGINX_BASIC_AUTH` environment variable: `true` for local TLS-encrypted HTTP basic authentication, `false` for LDAP authentication. +Malcolm's authentication method is defined in the `x-auth-variables` section near the top of the [`docker-compose.yml`](malcolm-config.md#DockerComposeYml) file with the `NGINX_BASIC_AUTH` environment variable: `true` for local TLS-encrypted HTTP basic authentication, `false` for LDAP authentication. 
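Flipping the `NGINX_BASIC_AUTH` setting can be sketched as follows. The `x-auth-variables` anchor name and the exact whitespace around the colon are assumptions about how the section is laid out, and the edit below operates on a throwaway copy rather than a real `docker-compose.yml`:

```shell
# Create a throwaway sample resembling the x-auth-variables section
# (anchor name and formatting are illustrative, not Malcolm's exact file)
cat > /tmp/compose-auth-sample.yml <<'EOF'
x-auth-variables: &auth-variables
  NGINX_BASIC_AUTH : 'true'
EOF
# 'false' selects LDAP authentication; 'true' selects local basic auth
sed -i "s/NGINX_BASIC_AUTH : 'true'/NGINX_BASIC_AUTH : 'false'/" /tmp/compose-auth-sample.yml
grep NGINX_BASIC_AUTH /tmp/compose-auth-sample.yml
```

In practice, review and edit the real file by hand rather than scripting the change, since surrounding keys may differ between Malcolm releases.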
In either case, you **must** run `./scripts/auth_setup` before starting Malcolm for the first time in order to: @@ -15,21 +15,21 @@ In either case, you **must** run `./scripts/auth_setup` before starting Malcolm * key and certificate files are located in the `nginx/certs/` directory * specify whether or not to (re)generate the self-signed certificates used by a remote log forwarder (see the `BEATS_SSL` environment variable above) * certificate authority, certificate, and key files for Malcolm's Logstash instance are located in the `logstash/certs/` directory - * certificate authority, certificate, and key files to be copied to and used by the remote log forwarder are located in the `filebeat/certs/` directory; if using [Hedgehog Linux](#Hedgehog), these certificates should be copied to the `/opt/sensor/sensor_ctl/logstash-client-certificates` directory on the sensor + * certificate authority, certificate, and key files to be copied to and used by the remote log forwarder are located in the `filebeat/certs/` directory; if using [Hedgehog Linux](live-analysis.md#Hedgehog), these certificates should be copied to the `/opt/sensor/sensor_ctl/logstash-client-certificates` directory on the sensor * specify whether or not to [store the username/password](https://opensearch.org/docs/latest/monitoring-plugins/alerting/monitors/#authenticate-sender-account) for [email alert senders](https://opensearch.org/docs/latest/monitoring-plugins/alerting/monitors/#create-destinations) * these parameters are stored securely in the OpenSearch keystore file `opensearch/opensearch.keystore` ##### Local account management -[`auth_setup`](#AuthSetup) is used to define the username and password for the administrator account. Once Malcolm is running, the administrator account can be used to manage other user accounts via a **Malcolm User Management** page served over HTTPS on port 488 (e.g., [https://localhost:488](https://localhost:488) if you are connecting locally). 
+[`auth_setup`](authsetup.md#AuthSetup) is used to define the username and password for the administrator account. Once Malcolm is running, the administrator account can be used to manage other user accounts via a **Malcolm User Management** page served over HTTPS on port 488 (e.g., [https://localhost:488](https://localhost:488) if you are connecting locally). -Malcolm user accounts can be used to access the [interfaces](#UserInterfaceURLs) of all of its [components](#Components), including Arkime. Arkime uses its own internal database of user accounts, so when a Malcolm user account logs in to Arkime for the first time Malcolm creates a corresponding Arkime user account automatically. This being the case, it is *not* recommended to use the Arkime **Users** settings page or change the password via the **Password** form under the Arkime **Settings** page, as those settings would not be consistently used across Malcolm. +Malcolm user accounts can be used to access the [interfaces](quickstart.md#UserInterfaceURLs) of all of its [components](components.md#Components), including Arkime. Arkime uses its own internal database of user accounts, so when a Malcolm user account logs in to Arkime for the first time Malcolm creates a corresponding Arkime user account automatically. This being the case, it is *not* recommended to use the Arkime **Users** settings page or change the password via the **Password** form under the Arkime **Settings** page, as those settings would not be consistently used across Malcolm. Users may change their passwords via the **Malcolm User Management** page by clicking **User Self Service**. A forgotten password can also be reset via an emailed link, though this requires SMTP server settings to be specified in `htadmin/config.ini` in the Malcolm installation directory. 
#### Lightweight Directory Access Protocol (LDAP) authentication -The [nginx-auth-ldap](https://github.com/kvspb/nginx-auth-ldap) module serves as the interface between Malcolm's [Nginx](https://nginx.org/) web server and a remote LDAP server. When you run [`auth_setup`](#AuthSetup) for the first time, a sample LDAP configuration file is created at `nginx/nginx_ldap.conf`. +The [nginx-auth-ldap](https://github.com/kvspb/nginx-auth-ldap) module serves as the interface between Malcolm's [Nginx](https://nginx.org/) web server and a remote LDAP server. When you run [`auth_setup`](authsetup.md#AuthSetup) for the first time, a sample LDAP configuration file is created at `nginx/nginx_ldap.conf`. ``` # This is a sample configuration for the ldap_server section of nginx.conf. @@ -76,24 +76,24 @@ Authentication over LDAP can be done using one of three ways, [two of which](htt * **LDAPS** - a commonly used (though unofficial and considered deprecated) method in which SSL negotiation takes place before any commands are sent from the client to the server * **Unencrypted** (cleartext) (***not recommended***) -In addition to the `NGINX_BASIC_AUTH` environment variable being set to `false` in the `x-auth-variables` section near the top of the [`docker-compose.yml`](#DockerComposeYml) file, the `NGINX_LDAP_TLS_STUNNEL` and `NGINX_LDAP_TLS_STUNNEL` environment variables are used in conjunction with the values in `nginx/nginx_ldap.conf` to define the LDAP connection security level. Use the following combinations of values to achieve the connection security methods above, respectively: +In addition to the `NGINX_BASIC_AUTH` environment variable being set to `false` in the `x-auth-variables` section near the top of the [`docker-compose.yml`](malcolm-config.md#DockerComposeYml) file, the `NGINX_LDAP_TLS_STUNNEL` environment variable is used in conjunction with the values in `nginx/nginx_ldap.conf` to define the LDAP connection security level. 
Use the following combinations of values to achieve the connection security methods above, respectively: * **StartTLS** - - `NGINX_LDAP_TLS_STUNNEL` set to `true` in [`docker-compose.yml`](#DockerComposeYml) + - `NGINX_LDAP_TLS_STUNNEL` set to `true` in [`docker-compose.yml`](malcolm-config.md#DockerComposeYml) - `url` should begin with `ldap://` and its port should be either the default LDAP port (389) or the default Global Catalog port (3268) in `nginx/nginx_ldap.conf` * **LDAPS** - - `NGINX_LDAP_TLS_STUNNEL` set to `false` in [`docker-compose.yml`](#DockerComposeYml) + - `NGINX_LDAP_TLS_STUNNEL` set to `false` in [`docker-compose.yml`](malcolm-config.md#DockerComposeYml) - `url` should begin with `ldaps://` and its port should be either the default LDAPS port (636) or the default LDAPS Global Catalog port (3269) in `nginx/nginx_ldap.conf` * **Unencrypted** (clear text) (***not recommended***) - - `NGINX_LDAP_TLS_STUNNEL` set to `false` in [`docker-compose.yml`](#DockerComposeYml) + - `NGINX_LDAP_TLS_STUNNEL` set to `false` in [`docker-compose.yml`](malcolm-config.md#DockerComposeYml) - `url` should begin with `ldap://` and its port should be either the default LDAP port (389) or the default Global Catalog port (3268) in `nginx/nginx_ldap.conf` For encrypted connections (whether using **StartTLS** or **LDAPS**), Malcolm will require and verify certificates when one or more trusted CA certificate files are placed in the `nginx/ca-trust/` directory. Otherwise, any certificate presented by the domain server will be accepted. ### TLS certificates -When you [set up authentication](#AuthSetup) for Malcolm a set of unique [self-signed](https://en.wikipedia.org/wiki/Self-signed_certificate) TLS certificates are created which are used to secure the connection between clients (e.g., your web browser) and Malcolm's browser-based interface. 
This is adequate for most Malcolm instances as they are often run locally or on internal networks, although your browser will most likely require you to add a security exception for the certificate the first time you connect to Malcolm. +When you [set up authentication](authsetup.md#AuthSetup) for Malcolm, a set of unique [self-signed](https://en.wikipedia.org/wiki/Self-signed_certificate) TLS certificates is created and used to secure the connection between clients (e.g., your web browser) and Malcolm's browser-based interface. This is adequate for most Malcolm instances as they are often run locally or on internal networks, although your browser will most likely require you to add a security exception for the certificate the first time you connect to Malcolm. Another option is to generate your own certificates (or have them issued to you) and have them placed in the `nginx/certs/` directory. The certificate and key files should be named `cert.pem` and `key.pem`, respectively. -A third possibility is to use a third-party reverse proxy (e.g., [Traefik](https://doc.traefik.io/traefik/) or [Caddy](https://caddyserver.com/docs/quick-starts/reverse-proxy)) to handle the issuance of the certificates for you and to broker the connections between clients and Malcolm. Reverse proxies such as these often implement the [ACME](https://datatracker.ietf.org/doc/html/rfc8555) protocol for domain name authentication and can be used to request certificates from certificate authorities like [Let's Encrypt](https://letsencrypt.org/how-it-works/). In this configuration, the reverse proxy will be encrypting the connections instead of Malcolm, so you'll need to set the `NGINX_SSL` environment variable to `false` in [`docker-compose.yml`](#DockerComposeYml) (or answer `no` to the "Require encrypted HTTPS connections?" question posed by `install.py`). 
If you are setting `NGINX_SSL` to `false`, **make sure** you understand what you are doing and ensure that external connections cannot reach ports over which Malcolm will be communicating without encryption, including verifying your local firewall configuration. \ No newline at end of file +A third possibility is to use a third-party reverse proxy (e.g., [Traefik](https://doc.traefik.io/traefik/) or [Caddy](https://caddyserver.com/docs/quick-starts/reverse-proxy)) to handle the issuance of the certificates for you and to broker the connections between clients and Malcolm. Reverse proxies such as these often implement the [ACME](https://datatracker.ietf.org/doc/html/rfc8555) protocol for domain name authentication and can be used to request certificates from certificate authorities like [Let's Encrypt](https://letsencrypt.org/how-it-works/). In this configuration, the reverse proxy will be encrypting the connections instead of Malcolm, so you'll need to set the `NGINX_SSL` environment variable to `false` in [`docker-compose.yml`](malcolm-config.md#DockerComposeYml) (or answer `no` to the "Require encrypted HTTPS connections?" question posed by `install.py`). If you are setting `NGINX_SSL` to `false`, **make sure** you understand what you are doing and ensure that external connections cannot reach ports over which Malcolm will be communicating without encryption, including verifying your local firewall configuration. \ No newline at end of file diff --git a/docs/components.md b/docs/components.md index b67e914e1..21cf1fdf3 100644 --- a/docs/components.md +++ b/docs/components.md @@ -15,14 +15,14 @@ Malcolm leverages the following excellent open source tools, among others. 
* [ClamAV](https://www.clamav.net/) - an antivirus engine for scanning files extracted by Zeek * [CyberChef](https://github.com/gchq/CyberChef) - a "swiss-army knife" data conversion tool * [jQuery File Upload](https://github.com/blueimp/jQuery-File-Upload) - for uploading PCAP files and Zeek logs for processing -* [List.js](https://github.com/javve/list.js) - for the [host and subnet name mapping](#HostAndSubnetNaming) interface +* [List.js](https://github.com/javve/list.js) - for the [host and subnet name mapping](host-and-subnet-mapping.md#HostAndSubnetNaming) interface * [Docker](https://www.docker.com/) and [Docker Compose](https://docs.docker.com/compose/) - for simple, reproducible deployment of the Malcolm appliance across environments and to coordinate communication between its various components * [NetBox](https://netbox.dev/) - a suite for modeling and documenting modern networks * [PostgreSQL](https://www.postgresql.org/) - a relational database for persisting NetBox's data * [Redis](https://redis.io/) - an in-memory data store for caching NetBox session information * [Nginx](https://nginx.org/) - for HTTPS and reverse proxying Malcolm components * [nginx-auth-ldap](https://github.com/kvspb/nginx-auth-ldap) - an LDAP authentication module for nginx -* [Fluent Bit](https://fluentbit.io/) - for forwarding metrics to Malcolm from [network sensors](#Hedgehog) (packet capture appliances) +* [Fluent Bit](https://fluentbit.io/) - for forwarding metrics to Malcolm from [network sensors](live-analysis.md#Hedgehog) (packet capture appliances) * [Mark Baggett](https://github.com/MarkBaggett)'s [freq](https://github.com/MarkBaggett/freq) - a tool for calculating entropy of strings * [Florian Roth](https://github.com/Neo23x0)'s [Signature-Base](https://github.com/Neo23x0/signature-base) Yara ruleset * These Zeek plugins: diff --git a/docs/contributing-dashboards.md b/docs/contributing-dashboards.md index 835959f76..549bd2d44 100644 --- 
a/docs/contributing-dashboards.md +++ b/docs/contributing-dashboards.md @@ -32,10 +32,10 @@ Visualizations and dashboards can be [easily created](../../README.md#BuildDashb } } ``` -1. Include the new dashboard either by using a [bind mount](#Bind) for the `./dashboards./dashboards/` directory or by [rebuilding](#Build) the `dashboards-helper` Docker image. Dashboards are imported the first time Malcolm starts up. +1. Include the new dashboard either by using a [bind mount](contributing-local-modifications.md#Bind) for the `./dashboards/dashboards/` directory or by [rebuilding](development.md#Build) the `dashboards-helper` Docker image. Dashboards are imported the first time Malcolm starts up. ### OpenSearch Dashboards plugins -The [dashboards.Dockerfile](../../Dockerfiles/dashboards.Dockerfile) installs the OpenSearch Dashboards plugins used by Malcolm (search for `opensearch-dashboards-plugin install` in that file). Additional Dashboards plugins could be installed by modifying this Dockerfile and [rebuilding](#Build) the `dashboards` Docker image. +The [dashboards.Dockerfile](../../Dockerfiles/dashboards.Dockerfile) installs the OpenSearch Dashboards plugins used by Malcolm (search for `opensearch-dashboards-plugin install` in that file). Additional Dashboards plugins could be installed by modifying this Dockerfile and [rebuilding](development.md#Build) the `dashboards` Docker image. Third-party or community plugins developed for Kibana will not install into OpenSearch Dashboards without source code modification. Depending on the plugin, this could range from very simple to very complex. As an illustrative example, the changes that were required to port the Sankey diagram visualization plugin from Kibana to OpenSearch Dashboards compatibility can be [viewed on GitHub](https://github.com/mmguero-dev/osd_sankey_vis/compare/edacf6b...main). 
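Before dropping an exported dashboard JSON file into `./dashboards/dashboards/`, a quick sanity check can catch truncated or hand-edited exports. The top-level keys checked below (`version`, `objects`) are assumptions based on the export format shown above; verify them against an existing file in that directory. A minimal sketch:

```python
# Hypothetical pre-flight check for an exported dashboard JSON file before
# placing it under ./dashboards/dashboards/. The expected top-level keys
# ("version", "objects") are assumptions based on the export format above.
import json

def looks_like_dashboard_export(text: str) -> bool:
    """Return True if the text parses as JSON and has the expected shape."""
    try:
        doc = json.loads(text)
    except json.JSONDecodeError:
        return False
    return isinstance(doc, dict) and "version" in doc and isinstance(doc.get("objects"), list)
```

For example, `looks_like_dashboard_export('{"version": "1.0", "objects": []}')` would pass, while a truncated file or plain text would not.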
diff --git a/docs/contributing-file-scanners.md b/docs/contributing-file-scanners.md index dd3b3aa19..3eb645779 100644 --- a/docs/contributing-file-scanners.md +++ b/docs/contributing-file-scanners.md @@ -1,6 +1,6 @@ ## Carved file scanners -Similar to the [PCAP processing pipeline](#PCAP) described above, new tools can plug into Malcolm's [automatic file extraction and scanning](../../README.md#ZeekFileExtraction) to examine file transfers carved from network traffic. +Similar to the [PCAP processing pipeline](contributing-pcap.md#PCAP) described above, new tools can plug into Malcolm's [automatic file extraction and scanning](../../README.md#ZeekFileExtraction) to examine file transfers carved from network traffic. When Zeek extracts a file it observes being transferred in network traffic, the `file-monitor` container picks up those extracted files and publishes to a [ZeroMQ](https://zeromq.org/) topic that can be subscribed to by any other process that wants to analyze that extracted file. In Malcolm at the time of this writing (as of the [v5.0.0 release](https://github.com/idaholab/Malcolm/releases/tag/v5.0.0)), currently implemented file scanners include ClamAV, YARA, capa and VirusTotal, all of which are managed by the `file-monitor` container. 
The scripts involved in this code are: @@ -9,6 +9,6 @@ When Zeek extracts a file it observes being transfered in network traffic, the ` * [shared/bin/zeek_carve_logger.py](../../shared/bin/zeek_carve_logger.py) - subscribes to `zeek_carve_scanner.py`'s topic and logs hits to a "fake" Zeek signatures.log file which is parsed and ingested by Logstash * [shared/bin/zeek_carve_utils.py](../../shared/bin/zeek_carve_utils.py) - various variables and classes related to carved file scanning -Additional file scanners could either be added to the `file-monitor` service, or to avoid coupling with Malcolm's code you could simply define a new service as instructed in the [Adding a new service](#NewImage) section and write your own scripts to subscribe and publish to the topics as described above. While that might be a bit of hand-waving, these general steps take care of the plumbing around extracting the file and notifying your tool, as well as handling the logging of "hits": you shouldn't have to really edit any *existing* code to add a new carved file scanner. +Additional file scanners could either be added to the `file-monitor` service or, to avoid coupling with Malcolm's code, you could simply define a new service as instructed in the [Adding a new service](contributing-new-image.md#NewImage) section and write your own scripts to subscribe and publish to the topics as described above. While that might be a bit of hand-waving, these general steps take care of the plumbing around extracting the file and notifying your tool, as well as handling the logging of "hits": you shouldn't really have to edit any *existing* code to add a new carved file scanner. The `EXTRACTED_FILE_PIPELINE_DEBUG` and `EXTRACTED_FILE_PIPELINE_DEBUG_EXTRA` environment variables in the `docker-compose` files can be set to `true` to enable verbose debug logging from the output of the Docker containers involved in the carved file processing pipeline. 
\ No newline at end of file diff --git a/docs/contributing-local-modifications.md b/docs/contributing-local-modifications.md index 77aeb5ba5..57e791dca 100644 --- a/docs/contributing-local-modifications.md +++ b/docs/contributing-local-modifications.md @@ -123,4 +123,4 @@ Another method for modifying your local copies of Malcolm's services' containers For example, say you wanted to create a Malcolm container which includes a new dashboard for OpenSearch Dashboards and a new enrichment filter `.conf` file for Logstash. After placing these files under `./dashboards/dashboards` and `./logstash/pipelines/enrichment`, respectively, in your Malcolm working copy, run `./build.sh dashboards-helper logstash` to build just those containers. After the build completes, you can run `docker images` and see you have fresh images for `malcolmnetsec/dashboards-helper` and `malcolmnetsec/logstash-oss`. You may need to review the contents of the [Dockerfiles](../../Dockerfiles) to determine the correct service and filesystem location within that service's Docker image depending on what you're trying to accomplish. -Alternately, if you have forked Malcolm on GitHub, [workflow files](../../.github/workflows/) are provided which contain instructions for GitHub to build the docker images and [sensor](#Hedgehog) and [Malcolm](#ISO) installer ISOs. The resulting images are named according to the pattern `ghcr.io/owner/malcolmnetsec/image:branch` (e.g., if you've forked Malcolm with the github user `romeogdetlevjr`, the `arkime` container built for the `main` would be named `ghcr.io/romeogdetlevjr/malcolmnetsec/arkime:main`). To run your local instance of Malcolm using these images instead of the official ones, you'll need to edit your `docker-compose.yml` file(s) and replace the `image:` tags according to this new pattern, or use the bash helper script `./shared/bin/github_image_helper.sh` to pull and re-tag the images. 
\ No newline at end of file +Alternatively, if you have forked Malcolm on GitHub, [workflow files](../../.github/workflows/) are provided which contain instructions for GitHub to build the Docker images and [sensor](live-analysis.md#Hedgehog) and [Malcolm](malcolm-iso.md#ISO) installer ISOs. The resulting images are named according to the pattern `ghcr.io/owner/malcolmnetsec/image:branch` (e.g., if you've forked Malcolm with the GitHub user `romeogdetlevjr`, the `arkime` container built for the `main` branch would be named `ghcr.io/romeogdetlevjr/malcolmnetsec/arkime:main`). To run your local instance of Malcolm using these images instead of the official ones, you'll need to edit your `docker-compose.yml` file(s) and replace the `image:` tags according to this new pattern, or use the bash helper script `./shared/bin/github_image_helper.sh` to pull and re-tag the images. \ No newline at end of file diff --git a/docs/contributing-logstash.md index a2598d3b9..5020273cc 100644 --- a/docs/contributing-logstash.md +++ b/docs/contributing-logstash.md @@ -2,9 +2,9 @@ ### Parsing a new log data source -Let's continue with the example of the `cooltool` service we added in the [PCAP processors](#PCAP) section above, assuming that `cooltool` generates some textual log files we want to parse and index into Malcolm. +Let's continue with the example of the `cooltool` service we added in the [PCAP processors](contributing-pcap.md#PCAP) section above, assuming that `cooltool` generates some textual log files we want to parse and index into Malcolm. -You'd have configured `cooltool` in your `cooltool.Dockerfile` and its section in the `docker-compose` files to write logs into a subdirectory or subdirectories in a shared folder [bind mounted](#Bind) in such a way that both the `cooltool` and `filebeat` containers can access. 
Referring to the `zeek` container as an example, this is how the `./zeek-logs` folder is handled; both the `filebeat` and `zeek` services have `./zeek-logs` in their `volumes:` section: +You'd have configured `cooltool` in your `cooltool.Dockerfile` and its section in the `docker-compose` files to write logs into a subdirectory or subdirectories in a shared folder [bind mounted](contributing-local-modifications.md#Bind) in such a way that both the `cooltool` and `filebeat` containers can access it. Referring to the `zeek` container as an example, this is how the `./zeek-logs` folder is handled; both the `filebeat` and `zeek` services have `./zeek-logs` in their `volumes:` section: ``` $ grep -P "^( - ./zeek-logs| [\w-]+:)" docker-compose.yml | grep -B1 "zeek-logs" @@ -18,12 +18,12 @@ $ grep -P "^( - ./zeek-logs| [\w-]+:)" docker-compose.yml | grep -B1 "zeek You'll need to provide access to your `cooltool` logs in a similar fashion. -Next, tweak [`filebeat.yml`](../../filebeat/filebeat.yml) by adding a new log input path pointing to the `cooltool` logs to send them along to the `logstash` container. This modified `filebeat.yml` will need to be reflected in the `filebeat` container via [bind mount](#Bind) or by [rebuilding](#Build) it. 
At the time of this writing (as of the [v5.0.0 release](https://github.com/idaholab/Malcolm/releases/tag/v5.0.0)), the Logstash pipelines basically look like this: * input (from `filebeat`) sends logs to 1..*n* **parse pipelines** -* each **parse pipeline** does what it needs to do to parse its logs then sends them to the [**enrichment pipeline**](#LogstashEnrichments) +* each **parse pipeline** does what it needs to do to parse its logs then sends them to the [**enrichment pipeline**](contributing-logstash.md#LogstashEnrichments) * the [**enrichment pipeline**](../../logstash/pipelines/enrichment) performs common lookups to the fields that have been normalized and indexes the logs into the OpenSearch data store So, in order to add a new **parse pipeline** for `cooltool` after tweaking [`filebeat.yml`](../../filebeat/filebeat.yml) as described above, create a `cooltool` directory under [`logstash/pipelines`](../../logstash/pipelines) which follows the same pattern as the `zeek` parse pipeline. This directory will have an input file (tiny), a filter file (possibly large), and an output file (tiny). In your filter file, be sure to set the field [`event.hash`](https://www.elastic.co/guide/en/ecs/master/ecs-event.html#field-event-hash) to a unique value to identify indexed documents in OpenSearch; the [fingerprint filter](https://www.elastic.co/guide/en/logstash/current/plugins-filters-fingerprint.html) may be useful for this. @@ -40,12 +40,12 @@ The following modifications must be made in order for Malcolm to be able to pars * Take care, especially when copy-pasting filter code, that the Zeek delimiter isn't modified from a tab character to a space character (see "*zeek's default delimiter is a literal tab, MAKE SURE YOUR EDITOR DOESN'T SCREW IT UP*" warnings in that file) 1. 
If necessary, perform log normalization in [`logstash/pipelines/zeek/12_zeek_normalize.conf`](../../logstash/pipelines/zeek/12_zeek_normalize.conf) for values like action (`event.action`), result (`event.result`), application protocol version (`network.protocol_version`), etc. 1. If necessary, define conversions for floating point or integer values in [`logstash/pipelines/zeek/13_zeek_convert.conf`](../../logstash/pipelines/zeek/13_zeek_convert.conf) -1. Identify the new fields and add them as described in [Adding new log fields](#NewFields) +1. Identify the new fields and add them as described in [Adding new log fields](contributing-new-log-fields.md#NewFields) ### Enrichments -Malcolm's Logstash instance will do a lot of enrichments for you automatically: see the [enrichment pipeline](../../logstash/pipelines/enrichment), including MAC address to vendor by OUI, GeoIP, ASN, and a few others. In order to take advantage of these enrichments that are already in place, normalize new fields to use the same standardized field names Malcolm uses for things like IP addresses, MAC addresses, etc. You can add your own additional enrichments by creating new `.conf` files containing [Logstash filters](https://www.elastic.co/guide/en/logstash/7.10/filter-plugins.html) in the [enrichment pipeline](../../logstash/pipelines/enrichment) directory and using either of the techniques in the [Local modifications](#LocalMods) section to implement your changes in the `logstash` container +Malcolm's Logstash instance will do a lot of enrichments for you automatically: see the [enrichment pipeline](../../logstash/pipelines/enrichment), including MAC address to vendor by OUI, GeoIP, ASN, and a few others. In order to take advantage of these enrichments that are already in place, normalize new fields to use the same standardized field names Malcolm uses for things like IP addresses, MAC addresses, etc. 
You can add your own additional enrichments by creating new `.conf` files containing [Logstash filters](https://www.elastic.co/guide/en/logstash/7.10/filter-plugins.html) in the [enrichment pipeline](../../logstash/pipelines/enrichment) directory and using either of the techniques in the [Local modifications](contributing-local-modifications.md#LocalMods) section to implement your changes in the `logstash` container ### Logstash plugins -The [logstash.Dockerfile](../../Dockerfiles/logstash.Dockerfile) installs the Logstash plugins used by Malcolm (search for `logstash-plugin install` in that file). Additional Logstash plugins could be installed by modifying this Dockerfile and [rebuilding](#Build) the `logstash` Docker image. \ No newline at end of file +The [logstash.Dockerfile](../../Dockerfiles/logstash.Dockerfile) installs the Logstash plugins used by Malcolm (search for `logstash-plugin install` in that file). Additional Logstash plugins could be installed by modifying this Dockerfile and [rebuilding](development.md#Build) the `logstash` Docker image. \ No newline at end of file diff --git a/docs/contributing-pcap.md b/docs/contributing-pcap.md index b847c0e2e..0f6a1e279 100644 --- a/docs/contributing-pcap.md +++ b/docs/contributing-pcap.md @@ -2,10 +2,10 @@ When a PCAP is uploaded (either through Malcolm's [upload web interface](../../README.md#Upload) or just copied manually into the `./pcap/upload/` directory), the `pcap-monitor` container has a script that picks up those PCAP files and publishes to a [ZeroMQ](https://zeromq.org/) topic that can be subscribed to by any other process that wants to analyze that PCAP. In Malcolm at the time of this writing (as of the [v5.0.0 release](https://github.com/idaholab/Malcolm/releases/tag/v5.0.0)), there are two of those: the `zeek` container and the `arkime` container. 
In Malcolm, they actually both share the [same script](../../shared/bin/pcap_processor.py) to read from that topic and run the PCAP through Zeek and Arkime, respectively. If you're looking for an example to follow, the `zeek` container is the less complicated of the two. So, if you were looking to integrate a new PCAP processing tool into Malcolm (named `cooltool` for this example), the process would be something like: -1. Define your service as instructed in the [Adding a new service](#NewImage) section - * Note how the existing `zeek` and `arkime` services use [bind mounts](#Bind) to access the local `./pcap` directory +1. Define your service as instructed in the [Adding a new service](contributing-new-image.md#NewImage) section + * Note how the existing `zeek` and `arkime` services use [bind mounts](contributing-local-modifications.md#Bind) to access the local `./pcap` directory 1. Write a script (modelled after [the one](../../shared/bin/pcap_processor.py) `zeek` and `arkime` use, if you like) which subscribes to the PCAP topic port (`30441` as defined in [pcap_utils.py](../../shared/bin/pcap_utils.py)) and handles the PCAP files published there, each PCAP file represented by a JSON dictionary with `name`, `tags`, `size`, `type` and `mime` keys (search for `FILE_INFO_` in [pcap_utils.py](../../shared/bin/pcap_utils.py)). This script should be added to and run by your `cooltool.Dockerfile`-generated container. -1. Add whatever other logic needed to get your tool's data into Malcolm, whether by writing it directly info OpenSearch or by sending log files for parsing and enrichment by [Logstash](#Logstash) (especially see the section on [Parsing a new log data source](#LogstashNewSource)) +1. 
Add whatever other logic is needed to get your tool's data into Malcolm, whether by writing it directly into OpenSearch or by sending log files for parsing and enrichment by [Logstash](contributing-logstash.md#Logstash) (especially see the section on [Parsing a new log data source](contributing-logstash.md#LogstashNewSource)) While that might be a bit of hand-waving, these general steps take care of the PCAP processing piece: you shouldn't really have to edit any *existing* code to add a new PCAP processor. You're just creating a new container for the Malcolm appliance that subscribes to the ZeroMQ topic and handles the PCAPs your tool receives. diff --git a/docs/contributing-zeek.md index 7666b5c43..128c1a798 100644 --- a/docs/contributing-zeek.md +++ b/docs/contributing-zeek.md @@ -4,12 +4,12 @@ Some Zeek behavior can be tweaked without having to manually edit configuration files through the use of environment variables: search for `ZEEK` in the [`docker-compose.yml` parameters](../../README.md#DockerComposeYml) section of the documentation. -Other changes to Zeek's behavior could be made by modifying [local.zeek](../../zeek/config/local.zeek) and either using a [bind mount](#Bind) or [rebuilding](#Build) the `zeek` Docker image with the modification. 
See the [Zeek documentation](https://docs.zeek.org/en/master/quickstart.html#local-site-customization) for more information on customizing a Zeek instance. Note that changing Zeek's behavior could result in changes to the format of the logs Zeek generates, which could break Malcolm's parsing of those logs, so exercise caution. ### Adding a new Zeek package -The easiest way to add a new Zeek package to Malcolm is to add the git URL of that package to the `ZKG_GITHUB_URLS` array in [zeek_install_plugins.sh](../../shared/bin/zeek_install_plugins.sh) script and then [rebuilding](#Build) the `zeek` Docker image. This will cause your package to be installed (via the [`zkg`](https://docs.zeek.org/projects/package-manager/en/stable/zkg.html) command-line tool). See [Parsing new Zeek logs](#LogstashZeek) on how to process any new `.log` files if your package generates them. +The easiest way to add a new Zeek package to Malcolm is to add the git URL of that package to the `ZKG_GITHUB_URLS` array in the [zeek_install_plugins.sh](../../shared/bin/zeek_install_plugins.sh) script and then [rebuild](development.md#Build) the `zeek` Docker image. This will cause your package to be installed (via the [`zkg`](https://docs.zeek.org/projects/package-manager/en/stable/zkg.html) command-line tool). See [Parsing new Zeek logs](contributing-logstash.md#LogstashZeek) for how to process any new `.log` files if your package generates them. -### Zeek Intelligence Framework +### Zeek Intelligence Framework See [Zeek Intelligence Framework](../../README.md#ZeekIntel) in the Malcolm README for information on how to use Zeek's [Intelligence Framework](https://docs.zeek.org/en/master/frameworks/intel.html) with Malcolm. 
\ No newline at end of file diff --git a/docs/contributing.md b/docs/contributing.md index 851abe2ea..6c89497ee 100644 --- a/docs/contributing.md +++ b/docs/contributing.md @@ -4,34 +4,24 @@ The purpose of this document is to provide some direction for those willing to m ## Table of Contents -* [Local modifications](#LocalMods) - + [Docker bind mounts](#Bind) - + [Building Malcolm's Docker images](#Build) -* [Adding a new service (Docker image)](#NewImage) +* [Local modifications](contributing-local-modifications.md#LocalMods) + + [Docker bind mounts](contributing-local-modifications.md#Bind) + + [Building Malcolm's Docker images](development.md#Build) +* [Adding a new service (Docker image)](contributing-new-image.md#NewImage) + [Networking and firewall](#NewImageFirewall) -* [Adding new log fields](#NewFields) -- [Zeek](#Zeek) +* [Adding new log fields](contributing-new-log-fields.md#NewFields) +- [Zeek](contributing-zeek.md#Zeek) + [`local.zeek`](#LocalZeek) - + [Adding a new Zeek package](#ZeekPackage) - + [Zeek Intelligence Framework](#ZeekIntel) -* [PCAP processors](#PCAP) -* [Logstash](#Logstash) - + [Parsing a new log data source](#LogstashNewSource) - + [Parsing new Zeek logs](#LogstashZeek) - + [Enrichments](#LogstashEnrichments) + + [Adding a new Zeek package](contributing-zeek.md#ZeekPackage) + + [Zeek Intelligence Framework](#ContributingZeekIntel) +* [PCAP processors](contributing-pcap.md#PCAP) +* [Logstash](contributing-logstash.md#Logstash) + + [Parsing a new log data source](contributing-logstash.md#LogstashNewSource) + + [Parsing new Zeek logs](contributing-logstash.md#LogstashZeek) + + [Enrichments](contributing-logstash.md#LogstashEnrichments) + [Logstash plugins](#LogstashPlugins) * [OpenSearch Dashboards](#dashboards) + [Adding new visualizations and dashboards](#DashboardsNewViz) + [OpenSearch Dashboards plugins](#DashboardsPlugins) * [Carved file scanners](#Scanners) * [Style](#Style) - -## Copyright - 
-[Malcolm](https://github.com/idaholab/Malcolm) is Copyright 2022 Battelle Energy Alliance, LLC, and is developed and released through the cooperation of the [Cybersecurity and Infrastructure Security Agency](https://www.cisa.gov/) of the [U.S. Department of Homeland Security](https://www.dhs.gov/). - -See [`License.txt`](../../License.txt) for the terms of its release. - -### Contact information of author(s): - -[malcolm@inl.gov](mailto:malcolm@inl.gov?subject=Malcolm) diff --git a/docs/cyberchef.md b/docs/cyberchef.md index 6fae108d8..3f8c7c98e 100644 --- a/docs/cyberchef.md +++ b/docs/cyberchef.md @@ -2,4 +2,4 @@ Malcolm provides an instance of [CyberChef](https://github.com/gchq/CyberChef), the "Cyber Swiss Army Knife - a web app for encryption, encoding, compression and data analysis." CyberChef is available at [https://localhost/cyberchef.html](https://localhost/cyberchef.html) if you are connecting locally. -Arkime's [Sessions](#ArkimeSessions) view has built-in CyberChef integration for Arkime sessions with full PCAP payloads available: expanding a session and opening the **Packet Options** drop-down menu in its payload section will provide options for **Open src packets with CyberChef** and **Open dst packets with CyberChef**. \ No newline at end of file +Arkime's [Sessions](arkime.md#ArkimeSessions) view has built-in CyberChef integration for Arkime sessions with full PCAP payloads available: expanding a session and opening the **Packet Options** drop-down menu in its payload section will provide options for **Open src packets with CyberChef** and **Open dst packets with CyberChef**.
\ No newline at end of file diff --git a/docs/development.md b/docs/development.md index 1fd8773b6..e39cc7a02 100644 --- a/docs/development.md +++ b/docs/development.md @@ -9,7 +9,7 @@ Checking out the [Malcolm source code](https://github.com/idaholab/Malcolm/tree/ * `Dockerfiles` - a directory containing build instructions for Malcolm's docker images * `docs` - a directory containing instructions and documentation * `opensearch` - an initially empty directory where the OpenSearch database instance will reside -* `opensearch-backup` - an initially empty directory for storing OpenSearch [index snapshots](#IndexManagement) +* `opensearch-backup` - an initially empty directory for storing OpenSearch [index snapshots](index-management.md#IndexManagement) * `filebeat` - code and configuration for the `filebeat` container which ingests Zeek logs and forwards them to the `logstash` container * `file-monitor` - code and configuration for the `file-monitor` container which can scan files extracted by Zeek * `file-upload` - code and configuration for the `upload` container which serves a web browser-based upload form for uploading PCAP files and Zeek logs, and which serves an SFTP share as an alternate method for upload @@ -17,15 +17,15 @@ Checking out the [Malcolm source code](https://github.com/idaholab/Malcolm/tree/ * `htadmin` - configuration for the `htadmin` user account management container * `dashboards` - code and configuration for the `dashboards` container for creating additional ad-hoc visualizations and dashboards beyond that which is provided by Arkime Viewer * `logstash` - code and configuration for the `logstash` container which parses Zeek logs and forwards them to the `opensearch` container -* `malcolm-iso` - code and configuration for building an [installer ISO](#ISO) for a minimal Debian-based Linux installation for running Malcolm -* `name-map-ui` - code and configuration for the `name-map-ui` container which provides the [host and subnet name 
mapping](#HostAndSubnetNaming) interface +* `malcolm-iso` - code and configuration for building an [installer ISO](malcolm-iso.md#ISO) for a minimal Debian-based Linux installation for running Malcolm +* `name-map-ui` - code and configuration for the `name-map-ui` container which provides the [host and subnet name mapping](host-and-subnet-mapping.md#HostAndSubnetNaming) interface * `netbox` - code and configuration for the `netbox`, `netbox-postgres`, `netbox-redis` and `netbox-redis-cache` containers which provide asset management capabilities * `nginx` - configuration for the `nginx` reverse proxy container * `pcap` - an initially empty directory for PCAP files to be uploaded, processed, and stored * `pcap-capture` - code and configuration for the `pcap-capture` container which can capture network traffic * `pcap-monitor` - code and configuration for the `pcap-monitor` container which watches for new or uploaded PCAP files and notifies the other services to process them * `scripts` - control scripts for starting, stopping, restarting, etc.
Malcolm -* `sensor-iso` - code and configuration for building a [Hedgehog Linux](#Hedgehog) ISO +* `sensor-iso` - code and configuration for building a [Hedgehog Linux](live-analysis.md#Hedgehog) ISO * `shared` - miscellaneous code used by various Malcolm components * `suricata` - code and configuration for the `suricata` container which handles PCAP processing using Suricata * `suricata-logs` - an initially empty directory for Suricata logs to be uploaded, processed, and stored @@ -39,7 +39,7 @@ and the following files of special note: * `host-map.txt` - specify custom IP and/or MAC address to host mapping * `net-map.json` - an alternative to `cidr-map.txt` and `host-map.txt`, mapping hosts and network segments to their names in a JSON-formatted file * `docker-compose.yml` - the configuration file used by `docker-compose` to build, start, and stop an instance of the Malcolm appliance -* `docker-compose-standalone.yml` - similar to `docker-compose.yml`, only used for the ["packaged"](#Packager) installation of Malcolm +* `docker-compose-standalone.yml` - similar to `docker-compose.yml`, only used for the ["packaged"](development.md#Packager) installation of Malcolm ### Building from source @@ -72,7 +72,7 @@ Then, go take a walk or something since it will be a while. When you're done, yo * `malcolmnetsec/suricata` (based on `debian:11-slim`) * `malcolmnetsec/zeek` (based on `debian:11-slim`) -Alternately, if you have forked Malcolm on GitHub, [workflow files](./.github/workflows/) are provided which contain instructions for GitHub to build the docker images and [sensor](#Hedgehog) and [Malcolm](#ISO) installer ISOs. The resulting images are named according to the pattern `ghcr.io/owner/malcolmnetsec/image:branch` (e.g., if you've forked Malcolm with the github user `romeogdetlevjr`, the `arkime` container built for the `main` would be named `ghcr.io/romeogdetlevjr/malcolmnetsec/arkime:main`). 
To run your local instance of Malcolm using these images instead of the official ones, you'll need to edit your `docker-compose.yml` file(s) and replace the `image:` tags according to this new pattern, or use the bash helper script `./shared/bin/github_image_helper.sh` to pull and re-tag the images. +Alternately, if you have forked Malcolm on GitHub, [workflow files](./.github/workflows/) are provided which contain instructions for GitHub to build the docker images and [sensor](live-analysis.md#Hedgehog) and [Malcolm](malcolm-iso.md#ISO) installer ISOs. The resulting images are named according to the pattern `ghcr.io/owner/malcolmnetsec/image:branch` (e.g., if you've forked Malcolm with the GitHub user `romeogdetlevjr`, the `arkime` container built for the `main` branch would be named `ghcr.io/romeogdetlevjr/malcolmnetsec/arkime:main`). To run your local instance of Malcolm using these images instead of the official ones, you'll need to edit your `docker-compose.yml` file(s) and replace the `image:` tags according to this new pattern, or use the bash helper script `./shared/bin/github_image_helper.sh` to pull and re-tag the images. ## Pre-Packaged installation files diff --git a/docs/download.md b/docs/download.md index 180655cda..e7fb9a52c 100644 --- a/docs/download.md +++ b/docs/download.md @@ -22,7 +22,7 @@ While official downloads of the Malcolm installer ISO are not provided, an **unofficial build** of the ISO installer for the latest stable release is available for download here. ### Installer ISO -[Instructions are provided](/hedgehog/#ISOBuild) to generate the Hedgehog Linux ISO from source. While official downloads of the Hedgehog Linux ISO are not provided, an **unofficial build** of the ISO installer for the latest stable release is available for download here. +[Instructions are provided](/hedgehog/#SensorISOBuild) to generate the Hedgehog Linux ISO from source. While official downloads of the Hedgehog Linux ISO are not provided, an **unofficial build** of the ISO installer for the latest stable release is available for download here.
| ISO | SHA256 | |---|---| diff --git a/docs/file-scanning.md b/docs/file-scanning.md index d51926db9..914894eb3 100644 --- a/docs/file-scanning.md +++ b/docs/file-scanning.md @@ -1,6 +1,6 @@ ### Automatic file extraction and scanning -Malcolm can leverage Zeek's knowledge of network protocols to automatically detect file transfers and extract those files from PCAPs as Zeek processes them. This behavior can be enabled globally by modifying the `ZEEK_EXTRACTOR_MODE` [environment variable in `docker-compose.yml`](#DockerComposeYml), or on a per-upload basis for PCAP files uploaded via the [browser-based upload form](#Upload) when **Analyze with Zeek** is selected. +Malcolm can leverage Zeek's knowledge of network protocols to automatically detect file transfers and extract those files from PCAPs as Zeek processes them. This behavior can be enabled globally by modifying the `ZEEK_EXTRACTOR_MODE` [environment variable in `docker-compose.yml`](malcolm-config.md#DockerComposeYml), or on a per-upload basis for PCAP files uploaded via the [browser-based upload form](upload.md#Upload) when **Analyze with Zeek** is selected. 
To specify which files should be extracted, the following values are acceptable in `ZEEK_EXTRACTOR_MODE`: @@ -12,17 +12,17 @@ To specify which files should be extracted, the following values are acceptable Extracted files can be examined through any of the following methods: -* submitting file hashes to [**VirusTotal**](https://www.virustotal.com/en/#search); to enable this method, specify the `VTOT_API2_KEY` [environment variable in `docker-compose.yml`](#DockerComposeYml) -* scanning files with [**ClamAV**](https://www.clamav.net/); to enable this method, set the `EXTRACTED_FILE_ENABLE_CLAMAV` [environment variable in `docker-compose.yml`](#DockerComposeYml) to `true` -* scanning files with [**Yara**](https://github.com/VirusTotal/yara); to enable this method, set the `EXTRACTED_FILE_ENABLE_YARA` [environment variable in `docker-compose.yml`](#DockerComposeYml) to `true` -* scanning PE (portable executable) files with [**Capa**](https://github.com/fireeye/capa); to enable this method, set the `EXTRACTED_FILE_ENABLE_CAPA` [environment variable in `docker-compose.yml`](#DockerComposeYml) to `true` +* submitting file hashes to [**VirusTotal**](https://www.virustotal.com/en/#search); to enable this method, specify the `VTOT_API2_KEY` [environment variable in `docker-compose.yml`](malcolm-config.md#DockerComposeYml) +* scanning files with [**ClamAV**](https://www.clamav.net/); to enable this method, set the `EXTRACTED_FILE_ENABLE_CLAMAV` [environment variable in `docker-compose.yml`](malcolm-config.md#DockerComposeYml) to `true` +* scanning files with [**Yara**](https://github.com/VirusTotal/yara); to enable this method, set the `EXTRACTED_FILE_ENABLE_YARA` [environment variable in `docker-compose.yml`](malcolm-config.md#DockerComposeYml) to `true` +* scanning PE (portable executable) files with [**Capa**](https://github.com/fireeye/capa); to enable this method, set the `EXTRACTED_FILE_ENABLE_CAPA` [environment variable in 
`docker-compose.yml`](malcolm-config.md#DockerComposeYml) to `true` Files which are flagged via any of these methods will be logged as Zeek `signatures.log` entries, and can be viewed in the **Signatures** dashboard in OpenSearch Dashboards. -The `EXTRACTED_FILE_PRESERVATION` [environment variable in `docker-compose.yml`](#DockerComposeYml) determines the behavior for preservation of Zeek-extracted files: +The `EXTRACTED_FILE_PRESERVATION` [environment variable in `docker-compose.yml`](malcolm-config.md#DockerComposeYml) determines the behavior for preservation of Zeek-extracted files: * `quarantined`: preserve only flagged files in `./zeek-logs/extract_files/quarantine` * `all`: preserve flagged files in `./zeek-logs/extract_files/quarantine` and all other extracted files in `./zeek-logs/extract_files/preserved` * `none`: preserve no extracted files -The `EXTRACTED_FILE_HTTP_SERVER_…` [environment variables in `docker-compose.yml`](#DockerComposeYml) configure access to the Zeek-extracted files path through the means of a simple HTTPS directory server. Beware that Zeek-extracted files may contain malware. As such, the files may be optionally encrypted upon download. \ No newline at end of file +The `EXTRACTED_FILE_HTTP_SERVER_…` [environment variables in `docker-compose.yml`](malcolm-config.md#DockerComposeYml) configure access to the Zeek-extracted files path through the means of a simple HTTPS directory server. Beware that Zeek-extracted files may contain malware. As such, the files may be optionally encrypted upon download. \ No newline at end of file diff --git a/docs/hedgehog-boot.md b/docs/hedgehog-boot.md index b189cd1ef..67c8c8478 100644 --- a/docs/hedgehog-boot.md +++ b/docs/hedgehog-boot.md @@ -1,8 +1,8 @@ -# Boot +# Boot Each time the sensor boots, a grub boot menu will be shown briefly, after which the sensor will proceed to load. 
-## Kiosk mode +## Kiosk mode ![Kiosk mode sensor menu: resource statistics](./docs/images/kiosk_mode_sensor_menu.png) diff --git a/docs/hedgehog-config-root.md b/docs/hedgehog-config-root.md index eafe5e2a0..5a6ab068d 100644 --- a/docs/hedgehog-config-root.md +++ b/docs/hedgehog-config-root.md @@ -1,6 +1,6 @@ -## Interfaces, hostname, and time synchronization +## Interfaces, hostname, and time synchronization -### Hostname +### Hostname The first step of sensor configuration is to configure the network interfaces and sensor hostname. Clicking the **Configure Interfaces and Hostname** toolbar icon (or, if you are at a command line prompt, running `configure-interfaces`) will prompt you for the root password you created during installation, after which the configuration welcome screen is shown. Select **Continue** to proceed. @@ -12,7 +12,7 @@ Selecting **Hostname**, you will be presented with a summary of the current sens ![Specifying a new sensor hostname](./docs/images/hostname_setting.png) -### Interfaces +### Interfaces Returning to the configuration mode selection, choose **Interface**. You will be prompted if you would like help identifying network interfaces. If you select **Yes**, you will be prompted to select a network interface, after which that interface's link LED will blink for 10 seconds to help you in its identification. This network interface identification aid will continue to prompt you to identify further network interfaces until you select **No**. @@ -30,7 +30,7 @@ If you select static, you will be prompted to enter the IP address, netmask, and In either case, upon selecting **OK** the network interface will be brought down, configured, and brought back up, and the result of the operation will be displayed. You may choose **Quit** upon returning to the configuration tool's welcome screen. -### Time synchronization +### Time synchronization Returning to the configuration mode selection, choose **Time Sync**. 
Here you can configure the sensor to keep its time synchronized with either an NTP server (using the NTP protocol) or a local [Malcolm](https://github.com/idaholab/Malcolm) aggregator or another HTTP/HTTPS server. On the next dialog, choose the time synchronization method you wish to configure. diff --git a/docs/hedgehog-config-user.md b/docs/hedgehog-config-user.md index 64e2a9ce0..c38c31058 100644 --- a/docs/hedgehog-config-user.md +++ b/docs/hedgehog-config-user.md @@ -1,10 +1,10 @@ -## Capture, forwarding, and autostart services +## Capture, forwarding, and autostart services Clicking the **Configure Capture and Forwarding** toolbar icon (or, if you are at a command prompt, running `configure-capture`) will launch the configuration tool for capture and forwarding. The root password is not required as it was for the interface and hostname configuration, as sensor services are run under the non-privileged sensor account. Select **Continue** to proceed. You may select from a list of configuration options. ![Select configuration mode](./docs/images/capture_config_main.png) -### Capture +### Capture Choose **Configure Capture** to configure parameters related to traffic capture and local analysis. You will be prompted if you would like help identifying network interfaces. If you select **Yes**, you will be prompted to select a network interface, after which that interface's link LED will blink for 10 seconds to help you in its identification. This network interface identification aid will continue to prompt you to identify further network interfaces until you select **No**. 
@@ -20,7 +20,7 @@ Next you must specify the paths where captured PCAP files and logs will be store ![Specify capture paths](./docs/images/capture_paths.png) -#### Automatic file extraction and scanning +#### Automatic file extraction and scanning Hedgehog Linux can leverage Zeek's knowledge of network protocols to automatically detect file transfers and extract those files from network traffic as Zeek sees them. @@ -30,7 +30,7 @@ To specify which files should be extracted, specify the Zeek file carving mode: If you're not sure what to choose, either of **mapped (except common plain text files)** (if you want to carve and scan almost all files) or **interesting** (if you only want to carve and scan files with [mime types of common attack vectors](./interface/sensor_ctl/zeek/extractor_override.interesting.zeek)) is probably a good choice. -Next, specify which carved files to preserve (saved on the sensor under `/capture/bro/capture/extract_files/quarantine` by default). In order to not consume all of the sensor's available storage space, the oldest preserved files will be pruned along with the oldest Zeek logs as described below with **AUTOSTART_PRUNE_ZEEK** in the [autostart services](#ConfigAutostart) section. +Next, specify which carved files to preserve (saved on the sensor under `/capture/bro/capture/extract_files/quarantine` by default). In order to not consume all of the sensor's available storage space, the oldest preserved files will be pruned along with the oldest Zeek logs as described below with **AUTOSTART_PRUNE_ZEEK** in the [autostart services](hedgehog-config-user.md#HedgehogConfigAutostart) section. You'll be prompted to specify which engine(s) to use to analyze extracted files. 
Extracted files can be examined through any of three methods: @@ -47,7 +47,7 @@ Files which are flagged as potentially malicious will be logged as Zeek `signatu Finally, you will be presented with the list of configuration variables that will be used for capture, including the values which you have configured up to this point in this section. Upon choosing **OK** these values will be written back out to the sensor configuration file located at `/opt/sensor/sensor_ctl/control_vars.conf`. It is not recommended that you edit this file manually. After confirming these values, you will be presented with a confirmation that these settings have been written to the configuration file, and you will be returned to the welcome screen. -### Forwarding +### Forwarding Select **Configure Forwarding** to set up forwarding logs and statistics from the sensor to an aggregator server, such as [Malcolm](https://github.com/idaholab/Malcolm). @@ -55,7 +55,7 @@ Select **Configure Forwarding** to set up forwarding logs and statistics from th There are five forwarder services used on the sensor, each for forwarding a different type of log or sensor metric. -### capture: Arkime session forwarding +### capture: Arkime session forwarding [capture](https://github.com/arkime/arkime/tree/master/capture) is not only used to capture PCAP files, but also to parse raw traffic into sessions and forward this session metadata to an [OpenSearch](https://opensearch.org/) database so that it can be viewed in [Arkime viewer](https://arkime.com/), whether standalone or as part of a [Malcolm](https://github.com/idaholab/Malcolm) instance. If you're using Hedgehog Linux with Malcolm, please read [Correlating Zeek logs and Arkime sessions](https://github.com/idaholab/Malcolm#ZeekArkimeFlowCorrelation) in the Malcolm documentation for more information.
@@ -79,7 +79,7 @@ Finally, you'll be given the opportunity to review the all of the Arkime `captur ![capture settings confirmation](./docs/images/arkime_confirm.png) -### filebeat: Zeek and Suricata log forwarding +### filebeat: Zeek and Suricata log forwarding [Filebeat](https://www.elastic.co/products/beats/filebeat) is used to forward [Zeek](https://www.zeek.org/) and [Suricata](https://suricata.io/) logs to a remote [Logstash](https://www.elastic.co/products/logstash) instance for further enrichment prior to insertion into an [OpenSearch](https://opensearch.org/) database. @@ -109,21 +109,21 @@ Once you have specified all of the filebeat parameters, you will be presented wi ![Confirm filebeat settings](./docs/images/filebeat_confirm.png) -### miscbeat: System metrics forwarding +### miscbeat: System metrics forwarding -The sensor uses [Fluent Bit](https://fluentbit.io/) to gather miscellaneous system resource metrics (CPU, network I/O, disk I/O, memory utilization, temperature, etc.) and the [Beats](https://www.elastic.co/guide/en/beats/filebeat/current/filebeat-input-tcp.html) protocol to forward these metrics to a remote [Logstash](https://www.elastic.co/products/logstash) instance for further enrichment prior to insertion into an [OpenSearch](https://opensearch.org/) database. Metrics categories can be enabled/disabled as described in the [autostart services](#ConfigAutostart) section of this document. +The sensor uses [Fluent Bit](https://fluentbit.io/) to gather miscellaneous system resource metrics (CPU, network I/O, disk I/O, memory utilization, temperature, etc.) and the [Beats](https://www.elastic.co/guide/en/beats/filebeat/current/filebeat-input-tcp.html) protocol to forward these metrics to a remote [Logstash](https://www.elastic.co/products/logstash) instance for further enrichment prior to insertion into an [OpenSearch](https://opensearch.org/) database. 
Metrics categories can be enabled/disabled as described in the [autostart services](hedgehog-config-user.md#HedgehogConfigAutostart) section of this document. -This forwarder's configuration is almost identical to that of [filebeat](#filebeat) in the previous section. Select `miscbeat` from the forwarding configuration mode options and follow the same steps outlined above to set up this forwarder. +This forwarder's configuration is almost identical to that of [filebeat](hedgehog-config-user.md#Hedgehogfilebeat) in the previous section. Select `miscbeat` from the forwarding configuration mode options and follow the same steps outlined above to set up this forwarder. -### Autostart services +### Autostart services Once the forwarders have been configured, the final step is to **Configure Autostart Services**. Choose this option from the configuration mode menu after the welcome screen of the sensor configuration tool. Despite configuring capture and/or forwarder services as described in previous sections, only services enabled in the autostart configuration will run when the sensor starts up. The available autostart processes are as follows (recommended services are in **bold text**): -* **AUTOSTART_ARKIME** - [capture](#arkime-capture) PCAP engine for traffic capture, as well as traffic parsing and metadata insertion into OpenSearch for viewing in [Arkime](https://arkime.com/). If you are using Hedgehog Linux along with [Malcolm](https://github.com/idaholab/Malcolm) or another Arkime installation, this is probably the packet capture engine you want to use. +* **AUTOSTART_ARKIME** - [capture](hedgehog-config-user.md#Hedgehogarkime-capture) PCAP engine for traffic capture, as well as traffic parsing and metadata insertion into OpenSearch for viewing in [Arkime](https://arkime.com/). If you are using Hedgehog Linux along with [Malcolm](https://github.com/idaholab/Malcolm) or another Arkime installation, this is probably the packet capture engine you want to use. 
* **AUTOSTART_CLAMAV_UPDATES** - Virus database update service for ClamAV (requires sensor to be connected to the internet) -* **AUTOSTART_FILEBEAT** - [filebeat](#filebeat) Zeek and Suricata log forwarder +* **AUTOSTART_FILEBEAT** - [filebeat](hedgehog-config-user.md#Hedgehogfilebeat) Zeek and Suricata log forwarder * **AUTOSTART_FLUENTBIT_AIDE** - [Fluent Bit](https://fluentbit.io/) agent [monitoring](https://docs.fluentbit.io/manual/pipeline/inputs/exec) [AIDE](https://aide.github.io/) file system integrity checks * **AUTOSTART_FLUENTBIT_AUDITLOG** - [Fluent Bit](https://fluentbit.io/) agent [monitoring](https://docs.fluentbit.io/manual/pipeline/inputs/tail) [auditd](https://man7.org/linux/man-pages/man8/auditd.8.html) logs * *AUTOSTART_FLUENTBIT_KMSG* - [Fluent Bit](https://fluentbit.io/) agent [monitoring](https://docs.fluentbit.io/manual/pipeline/inputs/kernel-logs) the Linux kernel log buffer (these are generally reflected in syslog as well, which may make this agent redundant) diff --git a/docs/hedgehog-config-zeek-intel.md b/docs/hedgehog-config-zeek-intel.md index e50966346..5a10f9660 100644 --- a/docs/hedgehog-config-zeek-intel.md +++ b/docs/hedgehog-config-zeek-intel.md @@ -1,4 +1,4 @@ -### Zeek Intelligence Framework +### Zeek Intelligence Framework To quote Zeek's [Intelligence Framework](https://docs.zeek.org/en/master/frameworks/intel.html) documentation, "The goals of Zeek’s Intelligence Framework are to consume intelligence data, make it available for matching, and provide infrastructure to improve performance and memory utilization. Data in the Intelligence Framework is an atomic piece of intelligence such as an IP address or an e-mail address. This atomic data will be packed with metadata such as a freeform source field, a freeform descriptive field, and a URL which might lead to more information about the specific item." 
Zeek [intelligence](https://docs.zeek.org/en/master/scripts/base/frameworks/intel/main.zeek.html) [indicator types](https://docs.zeek.org/en/master/scripts/base/frameworks/intel/main.zeek.html#type-Intel::Type) include IP addresses, URLs, file names, hashes, email addresses, and more. diff --git a/docs/hedgehog-config.md b/docs/hedgehog-config.md index b37bf2eff..d3881958d 100644 --- a/docs/hedgehog-config.md +++ b/docs/hedgehog-config.md @@ -1,4 +1,4 @@ -# Configuration +# Configuration Kiosk mode can be exited by connecting an external USB keyboard and pressing **Alt+F4**, upon which the *sensor* user's desktop is shown. @@ -13,5 +13,5 @@ Several icons are available in the top menu bar: * **Sensor status** – displays a list with the status of each sensor service * **Configure capture and forwarding** – opens a dialog for configuring the sensor's capture and forwarding services, as well as specifying which services should autostart upon boot * **Configure interfaces and hostname** – opens a dialog for configuring the sensor's network interfaces and setting the sensor's hostname -* **Restart sensor services** - stops and restarts all of the [autostart services](#ConfigAutostart) +* **Restart sensor services** - stops and restarts all of the [autostart services](hedgehog-config-user.md#HedgehogConfigAutostart) diff --git a/docs/hedgehog-hardening.md b/docs/hedgehog-hardening.md index e6214a646..9db45a9b9 100644 --- a/docs/hedgehog-hardening.md +++ b/docs/hedgehog-hardening.md @@ -1,4 +1,4 @@ -# Appendix D - Hardening +# Appendix D - Hardening Hedgehog Linux uses the [harbian-audit](https://github.com/hardenedlinux/harbian-audit) benchmarks which target the following guidelines for establishing a secure configuration posture: @@ -6,7 +6,7 @@ Hedgehog Linux uses the [harbian-audit](https://github.com/hardenedlinux/harbian * [DISA STIG (Security Technical Implementation Guides) for RHEL 7](https://www.stigviewer.com/stig/red_hat_enterprise_linux_7/) v2r5 Ubuntu v1r2 
[adapted](https://github.com/hardenedlinux/STIG-OS-mirror/blob/master/redhat-STIG-DOCs/U_Red_Hat_Enterprise_Linux_7_V2R5_STIG.zip) for a Debian operating system * Additional recommendations from [cisecurity.org](https://www.cisecurity.org/) -## Compliance Exceptions +## Compliance Exceptions [Currently](https://github.com/hardenedlinux/harbian-audit/tree/master/bin/hardening) there are 274 checks to determine compliance with the [harbian-audit](https://github.com/hardenedlinux/harbian-audit) benchmark. diff --git a/docs/hedgehog-installation.md b/docs/hedgehog-installation.md index 4cab7ffd8..491eb99a0 100644 --- a/docs/hedgehog-installation.md +++ b/docs/hedgehog-installation.md @@ -1,6 +1,6 @@ -# Sensor installation +# Sensor installation -## Image boot options +## Image boot options The Hedgehog Linux installation image, when provided on an optical disc, USB thumb drive, or other removable medium, can be used to install or reinstall the sensor software. @@ -9,11 +9,11 @@ The Hedgehog Linux installation image, when provided on an optical disc, USB thu The boot menu of the sensor installer image provides several options: * **Live system** and **Live system (fully in RAM)** may also be used to run the sensor in a "live USB" mode without installing any software or making any persistent configuration changes on the sensor hardware. -* **Install Hedgehog Linux** and **Install Hedgehog Linux (encrypted)** are used to [install the sensor](#Installer) onto the current system. Both selections install the same operating system and sensor software, the only difference being that the **encrypted** option encrypts the hard disks with a password (provided in a subsequent step during installation) that must be provided each time the sensor boots. 
There is some CPU overhead involved in an encrypted installation, so it is recommended that encrypted installations only be used for mobile installations (eg., on a sensor that may be shipped or carried for an incident response) and that the unencrypted option be used for fixed sensors in secure environments. +* **Install Hedgehog Linux** and **Install Hedgehog Linux (encrypted)** are used to [install the sensor](hedgehog-installation.md#HedgehogInstaller) onto the current system. Both selections install the same operating system and sensor software, the only difference being that the **encrypted** option encrypts the hard disks with a password (provided in a subsequent step during installation) that must be provided each time the sensor boots. There is some CPU overhead involved in an encrypted installation, so it is recommended that encrypted installations only be used for mobile installations (eg., on a sensor that may be shipped or carried for an incident response) and that the unencrypted option be used for fixed sensors in secure environments. * **Install Hedgehog Linux (advanced configuration)** allows you to configure installation fully using all of the [Debian installer](https://www.debian.org/releases/stable/amd64/) settings and should only be selected for advanced users who know what they're doing. * **Rescue system** is included for debugging and/or system recovery and should not be needed in most cases. -## Installer +## Installer The sensor installer is designed to require as little user input as possible. For this reason, there are NO user prompts and confirmations about partitioning and reformatting hard disks for use by the sensor. The installer assumes that all non-removable storage media (eg., SSD, HDD, NVMe, etc.) are available for use and ⛔🆘😭💀 ***will partition and format them without warning*** 💀😭🆘⛔. 
diff --git a/docs/hedgehog-iso-build.md b/docs/hedgehog-iso-build.md index 7a8d3f5a9..d92f449cc 100644 --- a/docs/hedgehog-iso-build.md +++ b/docs/hedgehog-iso-build.md @@ -1,4 +1,4 @@ -# Appendix A - Generating the ISO +# Appendix A - Generating the ISO Official downloads of the Hedgehog Linux installer ISO are not provided; however, it can be built easily on an internet-connected Linux host with Vagrant: diff --git a/docs/hedgehog-ssh.md b/docs/hedgehog-ssh.md index fbe5747ab..4a5f515d4 100644 --- a/docs/hedgehog-ssh.md +++ b/docs/hedgehog-ssh.md @@ -1,4 +1,4 @@ -# Appendix B - Configuring SSH access +# Appendix B - Configuring SSH access SSH access to the sensor's non-privileged sensor account is only available using secure key-based authentication, which can be enabled by adding a public SSH key to the **/home/sensor/.ssh/authorized_keys** file as illustrated below: diff --git a/docs/hedgehog-troubleshooting.md b/docs/hedgehog-troubleshooting.md index 7aeaa891a..736f50d40 100644 --- a/docs/hedgehog-troubleshooting.md +++ b/docs/hedgehog-troubleshooting.md @@ -1,4 +1,4 @@ -# Appendix C - Troubleshooting +# Appendix C - Troubleshooting Should the sensor not function as expected, first try rebooting the device. If the behavior continues, here are a few things that may help you diagnose the problem (items which may require Linux command line use are marked with **†**) diff --git a/docs/hedgehog-upgrade.md b/docs/hedgehog-upgrade.md index 301a5d459..b4e39ee3e 100644 --- a/docs/hedgehog-upgrade.md +++ b/docs/hedgehog-upgrade.md @@ -1,8 +1,8 @@ -# Appendix E - Upgrades +# Appendix E - Upgrades At this time there is not an "official" upgrade procedure to get from one release of Hedgehog Linux to the next.
Upgrading the underlying operating system packages is generally straightforward, but not all of the Hedgehog Linux components are packaged into .deb archives yet as they should be, so for now it's a manual (and kind of nasty) process to Frankenstein an upgrade into existence. The author of this project intends to remedy this at some future point when time and resources allow. -If possible, it would save you **a lot** of trouble to just [re-ISO](#Installation) your Hedgehog installation and start fresh, backing up the files (in `/opt/sensor/sensor_ctl`) first and reconfiguring or restoring them as needed afterwards. +If possible, it would save you **a lot** of trouble to just [re-ISO](hedgehog-installation.md#HedgehogInstallation) your Hedgehog installation and start fresh, backing up the files (in `/opt/sensor/sensor_ctl`) first and reconfiguring or restoring them as needed afterwards. However, if reinstalling the system is not an option, here is the basic process for doing a manual upgrade of Hedgehog Linux. It should be understood that this process is very likely to break your system, that there is **no** guarantee of any kind that any of this will work, that these instructions may not even be complete, and that no support whatsoever is provided regarding them. Really, it will be **much** easier if you re-ISO your installation. But for the brave among you, here you go.
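Backing up the `/opt/sensor/sensor_ctl` files mentioned above can be as simple as the following sketch (the archive name and where you copy it are arbitrary choices for illustration, not anything the installer expects):

```shell
# preserve the sensor configuration before wiping and reinstalling;
# copy the resulting archive off of the sensor before re-ISOing
sudo tar czf "sensor_ctl_backup_$(date +%Y-%m-%d).tar.gz" -C /opt/sensor sensor_ctl
```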
⛔🆘😭💀 @@ -10,7 +10,7 @@ However, if reinstalling the system is not an option, here is the basic process * A good understanding of the Linux command line * An existing installation of Hedgehog Linux **with internet access** -* A copy of the Hedgehog Linux [ISO](#ISOBuild) for the version approximating the one you're upgrading to (i.e., the latest version), **and** +* A copy of the Hedgehog Linux [ISO](hedgehog-iso-build.md#HedgehogISOBuild) for the version approximating the one you're upgrading to (i.e., the latest version), **and** - Either a separate VM with that ISO installed **OR** - A separate Linux workstation where you can manually mount that ISO to pull stuff off of it @@ -62,7 +62,7 @@ $ apt-get install $(cat *.list.chroot) * If there were [new Python packages](https://raw.githubusercontent.com/idaholab/Malcolm/master/sensor-iso/config/hooks/normal/0169-pip-installs.hook.chroot) added to this release of Hedgehog Linux (you might have to [manually compare](https://github.com/idaholab/Malcolm/blame/main/sensor-iso/config/hooks/normal/0169-pip-installs.hook.chroot) on GitHub), install them. If you are using a PyPI mirror, replace `XXXXXX` here with your mirror's IP. The `colorama` package is used here as an example; your package list might vary. - `python3 -m pip install --no-compile --no-cache-dir --force-reinstall --upgrade --index-url=https://XXXXXX:443/pypi/simple --trusted-host=XXXXXX:443 colorama` -8. Okay, **now** things start to get a little bit ugly. You're going to need access to the ISO of the release of Hedgehog Linux you're upgrading to, as we're going to grab some packages off of it. On another Linux system, [build it](#ISOBuild). +8. Okay, **now** things start to get a little bit ugly. You're going to need access to the ISO of the release of Hedgehog Linux you're upgrading to, as we're going to grab some packages off of it. On another Linux system, [build it](hedgehog-iso-build.md#HedgehogISOBuild). 9.
Use a disk image mounter to mount the ISO, **or** if you want to just install the ISO in a VM and grab the files we need off of it, that's fine too. But I'll go through the example as if I've mounted the ISO. @@ -315,7 +315,7 @@ sensor@hedgehog:opt$ for BEAT in filebeat miscbeat; do cp /opt/sensor_upgrade_ba sensor@hedgehog:opt$ cp /opt/sensor_upgrade_backup_2020-05-07/sensor_ctl/filebeat/{ca.crt,client.crt,client.key} /opt/sensor/sensor_ctl/logstash-client-certificates/ ``` -24. Despite what we just did, you may consider running `capture-config` to re-configure [capture, forwarding, and autostart services](#ConfigUser) from scratch anyway. You can use the backed-up version of `control_vars.conf` to refer back to as a basis for things you might want to restore (e.g., `CAPTURE_INTERFACE`, `CAPTURE_FILTER`, `PCAP_PATH`, `ZEEK_LOG_PATH`, your autostart settings, etc.). +24. Despite what we just did, you may consider running `capture-config` to re-configure [capture, forwarding, and autostart services](hedgehog-config-user.md#HedgehogConfigUser) from scratch anyway. You can use the backed-up version of `control_vars.conf` as a reference for settings you might want to restore (e.g., `CAPTURE_INTERFACE`, `CAPTURE_FILTER`, `PCAP_PATH`, `ZEEK_LOG_PATH`, your autostart settings, etc.). 25.
Once you feel confident you've completed all of these steps, issue a reboot on the Hedgehog diff --git a/docs/hedgehog.md b/docs/hedgehog.md index e2d008adf..309229398 100644 --- a/docs/hedgehog.md +++ b/docs/hedgehog.md @@ -12,36 +12,36 @@ Hedgehog Linux is a Debian-based operating system built to ![sensor-iso-build-docker-wrap-push-ghcr](https://github.com/idaholab/Malcolm/workflows/sensor-iso-build-docker-wrap-push-ghcr/badge.svg) -### Table of Contents - -* [Sensor installation](#Installation) - - [Image boot options](#BootOptions) - - [Installer](#Installer) -* [Boot](#Boot) - - [Kiosk mode](#KioskMode) -* [Configuration](#Configuration) - - [Interfaces, hostname, and time synchronization](#ConfigRoot) - + [Hostname](#ConfigHostname) - + [Interfaces](#ConfigIface) - + [Time synchronization](#ConfigTime) - - [Capture, forwarding, and autostart services](#ConfigUser) - + [Capture](#ConfigCapture) - * [Automatic file extraction and scanning](#ZeekFileExtraction) - + [Forwarding](#ConfigForwarding) - * [arkime-capture](#arkime-capture): Arkime session forwarding - * [filebeat](#filebeat): Zeek and Suricata log forwarding - * [miscbeat](#miscbeat): System metrics forwarding - + [Autostart services](#ConfigAutostart) -+ [Zeek Intelligence Framework](#ZeekIntel) -* [Appendix A - Generating the ISO](#ISOBuild) -* [Appendix B - Configuring SSH access](#ConfigSSH) -* [Appendix C - Troubleshooting](#Troubleshooting) -* [Appendix D - Hardening](#Hardening) - - [Compliance exceptions](#ComplianceExceptions) -* [Appendix E - Upgrades](#UpgradePlan) -* [Copyright](#Footer) - -# Copyright +### Table of Contents + +* [Sensor installation](hedgehog-installation.md#HedgehogInstallation) + - [Image boot options](#HedgehogBootOptions) + - [Installer](hedgehog-installation.md#HedgehogInstaller) +* [Boot](#HedgehogBoot) + - [Kiosk mode](#HedgehogKioskMode) +* [Configuration](#HedgehogConfiguration) + - [Interfaces, hostname, and time synchronization](#HedgehogConfigRoot) + + 
[Hostname](#HedgehogConfigHostname) + + [Interfaces](#HedgehogConfigIface) + + [Time synchronization](#HedgehogConfigTime) + - [Capture, forwarding, and autostart services](hedgehog-config-user.md#HedgehogConfigUser) + + [Capture](#HedgehogConfigCapture) + * [Automatic file extraction and scanning](#HedgehogZeekFileExtraction) + + [Forwarding](#HedgehogConfigForwarding) + * [arkime-capture](hedgehog-config-user.md#Hedgehogarkime-capture): Arkime session forwarding + * [filebeat](hedgehog-config-user.md#Hedgehogfilebeat): Zeek and Suricata log forwarding + * [miscbeat](#Hedgehogmiscbeat): System metrics forwarding + + [Autostart services](hedgehog-config-user.md#HedgehogConfigAutostart) ++ [Zeek Intelligence Framework](#HedgehogZeekIntel) +* [Appendix A - Generating the ISO](hedgehog-iso-build.md#HedgehogISOBuild) +* [Appendix B - Configuring SSH access](#HedgehogConfigSSH) +* [Appendix C - Troubleshooting](#HedgehogTroubleshooting) +* [Appendix D - Hardening](#HedgehogHardening) + - [Compliance exceptions](#HedgehogComplianceExceptions) +* [Appendix E - Upgrades](#HedgehogUpgradePlan) +* [Copyright](#HedgehogFooter) + +# Copyright Hedgehog Linux - part of [Malcolm](https://github.com/idaholab/Malcolm) - is Copyright 2022 Battelle Energy Alliance, LLC, and is developed and released through the cooperation of the Cybersecurity and Infrastructure Security Agency of the U.S. Department of Homeland Security. diff --git a/docs/host-and-subnet-mapping.md b/docs/host-and-subnet-mapping.md index 9426fca57..436031548 100644 --- a/docs/host-and-subnet-mapping.md +++ b/docs/host-and-subnet-mapping.md @@ -83,6 +83,6 @@ This editor provides the following controls: #### Applying mapping changes -When changes are made to either `cidr-map.txt`, `host-map.txt` or `net-map.json`, Malcolm's Logstash container must be restarted. 
The easiest way to do this is to restart malcolm via `restart` (see [Stopping and restarting Malcolm](#StopAndRestart)) or by clicking the 🔁 **Restart Logstash** button in the [name mapping interface](#NameMapUI) interface. +When changes are made to either `cidr-map.txt`, `host-map.txt` or `net-map.json`, Malcolm's Logstash container must be restarted. The easiest way to do this is to restart Malcolm via `restart` (see [Stopping and restarting Malcolm](running.md#StopAndRestart)) or by clicking the 🔁 **Restart Logstash** button in the [name mapping interface](host-and-subnet-mapping.md#NameMapUI). Restarting Logstash may take several minutes, after which log ingestion will be resumed. \ No newline at end of file diff --git a/docs/host-config-linux.md b/docs/host-config-linux.md new file mode 100644 index 000000000..517c5331a --- /dev/null +++ b/docs/host-config-linux.md @@ -0,0 +1,91 @@ +#### Linux host system configuration + +##### Installing Docker + +Docker installation instructions vary slightly by distribution. Please follow the links below to docker.com to find the instructions specific to your distribution: + +* [Ubuntu](https://docs.docker.com/install/linux/docker-ce/ubuntu/) +* [Debian](https://docs.docker.com/install/linux/docker-ce/debian/) +* [Fedora](https://docs.docker.com/install/linux/docker-ce/fedora/) +* [CentOS](https://docs.docker.com/install/linux/docker-ce/centos/) +* [Binaries](https://docs.docker.com/install/linux/docker-ce/binaries/) + +After installing Docker, because Malcolm should be run as a non-root user, add your user to the `docker` group with something like: +``` +$ sudo usermod -aG docker yourusername +``` + +Following this, either reboot or log out and log back in. + +Docker starts automatically on DEB-based distributions. On RPM-based distributions, you need to start it manually or enable it using the appropriate `systemctl` or `service` command(s).
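On RPM-based distributions running systemd, for example, that might look like the following sketch (assuming the service unit is named `docker`, as it is in current Docker packages):

```shell
# start the Docker daemon immediately and enable it at boot
sudo systemctl enable --now docker
# verify the daemon came up
systemctl is-active docker
```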
+ +You can test docker by running `docker info`, or (assuming you have internet access) `docker run --rm hello-world`. + +##### Installing docker-compose + +Please follow [this link](https://docs.docker.com/compose/install/) on docker.com for instructions on installing docker-compose. + +##### Operating system configuration + +The host system (i.e., the one running Docker) will need to be configured for the [best possible OpenSearch performance](https://www.elastic.co/guide/en/elasticsearch/reference/master/system-config.html). Here are a few suggestions for Linux hosts (these may vary from distribution to distribution): + +* Append the following lines to `/etc/sysctl.conf`: + +``` +# the maximum number of open file handles +fs.file-max=2097152 + +# increase maximums for inotify watches +fs.inotify.max_user_watches=131072 +fs.inotify.max_queued_events=131072 +fs.inotify.max_user_instances=512 + +# the maximum number of memory map areas a process may have +vm.max_map_count=262144 + +# decrease "swappiness" (swapping out runtime memory vs.
dropping pages) +vm.swappiness=1 + +# the maximum number of incoming connections +net.core.somaxconn=65535 + +# the % of system memory fillable with "dirty" pages before flushing +vm.dirty_background_ratio=40 + +# maximum % of dirty system memory before committing everything +vm.dirty_ratio=80 +``` + +* Depending on your distribution, create **either** the file `/etc/security/limits.d/limits.conf` containing: + +``` +# the maximum number of open file handles +* soft nofile 65535 +* hard nofile 65535 +# do not limit the size of memory that can be locked +* soft memlock unlimited +* hard memlock unlimited +``` + +**OR** the file `/etc/systemd/system.conf.d/limits.conf` containing: + +``` +[Manager] +# the maximum number of open file handles +DefaultLimitNOFILE=65535:65535 +# do not limit the size of memory that can be locked +DefaultLimitMEMLOCK=infinity +``` + +* Change the readahead value for the disk where the OpenSearch data will be stored. There are a few ways to do this. For example, you could add this line to `/etc/rc.local` (replacing `/dev/sda` with your disk block descriptor): + +``` +# change disk read-ahead value (# of blocks) +blockdev --setra 512 /dev/sda +``` + +* Change the I/O scheduler to `deadline` or `noop`. Again, this can be done in a variety of ways. The simplest is to add `elevator=deadline` to the arguments in `GRUB_CMDLINE_LINUX` in `/etc/default/grub`, then run `sudo update-grub2`. + +* If you are planning on using very large data sets, consider formatting the drive containing the `opensearch` volume as XFS. + +After making all of these changes, do a reboot for good measure!
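Whether the `sysctl` settings above are applied via `sudo sysctl -p` or a reboot, they can be read back from `/proc` to confirm they took effect; a quick check:

```shell
# read back two of the tuned kernel parameters; once the settings above
# are applied these should report 262144 and 1 respectively
cat /proc/sys/vm/max_map_count
cat /proc/sys/vm/swappiness
```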
\ No newline at end of file diff --git a/docs/host-config-macos.md b/docs/host-config-macos.md new file mode 100644 index 000000000..74b63ef91 --- /dev/null +++ b/docs/host-config-macos.md @@ -0,0 +1,36 @@ +#### macOS host system configuration + +##### Automatic installation using `install.py` + +The `install.py` script will attempt to guide you through the installation of Docker and Docker Compose if they are not present. If that works for you, you can skip ahead to **Configure docker daemon option** in this section. + +##### Install Homebrew + +The easiest way to install and maintain Docker on macOS is using the [Homebrew cask](https://brew.sh). Execute the following in a terminal. + +``` +$ /bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/master/install.sh)" +$ brew install cask +$ brew tap homebrew/cask-versions +``` + +##### Install docker-edge + +``` +$ brew cask install docker-edge +``` +This will install the latest version of docker and docker-compose. It can be upgraded later using `brew` as well: +``` +$ brew cask upgrade --no-quarantine docker-edge +``` +You can now run docker from the Applications folder. + +##### Configure docker daemon option + +Some changes should be made for performance ([this link](http://markshust.com/2018/01/30/performance-tuning-docker-mac) gives a good succinct overview). + +* **Resource allocation** - For a good experience, you likely need at least a quad-core MacBook Pro with 16GB RAM and an SSD. I have run Malcolm on an older 2013 MacBook Pro with 8GB of RAM, but the more the better. In the system tray, select **Docker** → **Preferences** → **Advanced**. Set the resources available to docker to at least 4 CPUs and 8GB of RAM (>= 16GB is preferable). + +* **Volume mount performance** - You can speed up performance of volume mounts by removing unused paths from **Docker** → **Preferences** → **File Sharing**.
For example, if you're only going to be mounting volumes under your home directory, you could share `/Users` but remove other paths. + +After making these changes, right-click on the Docker 🐋 icon in the system tray and select **Restart**. \ No newline at end of file diff --git a/docs/host-config-windows.md b/docs/host-config-windows.md new file mode 100644 index 000000000..428786fc1 --- /dev/null +++ b/docs/host-config-windows.md @@ -0,0 +1,16 @@ +#### Windows host system configuration + +##### Installing and configuring Docker Desktop for Windows + +Installing and configuring [Docker to run under Windows](https://docs.docker.com/desktop/windows/wsl/) must be done manually, rather than through the `install.py` script as is done for Linux and macOS. + +1. Be running Windows 10, version 1903 or higher +1. Prepare your system and [install WSL](https://docs.microsoft.com/en-us/windows/wsl/install) and a Linux distribution by running `wsl --install -d Debian` in PowerShell as Administrator (these instructions are tested with Debian, but may work with other distributions) +1. Install Docker Desktop for Windows either by downloading the installer from the [official Docker site](https://hub.docker.com/editions/community/docker-ce-desktop-windows) or installing it through [chocolatey](https://chocolatey.org/packages/docker-desktop). +1. Follow the [Docker Desktop WSL 2 backend](https://docs.docker.com/desktop/windows/wsl/) instructions to finish configuration and review best practices +1. Reboot +1. Open the WSL distribution's terminal and run `docker info` to make sure Docker is running + +##### Finish Malcolm's configuration + +Once Docker is installed, configured, and running as described in the previous section, run [`./scripts/install.py --configure`](malcolm-config.md#ConfigAndTuning) to finish configuration of the local Malcolm installation. Malcolm will be controlled and run from within your WSL distribution's terminal environment.
\ No newline at end of file diff --git a/docs/ics-best-guess.md b/docs/ics-best-guess.md index c7da9a924..6969706b3 100644 --- a/docs/ics-best-guess.md +++ b/docs/ics-best-guess.md @@ -1,11 +1,11 @@ ### "Best Guess" Fingerprinting for ICS Protocols -There are many ICS (industrial control systems) protocols. While Malcolm's collection of [protocol parsers](#Protocols) includes a number of them, many, particularly those that are proprietary or less common, are unlikely to be supported with a full parser in the foreseeable future. +There are many ICS (industrial control systems) protocols. While Malcolm's collection of [protocol parsers](protocols.md#Protocols) includes a number of them, many, particularly those that are proprietary or less common, are unlikely to be supported with a full parser in the foreseeable future. In an effort to help identify more ICS traffic, Malcolm can use a "best guess" method based on transport protocol (e.g., TCP or UDP) and port(s) to categorize potential traffic communicating over some ICS protocols without full parser support. This feature involves a [mapping table](https://github.com/idaholab/Malcolm/blob/master/zeek/config/guess_ics_map.txt) and a [Zeek script](https://github.com/idaholab/Malcolm/blob/master/zeek/config/guess.zeek) to look up the transport protocol and destination and/or source port to make a best guess at whether a connection belongs to one of those protocols. These potential ICS communications are categorized by vendor where possible.
+Naturally, these lookups could produce false positives, so these connections are displayed in their own dashboard (the **Best Guess** dashboard found under the **ICS** section of Malcolm's [OpenSearch Dashboards](dashboards.md#DashboardsVisualizations) navigation pane). Values such as IP addresses, ports, or UID can be used to [pivot to other dashboards](arkime.md#ZeekArkimeFlowCorrelation) to investigate further. ![](./images/screenshots/dashboards_bestguess.png) -This feature is disabled by default, but it can be enabled by clearing (setting to `''`) the value of the `ZEEK_DISABLE_BEST_GUESS_ICS` environment variable in [`docker-compose.yml`](#DockerComposeYml). \ No newline at end of file +This feature is disabled by default, but it can be enabled by clearing (setting to `''`) the value of the `ZEEK_DISABLE_BEST_GUESS_ICS` environment variable in [`docker-compose.yml`](malcolm-config.md#DockerComposeYml). \ No newline at end of file diff --git a/docs/index-management.md b/docs/index-management.md index 59ec6a92a..ee6ed7f76 100644 --- a/docs/index-management.md +++ b/docs/index-management.md @@ -4,4 +4,4 @@ Malcolm releases prior to v6.2.0 used environment variables to configure OpenSea Since then, OpenSearch Dashboards has developed and released plugins with UIs for [Index State Management](https://opensearch.org/docs/latest/im-plugin/ism/index/) and [Snapshot Management](https://opensearch.org/docs/latest/opensearch/snapshots/sm-dashboards/). Because these plugins provide more comprehensive and user-friendly interfaces for these features, the old environment variable-based configuration code has been removed from Malcolm, with the exception of the code that uses `OPENSEARCH_INDEX_SIZE_PRUNE_LIMIT` and `OPENSEARCH_INDEX_SIZE_PRUNE_NAME_SORT`, which deals with deleting the oldest network session metadata indices when the database exceeds a certain size.
-Note that OpenSearch index state management and snapshot management only deals with disk space consumed by OpenSearch indices: it does not have anything to do with PCAP file storage. The `MANAGE_PCAP_FILES` environment variable in the [`docker-compose.yml`](#DockerComposeYml) file can be used to allow Arkime to prune old PCAP files based on available disk space. \ No newline at end of file +Note that OpenSearch index state management and snapshot management only deal with disk space consumed by OpenSearch indices; they do not have anything to do with PCAP file storage. The `MANAGE_PCAP_FILES` environment variable in the [`docker-compose.yml`](malcolm-config.md#DockerComposeYml) file can be used to allow Arkime to prune old PCAP files based on available disk space. \ No newline at end of file diff --git a/docs/live-analysis.md b/docs/live-analysis.md index 30aa86b85..5f95d5882 100644 --- a/docs/live-analysis.md +++ b/docs/live-analysis.md @@ -13,19 +13,19 @@ Please see the [Hedgehog Linux README](./sensor-iso/README.md) for more informat ### Monitoring local network interfaces -Malcolm's `pcap-capture`, `suricata-live` and `zeek-live` containers can monitor one or more local network interfaces, specified by the `PCAP_IFACE` environment variable in [`docker-compose.yml`](#DockerComposeYml). These containers are started with additional privileges (`IPC_LOCK`, `NET_ADMIN`, `NET_RAW`, and `SYS_ADMIN`) to allow opening network interfaces in promiscuous mode for capture. +Malcolm's `pcap-capture`, `suricata-live` and `zeek-live` containers can monitor one or more local network interfaces, specified by the `PCAP_IFACE` environment variable in [`docker-compose.yml`](malcolm-config.md#DockerComposeYml). These containers are started with additional privileges (`IPC_LOCK`, `NET_ADMIN`, `NET_RAW`, and `SYS_ADMIN`) to allow opening network interfaces in promiscuous mode for capture.
-The instances of Zeek and Suricata (in the `suricata-live` and `zeek-live` containers when the `SURICATA_LIVE_CAPTURE` and `ZEEK_LIVE_CAPTURE` environment variables in [`docker-compose.yml`](#DockerComposeYml) are set to `true`, respectively) analyze traffic on-the-fly and generate log files containing network session metadata. These log files are in turn scanned by Filebeat and forwarded to Logstash for enrichment and indexing into the OpenSearch document store. +The instances of Zeek and Suricata (in the `suricata-live` and `zeek-live` containers when the `SURICATA_LIVE_CAPTURE` and `ZEEK_LIVE_CAPTURE` environment variables in [`docker-compose.yml`](malcolm-config.md#DockerComposeYml) are set to `true`, respectively) analyze traffic on-the-fly and generate log files containing network session metadata. These log files are in turn scanned by Filebeat and forwarded to Logstash for enrichment and indexing into the OpenSearch document store. -In contrast, the `pcap-capture` container buffers traffic to PCAP files and periodically rotates these files for processing (by Arkime's `capture` utlity in the `arkime` container) according to the thresholds defined by the `PCAP_ROTATE_MEGABYTES` and `PCAP_ROTATE_MINUTES` environment variables in [`docker-compose.yml`](#DockerComposeYml). If for some reason (e.g., a low resources environment) you also want Zeek and Suricata to process these intermediate PCAP files rather than monitoring the network interfaces directly, you can set `SURICATA_ROTATED_PCAP`/`ZEEK_ROTATED_PCAP` to `true` and `SURICATA_LIVE_CAPTURE`/`ZEEK_LIVE_CAPTURE` to false. +In contrast, the `pcap-capture` container buffers traffic to PCAP files and periodically rotates these files for processing (by Arkime's `capture` utility in the `arkime` container) according to the thresholds defined by the `PCAP_ROTATE_MEGABYTES` and `PCAP_ROTATE_MINUTES` environment variables in [`docker-compose.yml`](malcolm-config.md#DockerComposeYml).
If for some reason (e.g., a low-resource environment) you also want Zeek and Suricata to process these intermediate PCAP files rather than monitoring the network interfaces directly, you can set `SURICATA_ROTATED_PCAP`/`ZEEK_ROTATED_PCAP` to `true` and `SURICATA_LIVE_CAPTURE`/`ZEEK_LIVE_CAPTURE` to `false`. -These various options for monitoring traffic on local network interfaces can also be configured by running [`./scripts/install.py --configure`](#ConfigAndTuning). +These various options for monitoring traffic on local network interfaces can also be configured by running [`./scripts/install.py --configure`](malcolm-config.md#ConfigAndTuning). Note that currently Microsoft Windows and Apple macOS platforms run Docker inside of a virtualized environment. Live traffic capture and analysis on those platforms would require additional configuration of virtual interfaces and port forwarding in Docker, which is outside of the scope of this document. ### Manually forwarding logs from an external source -Malcolm's Logstash instance can also be configured to accept logs from a [remote forwarder](https://www.elastic.co/products/beats/filebeat) by running [`./scripts/install.py --configure`](#ConfigAndTuning) and answering "yes" to "`Expose Logstash port to external hosts?`." Enabling encrypted transport of these logs files is discussed in [Configure authentication](#AuthSetup) and the description of the `BEATS_SSL` environment variable in the [`docker-compose.yml`](#DockerComposeYml) file. +Malcolm's Logstash instance can also be configured to accept logs from a [remote forwarder](https://www.elastic.co/products/beats/filebeat) by running [`./scripts/install.py --configure`](malcolm-config.md#ConfigAndTuning) and answering "yes" to "`Expose Logstash port to external hosts?`."
Enabling encrypted transport of these log files is discussed in [Configure authentication](authsetup.md#AuthSetup) and the description of the `BEATS_SSL` environment variable in the [`docker-compose.yml`](malcolm-config.md#DockerComposeYml) file. Configuring Filebeat to forward Zeek logs to Malcolm might look something like this example [`filebeat.yml`](https://www.elastic.co/guide/en/beats/filebeat/current/filebeat-reference-yml.html): ``` diff --git a/docs/malcolm-config.md b/docs/malcolm-config.md new file mode 100644 index 000000000..f3a0d2590 --- /dev/null +++ b/docs/malcolm-config.md @@ -0,0 +1,77 @@ +### System configuration and tuning + +If you already have Docker and Docker Compose installed, the `install.py` script can still help you tune system configuration and `docker-compose.yml` parameters for Malcolm. To run it in "configuration only" mode, bypassing the steps to install Docker and Docker Compose, run it like this: +``` +./scripts/install.py --configure +``` + +Although `install.py` will attempt to automate many of the following configuration and tuning parameters, they are nonetheless listed in the following sections for reference: + +#### `docker-compose.yml` parameters + +Edit `docker-compose.yml` and search for the `OPENSEARCH_JAVA_OPTS` key. Edit the `-Xms4g -Xmx4g` values, replacing `4g` with a number that is half of your total system memory, or just under 32 gigabytes, whichever is less. So, for example, if I had 64 gigabytes of memory I would edit those values to be `-Xms31g -Xmx31g`. This indicates how much memory can be allocated to the OpenSearch heaps. For a pleasant experience, I would suggest not using a value under 10 gigabytes. Similar values can be modified for Logstash with `LS_JAVA_OPTS`, where using 3 or 4 gigabytes is recommended. + +Various other environment variables inside of `docker-compose.yml` can be tweaked to control aspects of how Malcolm behaves, particularly with regards to processing PCAP files and Zeek logs.
The environment variables of particular interest are located near the top of that file under **Commonly tweaked configuration options**, which include: + +* `ARKIME_ANALYZE_PCAP_THREADS` – the number of threads available to Arkime for analyzing PCAP files (default `1`) +* `AUTO_TAG` – if set to `true`, Malcolm will automatically create Arkime sessions and Zeek logs with tags based on the filename, as described in [Tagging](upload.md#Tagging) (default `true`) +* `BEATS_SSL` – if set to `true`, Logstash will require encrypted communications for any external [Beats](https://www.elastic.co/guide/en/logstash/current/plugins-inputs-beats.html)-based forwarders from which it will accept logs (default `true`) +* `CONNECTION_SECONDS_SEVERITY_THRESHOLD` - when [severity scoring](severity.md#Severity) is enabled, this variable indicates the duration threshold (in seconds) for assigning severity to long connections (default `3600`) +* `EXTRACTED_FILE_CAPA_VERBOSE` – if set to `true`, all Capa rule hits will be logged; otherwise (`false`) only [MITRE ATT&CK® technique](https://attack.mitre.org/techniques) classifications will be logged +* `EXTRACTED_FILE_ENABLE_CAPA` – if set to `true`, [Zeek-extracted files](file-scanning.md#ZeekFileExtraction) that are determined to be PE (portable executable) files will be scanned with [Capa](https://github.com/fireeye/capa) +* `EXTRACTED_FILE_ENABLE_CLAMAV` – if set to `true`, [Zeek-extracted files](file-scanning.md#ZeekFileExtraction) will be scanned with [ClamAV](https://www.clamav.net/) +* `EXTRACTED_FILE_ENABLE_YARA` – if set to `true`, [Zeek-extracted files](file-scanning.md#ZeekFileExtraction) will be scanned with [Yara](https://github.com/VirusTotal/yara) +* `EXTRACTED_FILE_HTTP_SERVER_ENABLE` – if set to `true`, the directory containing [Zeek-extracted files](file-scanning.md#ZeekFileExtraction) will be served over HTTP at `./extracted-files/` (e.g., [https://localhost/extracted-files/](https://localhost/extracted-files/) if you
are connecting locally) +* `EXTRACTED_FILE_HTTP_SERVER_ENCRYPT` – if set to `true`, those Zeek-extracted files will be AES-256-CBC-encrypted in an `openssl enc`-compatible format (e.g., `openssl enc -aes-256-cbc -d -in example.exe.encrypted -out example.exe`) +* `EXTRACTED_FILE_HTTP_SERVER_KEY` – specifies the AES-256-CBC decryption password for encrypted Zeek-extracted files; used in conjunction with `EXTRACTED_FILE_HTTP_SERVER_ENCRYPT` +* `EXTRACTED_FILE_IGNORE_EXISTING` – if set to `true`, files already present in the `./zeek-logs/extract_files/` directory will be ignored on startup rather than scanned +* `EXTRACTED_FILE_PRESERVATION` – determines behavior for preservation of [Zeek-extracted files](file-scanning.md#ZeekFileExtraction) +* `EXTRACTED_FILE_UPDATE_RULES` – if set to `true`, file scanner engines (e.g., ClamAV, Capa, Yara) will periodically update their rule definitions +* `EXTRACTED_FILE_YARA_CUSTOM_ONLY` – if set to `true`, Malcolm will bypass the default [Yara ruleset](https://github.com/Neo23x0/signature-base) and use only user-defined rules in `./yara/rules` +* `FREQ_LOOKUP` - if set to `true`, domain names (from DNS queries and SSL server names) will be assigned entropy scores as calculated by [`freq`](https://github.com/MarkBaggett/freq) (default `false`) +* `FREQ_SEVERITY_THRESHOLD` - when [severity scoring](severity.md#Severity) is enabled, this variable indicates the entropy threshold for assigning severity to events with entropy scores calculated by [`freq`](https://github.com/MarkBaggett/freq); a lower value will assign severity scores only to domain names with very high entropy (e.g., `2.0` for `NQZHTFHRMYMTVBQJE.COM`), while a higher value will assign severity scores to more domain names, including those with lower entropy (e.g., `7.5` for `naturallanguagedomain.example.org`) (default `2.0`) +* `LOGSTASH_OUI_LOOKUP` – if set to `true`, Logstash will map MAC addresses to vendors for all source and destination MAC addresses when analyzing Zeek logs (default `true`) 
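The `openssl enc` decryption recipe mentioned above for `EXTRACTED_FILE_HTTP_SERVER_ENCRYPT` can be sanity-checked with a round trip on a scratch file; the password `infected` below is purely illustrative (in a real deployment the password is whatever was set via `EXTRACTED_FILE_HTTP_SERVER_KEY`):

```shell
# Encrypt a scratch file with AES-256-CBC, then decrypt it using the same
# `openssl enc` invocation style shown above ('infected' is a placeholder
# password, not a Malcolm default):
echo 'scratch file contents' > example.bin
openssl enc -aes-256-cbc -pass pass:infected -in example.bin -out example.bin.encrypted
openssl enc -aes-256-cbc -d -pass pass:infected -in example.bin.encrypted -out example.decrypted
cmp example.bin example.decrypted && echo 'round trip OK'
```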
+* `LOGSTASH_REVERSE_DNS` – if set to `true`, Logstash will perform a reverse DNS lookup for all external source and destination IP address values when analyzing Zeek logs (default `false`) +* `LOGSTASH_SEVERITY_SCORING` - if set to `true`, Logstash will perform [severity scoring](severity.md#Severity) when analyzing Zeek logs (default `true`) +* `MANAGE_PCAP_FILES` – if set to `true`, all PCAP files imported into Malcolm will be marked as available for deletion by Arkime if available storage space becomes too low (default `false`) +* `MAXMIND_GEOIP_DB_LICENSE_KEY` - Malcolm uses MaxMind's free GeoLite2 databases for GeoIP lookups. As of December 30, 2019, these databases are [no longer available](https://blog.maxmind.com/2019/12/18/significant-changes-to-accessing-and-using-geolite2-databases/) for download via a public URL. Instead, they must be downloaded using a MaxMind license key (available without charge [from MaxMind](https://www.maxmind.com/en/geolite2/signup)). The license key can be specified here for GeoIP database downloads during build- and run-time. 
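These variables live in the `environment:` sections of services in `docker-compose.yml`, so a plain `grep` is a quick way to check how one of them is currently set. The sample file below is a hypothetical, minimal excerpt (service name and values are illustrative only, not Malcolm's actual layout):

```shell
# Create a hypothetical excerpt of a docker-compose.yml for illustration:
cat > compose-sample.yml <<'EOF'
services:
  upload:
    environment:
      AUTO_TAG: 'true'
      MANAGE_PCAP_FILES: 'false'
EOF
# Show the current values of a couple of commonly tweaked options:
grep -E 'AUTO_TAG|MANAGE_PCAP_FILES' compose-sample.yml
```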
+* `OPENSEARCH_LOCAL` - if set to `true`, Malcolm will use its own internal [OpenSearch instance](opensearch-instances.md#OpenSearchInstance) (default `true`) +* `OPENSEARCH_URL` - when using Malcolm's internal OpenSearch instance (i.e., `OPENSEARCH_LOCAL` is `true`) this should be `http://opensearch:9200`, otherwise this value specifies the primary remote instance URL in the format `protocol://host:port` (default `http://opensearch:9200`) +* `OPENSEARCH_SSL_CERTIFICATE_VERIFICATION` - if set to `true`, connections to the primary remote OpenSearch instance will require full TLS certificate validation (this may fail if using self-signed certificates) (default `false`) +* `OPENSEARCH_SECONDARY` - if set to `true`, Malcolm will forward logs to a secondary remote OpenSearch instance in addition to the primary (local or remote) OpenSearch instance (default `false`) +* `OPENSEARCH_SECONDARY_URL` - when forwarding to a secondary remote OpenSearch instance (i.e., `OPENSEARCH_SECONDARY` is `true`) this value specifies the secondary remote instance URL in the format `protocol://host:port` +* `OPENSEARCH_SECONDARY_SSL_CERTIFICATE_VERIFICATION` - if set to `true`, connections to the secondary remote OpenSearch instance will require full TLS certificate validation (this may fail if using self-signed certificates) (default `false`) +* `NETBOX_DISABLED` - if set to `true`, Malcolm will **not** start and manage a [NetBox](netbox.md#NetBox) instance (default `true`) +* `NETBOX_CRON` - if set to `true`, network traffic metadata will periodically be queried and used to populate Malcolm's [NetBox](netbox.md#NetBox) instance +* `NGINX_BASIC_AUTH` - if set to `true`, use [TLS-encrypted HTTP basic](authsetup.md#AuthBasicAccountManagement) authentication (default); if set to `false`, use [Lightweight Directory Access Protocol (LDAP)](authsetup.md#AuthLDAP) authentication +* `NGINX_LOG_ACCESS_AND_ERRORS` - if set to `true`, all access to Malcolm via its [web 
interfaces](quickstart.md#UserInterfaceURLs) will be logged to OpenSearch (default `false`) +* `NGINX_SSL` - if set to `true`, require HTTPS connections to Malcolm's `nginx-proxy` container (default); if set to `false`, use unencrypted HTTP connections (using unsecured HTTP connections is **NOT** recommended unless you are running Malcolm behind another reverse proxy like Traefik, Caddy, etc.) +* `PCAP_ENABLE_NETSNIFF` – if set to `true`, Malcolm will capture network traffic on the local network interface(s) indicated in `PCAP_IFACE` using [netsniff-ng](http://netsniff-ng.org/) +* `PCAP_ENABLE_TCPDUMP` – if set to `true`, Malcolm will capture network traffic on the local network interface(s) indicated in `PCAP_IFACE` using [tcpdump](https://www.tcpdump.org/); there is no reason to enable *both* `PCAP_ENABLE_NETSNIFF` and `PCAP_ENABLE_TCPDUMP` +* `PCAP_FILTER` – specifies a tcpdump-style filter expression for local packet capture; leave blank to capture all traffic +* `PCAP_IFACE` – used to specify the network interface(s) for local packet capture if `PCAP_ENABLE_NETSNIFF`, `PCAP_ENABLE_TCPDUMP`, `ZEEK_LIVE_CAPTURE` or `SURICATA_LIVE_CAPTURE` are enabled; for multiple interfaces, separate the interface names with a comma (e.g., `'enp0s25'` or `'enp10s0,enp11s0'`) +* `PCAP_IFACE_TWEAK` - if set to `true`, Malcolm will [use `ethtool`](shared/bin/nic-capture-setup.sh) to disable NIC hardware offloading features and adjust ring buffer sizes for capture interface(s); this should be `true` if the interface(s) are being used for capture only, `false` if they are being used for management/communication +* `PCAP_ROTATE_MEGABYTES` – used to specify how large a locally-captured PCAP file can become (in megabytes) before it is closed for processing and a new PCAP file created +* `PCAP_ROTATE_MINUTES` – used to specify a time interval (in minutes) after which a locally-captured PCAP file will be closed for processing and a new PCAP file created +* `pipeline.workers`, 
`pipeline.batch.size` and `pipeline.batch.delay` - these settings are used to tune the performance and resource utilization of the `logstash` container; see [Tuning and Profiling Logstash Performance](https://www.elastic.co/guide/en/logstash/current/tuning-logstash.html), [`logstash.yml`](https://www.elastic.co/guide/en/logstash/current/logstash-settings-file.html) and [Multiple Pipelines](https://www.elastic.co/guide/en/logstash/current/multiple-pipelines.html) +* `PUID` and `PGID` - Docker runs all of its containers as the privileged `root` user by default. For better security, Malcolm immediately drops to non-privileged user accounts for executing internal processes wherever possible. The `PUID` (**p**rocess **u**ser **ID**) and `PGID` (**p**rocess **g**roup **ID**) environment variables allow Malcolm to map internal non-privileged user accounts to a corresponding [user account](https://en.wikipedia.org/wiki/User_identifier) on the host. +* `SENSITIVE_COUNTRY_CODES` - when [severity scoring](severity.md#Severity) is enabled, this variable defines a comma-separated list of sensitive countries (using [ISO 3166-1 alpha-2 codes](https://en.wikipedia.org/wiki/ISO_3166-1_alpha-2#Current_codes)) (default `'AM,AZ,BY,CN,CU,DZ,GE,HK,IL,IN,IQ,IR,KG,KP,KZ,LY,MD,MO,PK,RU,SD,SS,SY,TJ,TM,TW,UA,UZ'`, taken from the U.S. Department of Energy Sensitive Country List) +* `SURICATA_AUTO_ANALYZE_PCAP_FILES` – if set to `true`, all PCAP files imported into Malcolm will automatically be analyzed by Suricata, and the resulting logs will also be imported (default `false`) +* `SURICATA_AUTO_ANALYZE_PCAP_THREADS` – the number of threads available to Malcolm for analyzing Suricata logs (default `1`) +* `SURICATA_CUSTOM_RULES_ONLY` – if set to `true`, Malcolm will bypass the default [Suricata ruleset](https://github.com/OISF/suricata/tree/master/rules) and use only user-defined rules (`./suricata/rules/*.rules`). 
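For the `PUID`/`PGID` mapping described above, the values usually come from the non-root host account that will run Malcolm; the standard coreutils `id` command reports them. This is a sketch for finding the values, not an official Malcolm script (`install.py --configure` normally takes care of recording them):

```shell
# Report the numeric user and group IDs of the current host account;
# these are the values that would be supplied for PUID and PGID:
PUID="$(id -u)"
PGID="$(id -g)"
echo "PUID=${PUID} PGID=${PGID}"
```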
+* `SURICATA_UPDATE_RULES` – if set to `true`, Suricata signatures will periodically be updated (default `false`) +* `SURICATA_LIVE_CAPTURE` - if set to `true`, Suricata will monitor live traffic on the local interface(s) defined by `PCAP_IFACE` +* `SURICATA_ROTATED_PCAP` - if set to `true`, Suricata can analyze PCAP files captured by `netsniff-ng` or `tcpdump` (see `PCAP_ENABLE_NETSNIFF` and `PCAP_ENABLE_TCPDUMP`, as well as `SURICATA_AUTO_ANALYZE_PCAP_FILES`); if `SURICATA_LIVE_CAPTURE` is `true`, this should be `false`, otherwise Suricata will see duplicate traffic +* `SURICATA_…` - the [`suricata` container entrypoint script](shared/bin/suricata_config_populate.py) can use **many** more environment variables to tweak [suricata.yaml](https://github.com/OISF/suricata/blob/master/suricata.yaml.in); in that script, `DEFAULT_VARS` defines those variables (albeit without the `SURICATA_` prefix you must add to each for use) +* `TOTAL_MEGABYTES_SEVERITY_THRESHOLD` - when [severity scoring](severity.md#Severity) is enabled, this variable indicates the size threshold (in megabytes) for assigning severity to large connections or file transfers (default `1000`) +* `VTOT_API2_KEY` – used to specify a [VirusTotal Public API v2.0](https://www.virustotal.com/en/documentation/public-api/) key, which, if specified, will be used to submit hashes of [Zeek-extracted files](file-scanning.md#ZeekFileExtraction) to VirusTotal +* `ZEEK_AUTO_ANALYZE_PCAP_FILES` – if set to `true`, all PCAP files imported into Malcolm will automatically be analyzed by Zeek, and the resulting logs will also be imported (default `false`) +* `ZEEK_AUTO_ANALYZE_PCAP_THREADS` – the number of threads available to Malcolm for analyzing Zeek logs (default `1`) +* `ZEEK_DISABLE_…` - if set to any non-blank value, each of these variables can be used to disable a certain Zeek function when it analyzes PCAP files (for example, setting `ZEEK_DISABLE_LOG_PASSWORDS` to `true` to disable logging of cleartext 
passwords) +* `ZEEK_DISABLE_BEST_GUESS_ICS` - see ["Best Guess" Fingerprinting for ICS Protocols](ics-best-guess.md#ICSBestGuess) +* `ZEEK_EXTRACTOR_MODE` – determines the file extraction behavior for file transfers detected by Zeek; see [Automatic file extraction and scanning](file-scanning.md#ZeekFileExtraction) for more details +* `ZEEK_INTEL_FEED_SINCE` - when querying a [TAXII](zeek-intel.md#ZeekIntelSTIX) or [MISP](zeek-intel.md#ZeekIntelMISP) feed, only process threat indicators that have been created or modified since the time represented by this value; it may be either a fixed date/time (`01/01/2021`) or relative interval (`30 days ago`) +* `ZEEK_INTEL_ITEM_EXPIRATION` - specifies the value for Zeek's [`Intel::item_expiration`](https://docs.zeek.org/en/current/scripts/base/frameworks/intel/main.zeek.html#id-Intel::item_expiration) timeout as used by the [Zeek Intelligence Framework](zeek-intel.md#ZeekIntel) (default `-1min`, which disables item expiration) +* `ZEEK_INTEL_REFRESH_CRON_EXPRESSION` - specifies a [cron expression](https://en.wikipedia.org/wiki/Cron#CRON_expression) indicating the refresh interval for generating the [Zeek Intelligence Framework](zeek-intel.md#ZeekIntel) files (defaults to empty, which disables automatic refresh) +* `ZEEK_LIVE_CAPTURE` - if set to `true`, Zeek will monitor live traffic on the local interface(s) defined by `PCAP_IFACE` +* `ZEEK_ROTATED_PCAP` - if set to `true`, Zeek can analyze PCAP files captured by `netsniff-ng` or `tcpdump` (see `PCAP_ENABLE_NETSNIFF` and `PCAP_ENABLE_TCPDUMP`, as well as `ZEEK_AUTO_ANALYZE_PCAP_FILES`); if `ZEEK_LIVE_CAPTURE` is `true`, this should be `false`, otherwise Zeek will see duplicate traffic \ No newline at end of file diff --git a/docs/malcolm-iso.md b/docs/malcolm-iso.md index 0a0136930..061e28a60 100644 --- a/docs/malcolm-iso.md +++ b/docs/malcolm-iso.md @@ -39,7 +39,7 @@ Finished, created "/malcolm-build/malcolm-iso/malcolm-6.4.0.iso" … ``` -By default, Malcolm's Docker 
images are not packaged with the installer ISO, assuming instead that you will pull the [latest images](https://hub.docker.com/u/malcolmnetsec) with a `docker-compose pull` command as described in the [Quick start](quickstart.md#QuickStart) section. If you wish to build an ISO with the latest Malcolm images included, follow the directions to create [pre-packaged installation files](#Packager), which include a tarball with a name like `malcolm_YYYYMMDD_HHNNSS_xxxxxxx_images.tar.gz`. Then, pass that images tarball to the ISO build script with a `-d`, like this: +By default, Malcolm's Docker images are not packaged with the installer ISO, assuming instead that you will pull the [latest images](https://hub.docker.com/u/malcolmnetsec) with a `docker-compose pull` command as described in the [Quick start](quickstart.md#QuickStart) section. If you wish to build an ISO with the latest Malcolm images included, follow the directions to create [pre-packaged installation files](development.md#Packager), which include a tarball with a name like `malcolm_YYYYMMDD_HHNNSS_xxxxxxx_images.tar.gz`. Then, pass that images tarball to the ISO build script with a `-d`, like this: ``` $ ./malcolm-iso/build_via_vagrant.sh -f -d malcolm_YYYYMMDD_HHNNSS_xxxxxxx_images.tar.gz @@ -48,7 +48,7 @@ $ ./malcolm-iso/build_via_vagrant.sh -f -d malcolm_YYYYMMDD_HHNNSS_xxxxxxx_image A system installed from the resulting ISO will load the Malcolm Docker images upon first boot. This method is desirable when the ISO is to be installed in an "air gapped" environment or for distribution to non-networked machines. -Alternately, if you have forked Malcolm on GitHub, [workflow files](./.github/workflows/) are provided which contain instructions for GitHub to build the docker images and [sensor](#Hedgehog) and [Malcolm](#ISO) installer ISOs, specifically [`malcolm-iso-build-docker-wrap-push-ghcr.yml`](./.github/workflows/malcolm-iso-build-docker-wrap-push-ghcr.yml) for the Malcolm ISO. 
You'll need to run the workflows to build and push your fork's Malcolm docker images before building the ISO. The resulting ISO file is wrapped in a Docker image that provides an HTTP server from which the ISO may be downloaded. +Alternately, if you have forked Malcolm on GitHub, [workflow files](./.github/workflows/) are provided which contain instructions for GitHub to build the docker images and [sensor](live-analysis.md#Hedgehog) and [Malcolm](malcolm-iso.md#ISO) installer ISOs, specifically [`malcolm-iso-build-docker-wrap-push-ghcr.yml`](./.github/workflows/malcolm-iso-build-docker-wrap-push-ghcr.yml) for the Malcolm ISO. You'll need to run the workflows to build and push your fork's Malcolm docker images before building the ISO. The resulting ISO file is wrapped in a Docker image that provides an HTTP server from which the ISO may be downloaded. ### Installation @@ -76,6 +76,6 @@ Following these prompts, the installer will reboot and the Malcolm base operatin When the system boots for the first time, the Malcolm Docker images will load if the installer was built with pre-packaged installation files as described above. Wait for this operation to continue (the progress dialog will disappear when they have finished loading) before continuing the setup. -Open a terminal (click the red terminal 🗔 icon next to the Debian swirl logo 🍥 menu button in the menu bar). At this point, setup is similar to the steps described in the [Quick start](quickstart.md#QuickStart) section. Navigate to the Malcolm directory (`cd ~/Malcolm`) and run [`auth_setup`](#AuthSetup) to configure authentication. If the ISO didn't have pre-packaged Malcolm images, or if you'd like to retrieve the latest updates, run `docker-compose pull`. Finalize your configuration by running `scripts/install.py --configure` and follow the prompts as illustrated in the [installation example](#InstallationExample). 
+Open a terminal (click the red terminal 🗔 icon next to the Debian swirl logo 🍥 menu button in the menu bar). At this point, setup is similar to the steps described in the [Quick start](quickstart.md#QuickStart) section. Navigate to the Malcolm directory (`cd ~/Malcolm`) and run [`auth_setup`](authsetup.md#AuthSetup) to configure authentication. If the ISO didn't have pre-packaged Malcolm images, or if you'd like to retrieve the latest updates, run `docker-compose pull`. Finalize your configuration by running `scripts/install.py --configure` and follow the prompts as illustrated in the [installation example](ubuntu-install-example.md#InstallationExample). -Once Malcolm is configured, you can [start Malcolm](#Starting) via the command line or by clicking the circular yellow Malcolm icon in the menu bar. \ No newline at end of file +Once Malcolm is configured, you can [start Malcolm](running.md#Starting) via the command line or by clicking the circular yellow Malcolm icon in the menu bar. \ No newline at end of file diff --git a/docs/malcolm-upgrade.md b/docs/malcolm-upgrade.md index b8c7803dd..fa071f8b5 100644 --- a/docs/malcolm-upgrade.md +++ b/docs/malcolm-upgrade.md @@ -6,7 +6,7 @@ At this time there is not an "official" upgrade procedure to get from one versio You may wish to get the official updates for the underlying system's software packages before you proceed. Consult the documentation of your operating system for how to do this. -If you are upgrading an Malcolm instance installed from [Malcolm installation ISO](#ISOInstallation), follow scenario 2 below. Due to the Malcolm base operating system's [hardened](#Hardening) configuration, when updating the underlying system, temporarily set the umask value to Debian default (`umask 0022` in the root shell in which updates are being performed) instead of the more restrictive Malcolm default. This will allow updates to be applied with the right permissions. 
+If you are upgrading a Malcolm instance installed from [Malcolm installation ISO](malcolm-iso.md#ISOInstallation), follow scenario 2 below. Due to the Malcolm base operating system's [hardened](hardening.md#Hardening) configuration, when updating the underlying system, temporarily set the umask value to Debian default (`umask 0022` in the root shell in which updates are being performed) instead of the more restrictive Malcolm default. This will allow updates to be applied with the right permissions. ### Scenario 1: Malcolm is a GitHub clone @@ -23,10 +23,10 @@ If you checked out a working copy of the Malcolm repository from GitHub with a ` 5. apply saved configuration change stashed earlier * `git stash pop` 6. if you see `Merge conflict` messages, resolve the [conflicts](https://git-scm.com/book/en/v2/Git-Branching-Basic-Branching-and-Merging#_basic_merge_conflicts) with your favorite text editor -7. you may wish to re-run `install.py --configure` as described in [System configuration and tuning](#ConfigAndTuning) in case there are any new `docker-compose.yml` parameters for Malcolm that need to be set up +7. you may wish to re-run `install.py --configure` as described in [System configuration and tuning](malcolm-config.md#ConfigAndTuning) in case there are any new `docker-compose.yml` parameters for Malcolm that need to be set up 8. start Malcolm * `./scripts/start` -9. you may be prompted to [configure authentication](#AuthSetup) if there are new authentication-related files that need to be generated +9. 
you may be prompted to [configure authentication](authsetup.md#AuthSetup) if there are new authentication-related files that need to be generated * you probably do not need to re-generate self-signed certificates ### Scenario 2: Malcolm was installed from a packaged tarball @@ -45,13 +45,13 @@ If you installed Malcolm from [pre-packaged installation files](https://github.c * `cp -r ./malcolm_YYYYMMDD_HHNNSS_xxxxxxx/scripts ./malcolm_YYYYMMDD_HHNNSS_xxxxxxx/README.md ./` 4. replace (overwrite) `docker-compose.yml` file with new version * `cp ./malcolm_YYYYMMDD_HHNNSS_xxxxxxx/docker-compose.yml ./docker-compose.yml` -5. re-run `./scripts/install.py --configure` as described in [System configuration and tuning](#ConfigAndTuning) -6. using a file comparison tool (e.g., `diff`, `meld`, `Beyond Compare`, etc.), compare `docker-compose.yml` and the `docker-compare.yml` file you backed up in step 3, and manually migrate over any customizations you wish to preserve from that file (e.g., `PCAP_FILTER`, `MAXMIND_GEOIP_DB_LICENSE_KEY`, `MANAGE_PCAP_FILES`; [anything else](#DockerComposeYml) you may have edited by hand in `docker-compose.yml` that's not prompted for in `install.py --configure`) +5. re-run `./scripts/install.py --configure` as described in [System configuration and tuning](malcolm-config.md#ConfigAndTuning) +6. using a file comparison tool (e.g., `diff`, `meld`, `Beyond Compare`, etc.), compare the new `docker-compose.yml` and the `docker-compose.yml` file you backed up in step 3, and manually migrate over any customizations you wish to preserve from that file (e.g., `PCAP_FILTER`, `MAXMIND_GEOIP_DB_LICENSE_KEY`, `MANAGE_PCAP_FILES`; [anything else](malcolm-config.md#DockerComposeYml) you may have edited by hand in `docker-compose.yml` that's not prompted for in `install.py --configure`) 7. 
pull the new docker images (this will take a while) * `docker-compose pull` to pull them from Docker Hub or `docker-compose load -i malcolm_YYYYMMDD_HHNNSS_xxxxxxx_images.tar.gz` if you have an offline tarball of the Malcolm docker images 8. start Malcolm * `./scripts/start` -9. you may be prompted to [configure authentication](#AuthSetup) if there are new authentication-related files that need to be generated +9. you may be prompted to [configure authentication](authsetup.md#AuthSetup) if there are new authentication-related files that need to be generated * you probably do not need to re-generate self-signed certificates ### Post-upgrade @@ -62,7 +62,7 @@ If you are technically-minded, you may wish to follow the debug output provided Running `docker-compose ps -a` should give you a good idea if all of Malcolm's Docker containers started up and, in some cases, may be able to indicate if the containers are "healthy" or not. -After upgrading following one of the previous outlines, give Malcolm several minutes to get started. Once things are up and running, open one of Malcolm's [web interfaces](#UserInterfaceURLs) to verify that things are working. +After upgrading following one of the previous outlines, give Malcolm several minutes to get started. Once things are up and running, open one of Malcolm's [web interfaces](quickstart.md#UserInterfaceURLs) to verify that things are working. #### Loading new OpenSearch Dashboards visualizations diff --git a/docs/netbox.md b/docs/netbox.md index 43b45b6c1..5694ec3a4 100644 --- a/docs/netbox.md +++ b/docs/netbox.md @@ -2,6 +2,6 @@ Malcolm provides an instance of [NetBox](https://netbox.dev/), an open-source "solution for modeling and documenting modern networks." The NetBox web interface is available at at [https://localhost/netbox/](https://localhost/netbox/) if you are connecting locally. -The design of a potentially deeper integration between Malcolm and Netbox is a work in progress. 
The purpose of an asset management system is to document the intended state of a network: were Malcolm to actively and agressively populate NetBox with the live network state, a network configuration fault could result in an incorrect documented configuration. The Malcolm development team is investigating what data, if any, should automatically flow to NetBox based on traffic observed (enabled via the `NETBOX_CRON` [environment variable in `docker-compose.yml`](#DockerComposeYml)), and what NetBox inventory data could be used, if any, to enrich Malcolm's network traffic metadata. Well-considered suggestions in this area [are welcome](mailto:malcolm@inl.gov?subject=NetBox). +The design of a potentially deeper integration between Malcolm and NetBox is a work in progress. The purpose of an asset management system is to document the intended state of a network: were Malcolm to actively and aggressively populate NetBox with the live network state, a network configuration fault could result in an incorrect documented configuration. The Malcolm development team is investigating what data, if any, should automatically flow to NetBox based on traffic observed (enabled via the `NETBOX_CRON` [environment variable in `docker-compose.yml`](malcolm-config.md#DockerComposeYml)), and what NetBox inventory data could be used, if any, to enrich Malcolm's network traffic metadata. Well-considered suggestions in this area [are welcome](mailto:malcolm@inl.gov?subject=NetBox). Please see the [NetBox page on GitHub](https://github.com/netbox-community/netbox), its [documentation](https://docs.netbox.dev/en/stable/) and its [public demo](https://demo.netbox.dev/) for more information. 
\ No newline at end of file diff --git a/docs/opensearch-instances.md b/docs/opensearch-instances.md index ffcb57b68..e8e32a2b5 100644 --- a/docs/opensearch-instances.md +++ b/docs/opensearch-instances.md @@ -4,7 +4,7 @@ Malcolm's default standalone configuration is to use a local [OpenSearch](https: As the permutations of OpenSearch cluster configurations are numerous, it is beyond Malcolm's scope to set up multi-node clusters. However, Malcolm can be configured to use a remote OpenSearch cluster rather than its own internal instance. -The `OPENSEARCH_…` [environment variables in `docker-compose.yml`](#DockerComposeYml) control whether Malcolm uses its own local OpenSearch instance or a remote OpenSearch instance as its primary data store. The configuration portion of Malcolm install script ([`./scripts/install.py --configure`](#ConfigAndTuning)) can help you configure these options. +The `OPENSEARCH_…` [environment variables in `docker-compose.yml`](malcolm-config.md#DockerComposeYml) control whether Malcolm uses its own local OpenSearch instance or a remote OpenSearch instance as its primary data store. The configuration portion of Malcolm install script ([`./scripts/install.py --configure`](malcolm-config.md#ConfigAndTuning)) can help you configure these options. For example, to use the default standalone configuration, answer `Y` when prompted `Should Malcolm use and maintain its own OpenSearch instance?`. @@ -22,7 +22,7 @@ You must run auth_setup after install.py to store OpenSearch connection credenti … ``` -Whether the primary OpenSearch instance is a locally maintained single-node instance or is a remote cluster, Malcolm can be configured additionally forward logs to a secondary remote OpenSearch instance. The `OPENSEARCH_SECONDARY_…` [environment variables in `docker-compose.yml`](#DockerComposeYml) control this behavior. 
Configuration of a remote secondary OpenSearch instance is similar to that of a remote primary OpenSearch instance: +Whether the primary OpenSearch instance is a locally maintained single-node instance or is a remote cluster, Malcolm can be configured to additionally forward logs to a secondary remote OpenSearch instance. The `OPENSEARCH_SECONDARY_…` [environment variables in `docker-compose.yml`](malcolm-config.md#DockerComposeYml) control this behavior. Configuration of a remote secondary OpenSearch instance is similar to that of a remote primary OpenSearch instance: ``` @@ -39,7 +39,7 @@ You must run auth_setup after install.py to store OpenSearch connection credenti #### Authentication and authorization for remote OpenSearch clusters -In addition to setting the environment variables in [`docker-compose.yml`](#DockerComposeYml) as described above, you must provide Malcolm with credentials for it to be able to communicate with remote OpenSearch instances. These credentials are stored in the Malcolm installation directory as `.opensearch.primary.curlrc` and `.opensearch.secondary.curlrc` for the primary and secondary OpenSearch connections, respectively, and are bind mounted into the Docker containers which need to communicate with OpenSearch. These [cURL-formatted](https://everything.curl.dev/cmdline/configfile) config files can be generated for you by the [`auth_setup`](#AuthSetup) script as illustrated: +In addition to setting the environment variables in [`docker-compose.yml`](malcolm-config.md#DockerComposeYml) as described above, you must provide Malcolm with credentials for it to be able to communicate with remote OpenSearch instances. These credentials are stored in the Malcolm installation directory as `.opensearch.primary.curlrc` and `.opensearch.secondary.curlrc` for the primary and secondary OpenSearch connections, respectively, and are bind mounted into the Docker containers which need to communicate with OpenSearch. 
These [cURL-formatted](https://everything.curl.dev/cmdline/configfile) config files can be generated for you by the [`auth_setup`](authsetup.md#AuthSetup) script as illustrated: ``` $ ./scripts/auth_setup @@ -73,6 +73,6 @@ $ ls -la .opensearch.*.curlrc -rw------- 1 user user 35 Aug 22 14:18 .opensearch.secondary.curlrc -One caveat with Malcolm using a remote OpenSearch cluster as its primary document store is that the accounts used to access Malcolm's [web interfaces](#UserInterfaceURLs), particularly [OpenSearch Dashboards](#Dashboards), are in some instance passed directly through to OpenSearch itself. For this reason, both Malcolm and the remote primary OpenSearch instance must have the same account information. The easiest way to accomplish this is to use an Active Directory/LDAP server that both [Malcolm](#AuthLDAP) and [OpenSearch](https://opensearch.org/docs/latest/security-plugin/configuration/ldap/) use as a common authentication backend. +One caveat with Malcolm using a remote OpenSearch cluster as its primary document store is that the accounts used to access Malcolm's [web interfaces](quickstart.md#UserInterfaceURLs), particularly [OpenSearch Dashboards](dashboards.md#Dashboards), are in some instances passed directly through to OpenSearch itself. For this reason, both Malcolm and the remote primary OpenSearch instance must have the same account information. The easiest way to accomplish this is to use an Active Directory/LDAP server that both [Malcolm](authsetup.md#AuthLDAP) and [OpenSearch](https://opensearch.org/docs/latest/security-plugin/configuration/ldap/) use as a common authentication backend. See the OpenSearch documentation on [access control](https://opensearch.org/docs/latest/security-plugin/access-control/index/) for more information. 
\ No newline at end of file diff --git a/docs/preparation.md b/docs/preparation.md index 4180d587b..d576dfe3d 100644 --- a/docs/preparation.md +++ b/docs/preparation.md @@ -1,233 +1 @@ ## Preparing your system - -### Recommended system requirements - -Malcolm runs on top of [Docker](https://www.docker.com/) which runs on recent releases of Linux, Apple macOS and Microsoft Windows 10. - -To quote the [Elasticsearch documentation](https://www.elastic.co/guide/en/elasticsearch/guide/current/hardware.html), "If there is one resource that you will run out of first, it will likely be memory." The same is true for Malcolm: you will want at least 16 gigabytes of RAM to run Malcolm comfortably. For processing large volumes of traffic, I'd recommend at a bare minimum a dedicated server with 16 cores and 16 gigabytes of RAM. Malcolm can run on less, but more is better. You're going to want as much hard drive space as possible, of course, as the amount of PCAP data you're able to analyze and store will be limited by your hard drive. - -Arkime's wiki has a couple of documents ([here](https://github.com/arkime/arkime#hardware-requirements) and [here](https://github.com/arkime/arkime/wiki/FAQ#what-kind-of-capture-machines-should-we-buy) and [here](https://github.com/arkime/arkime/wiki/FAQ#how-many-elasticsearch-nodes-or-machines-do-i-need) and a [calculator here](https://molo.ch/#estimators)) which may be helpful, although not everything in those documents will apply to a Docker-based setup like Malcolm. - -### System configuration and tuning - -If you already have Docker and Docker Compose installed, the `install.py` script can still help you tune system configuration and `docker-compose.yml` parameters for Malcolm. 
To run it in "configuration only" mode, bypassing the steps to install Docker and Docker Compose, run it like this: -``` -./scripts/install.py --configure -``` - -Although `install.py` will attempt to automate many of the following configuration and tuning parameters, they are nonetheless listed in the following sections for reference: - -#### `docker-compose.yml` parameters - -Edit `docker-compose.yml` and search for the `OPENSEARCH_JAVA_OPTS` key. Edit the `-Xms4g -Xmx4g` values, replacing `4g` with a number that is half of your total system memory, or just under 32 gigabytes, whichever is less. So, for example, if I had 64 gigabytes of memory I would edit those values to be `-Xms31g -Xmx31g`. This indicates how much memory can be allocated to the OpenSearch heaps. For a pleasant experience, I would suggest not using a value under 10 gigabytes. Similar values can be modified for Logstash with `LS_JAVA_OPTS`, where using 3 or 4 gigabytes is recommended. - -Various other environment variables inside of `docker-compose.yml` can be tweaked to control aspects of how Malcolm behaves, particularly with regards to processing PCAP files and Zeek logs. 
The environment variables of particular interest are located near the top of that file under **Commonly tweaked configuration options**, which include: - -* `ARKIME_ANALYZE_PCAP_THREADS` – the number of threads available to Arkime for analyzing PCAP files (default `1`) -* `AUTO_TAG` – if set to `true`, Malcolm will automatically create Arkime sessions and Zeek logs with tags based on the filename, as described in [Tagging](#Tagging) (default `true`) -* `BEATS_SSL` – if set to `true`, Logstash will require encrypted communications for any external [Beats](https://www.elastic.co/guide/en/logstash/current/plugins-inputs-beats.html)-based forwarders from which it will accept logs (default `true`) -* `CONNECTION_SECONDS_SEVERITY_THRESHOLD` - when [severity scoring](#Severity) is enabled, this variable indicates the duration threshold (in seconds) for assigning severity to long connections (default `3600`) -* `EXTRACTED_FILE_CAPA_VERBOSE` – if set to `true`, all Capa rule hits will be logged; otherwise (`false`) only [MITRE ATT&CK® technique](https://attack.mitre.org/techniques) classifications will be logged -* `EXTRACTED_FILE_ENABLE_CAPA` – if set to `true`, [Zeek-extracted files](#ZeekFileExtraction) that are determined to be PE (portable executable) files will be scanned with [Capa](https://github.com/fireeye/capa) -* `EXTRACTED_FILE_ENABLE_CLAMAV` – if set to `true`, [Zeek-extracted files](#ZeekFileExtraction) will be scanned with [ClamAV](https://www.clamav.net/) -* `EXTRACTED_FILE_ENABLE_YARA` – if set to `true`, [Zeek-extracted files](#ZeekFileExtraction) will be scanned with [Yara](https://github.com/VirusTotal/yara) -* `EXTRACTED_FILE_HTTP_SERVER_ENABLE` – if set to `true`, the directory containing [Zeek-extracted files](#ZeekFileExtraction) will be served over HTTP at `./extracted-files/` (e.g., [https://localhost/extracted-files/](https://localhost/extracted-files/) if you are connecting locally) -* `EXTRACTED_FILE_HTTP_SERVER_ENCRYPT` – if set to `true`, 
those Zeek-extracted files will be AES-256-CBC-encrypted in an `openssl enc`-compatible format (e.g., `openssl enc -aes-256-cbc -d -in example.exe.encrypted -out example.exe`) -* `EXTRACTED_FILE_HTTP_SERVER_KEY` – specifies the AES-256-CBC decryption password for encrypted Zeek-extracted files; used in conjunction with `EXTRACTED_FILE_HTTP_SERVER_ENCRYPT` -* `EXTRACTED_FILE_IGNORE_EXISTING` – if set to `true`, files extant in `./zeek-logs/extract_files/` directory will be ignored on startup rather than scanned -* `EXTRACTED_FILE_PRESERVATION` – determines behavior for preservation of [Zeek-extracted files](#ZeekFileExtraction) -* `EXTRACTED_FILE_UPDATE_RULES` – if set to `true`, file scanner engines (e.g., ClamAV, Capa, Yara) will periodically update their rule definitions -* `EXTRACTED_FILE_YARA_CUSTOM_ONLY` – if set to `true`, Malcolm will bypass the default [Yara ruleset](https://github.com/Neo23x0/signature-base) and use only user-defined rules in `./yara/rules` -* `FREQ_LOOKUP` - if set to `true`, domain names (from DNS queries and SSL server names) will be assigned entropy scores as calculated by [`freq`](https://github.com/MarkBaggett/freq) (default `false`) -* `FREQ_SEVERITY_THRESHOLD` - when [severity scoring](#Severity) is enabled, this variable indicates the entropy threshold for assigning severity to events with entropy scores calculated by [`freq`](https://github.com/MarkBaggett/freq); a lower value will only assign severity scores to fewer domain names with higher entropy (e.g., `2.0` for `NQZHTFHRMYMTVBQJE.COM`), while a higher value will assign severity scores to more domain names with lower entropy (e.g., `7.5` for `naturallanguagedomain.example.org`) (default `2.0`) -* `LOGSTASH_OUI_LOOKUP` – if set to `true`, Logstash will map MAC addresses to vendors for all source and destination MAC addresses when analyzing Zeek logs (default `true`) -* `LOGSTASH_REVERSE_DNS` – if set to `true`, Logstash will perform a reverse DNS lookup for all external 
source and destination IP address values when analyzing Zeek logs (default `false`) -* `LOGSTASH_SEVERITY_SCORING` - if set to `true`, Logstash will perform [severity scoring](#Severity) when analyzing Zeek logs (default `true`) -* `MANAGE_PCAP_FILES` – if set to `true`, all PCAP files imported into Malcolm will be marked as available for deletion by Arkime if available storage space becomes too low (default `false`) -* `MAXMIND_GEOIP_DB_LICENSE_KEY` - Malcolm uses MaxMind's free GeoLite2 databases for GeoIP lookups. As of December 30, 2019, these databases are [no longer available](https://blog.maxmind.com/2019/12/18/significant-changes-to-accessing-and-using-geolite2-databases/) for download via a public URL. Instead, they must be downloaded using a MaxMind license key (available without charge [from MaxMind](https://www.maxmind.com/en/geolite2/signup)). The license key can be specified here for GeoIP database downloads during build- and run-time. -* `OPENSEARCH_LOCAL` - if set to `true`, Malcolm will use its own internal [OpenSearch instance](#OpenSearchInstance) (default `true`) -* `OPENSEARCH_URL` - when using Malcolm's internal OpenSearch instance (i.e., `OPENSEARCH_LOCAL` is `true`) this should be `http://opensearch:9200`, otherwise this value specifies the primary remote instance URL in the format `protocol://host:port` (default `http://opensearch:9200`) -* `OPENSEARCH_SSL_CERTIFICATE_VERIFICATION` - if set to `true`, connections to the primary remote OpenSearch instance will require full TLS certificate validation (this may fail if using self-signed certificates) (default `false`) -* `OPENSEARCH_SECONDARY` - if set to `true`, Malcolm will forward logs to a secondary remote OpenSearch instance in addition to the primary (local or remote) OpenSearch instance (default `false`) -* `OPENSEARCH_SECONDARY_URL` - when forwarding to a secondary remote OpenSearch instance (i.e., `OPENSEARCH_SECONDARY` is `true`) this value specifies the secondary remote instance URL 
in the format `protocol://host:port` -* `OPENSEARCH_SECONDARY_SSL_CERTIFICATE_VERIFICATION` - if set to `true`, connections to the secondary remote OpenSearch instance will require full TLS certificate validation (this may fail if using self-signed certificates) (default `false`) -* `NETBOX_DISABLED` - if set to `true`, Malcolm will **not** start and manage a [NetBox](#NetBox) instance (default `true`) -* `NETBOX_CRON` - if set to `true`, network traffic metadata will periodically be queried and used to populate Malcolm's [NetBox](#NetBox) instance -* `NGINX_BASIC_AUTH` - if set to `true`, use [TLS-encrypted HTTP basic](#AuthBasicAccountManagement) authentication (default); if set to `false`, use [Lightweight Directory Access Protocol (LDAP)](#AuthLDAP) authentication -* `NGINX_LOG_ACCESS_AND_ERRORS` - if set to `true`, all access to Malcolm via its [web interfaces](#UserInterfaceURLs) will be logged to OpenSearch (default `false`) -* `NGINX_SSL` - if set to `true`, require HTTPS connections to Malcolm's `nginx-proxy` container (default); if set to `false`, use unencrypted HTTP connections (using unsecured HTTP connections is **NOT** recommended unless you are running Malcolm behind another reverse proxy like Traefik, Caddy, etc.) 
-* `PCAP_ENABLE_NETSNIFF` – if set to `true`, Malcolm will capture network traffic on the local network interface(s) indicated in `PCAP_IFACE` using [netsniff-ng](http://netsniff-ng.org/) -* `PCAP_ENABLE_TCPDUMP` – if set to `true`, Malcolm will capture network traffic on the local network interface(s) indicated in `PCAP_IFACE` using [tcpdump](https://www.tcpdump.org/); there is no reason to enable *both* `PCAP_ENABLE_NETSNIFF` and `PCAP_ENABLE_TCPDUMP` -* `PCAP_FILTER` – specifies a tcpdump-style filter expression for local packet capture; leave blank to capture all traffic -* `PCAP_IFACE` – used to specify the network interface(s) for local packet capture if `PCAP_ENABLE_NETSNIFF`, `PCAP_ENABLE_TCPDUMP`, `ZEEK_LIVE_CAPTURE` or `SURICATA_LIVE_CAPTURE` are enabled; for multiple interfaces, separate the interface names with a comma (e.g., `'enp0s25'` or `'enp10s0,enp11s0'`) -* `PCAP_IFACE_TWEAK` - if set to `true`, Malcolm will [use `ethtool`](shared/bin/nic-capture-setup.sh) to disable NIC hardware offloading features and adjust ring buffer sizes for capture interface(s); this should be `true` if the interface(s) are being used for capture only, `false` if they are being used for management/communication -* `PCAP_ROTATE_MEGABYTES` – used to specify how large a locally-captured PCAP file can become (in megabytes) before it is closed for processing and a new PCAP file created -* `PCAP_ROTATE_MINUTES` – used to specify a time interval (in minutes) after which a locally-captured PCAP file will be closed for processing and a new PCAP file created -* `pipeline.workers`, `pipeline.batch.size` and `pipeline.batch.delay` - these settings are used to tune the performance and resource utilization of the the `logstash` container; see [Tuning and Profiling Logstash Performance](https://www.elastic.co/guide/en/logstash/current/tuning-logstash.html), [`logstash.yml`](https://www.elastic.co/guide/en/logstash/current/logstash-settings-file.html) and [Multiple 
Pipelines](https://www.elastic.co/guide/en/logstash/current/multiple-pipelines.html) -* `PUID` and `PGID` - Docker runs all of its containers as the privileged `root` user by default. For better security, Malcolm immediately drops to non-privileged user accounts for executing internal processes wherever possible. The `PUID` (**p**rocess **u**ser **ID**) and `PGID` (**p**rocess **g**roup **ID**) environment variables allow Malcolm to map internal non-privileged user accounts to a corresponding [user account](https://en.wikipedia.org/wiki/User_identifier) on the host. -* `SENSITIVE_COUNTRY_CODES` - when [severity scoring](#Severity) is enabled, this variable defines a comma-separated list of sensitive countries (using [ISO 3166-1 alpha-2 codes](https://en.wikipedia.org/wiki/ISO_3166-1_alpha-2#Current_codes)) (default `'AM,AZ,BY,CN,CU,DZ,GE,HK,IL,IN,IQ,IR,KG,KP,KZ,LY,MD,MO,PK,RU,SD,SS,SY,TJ,TM,TW,UA,UZ'`, taken from the U.S. Department of Energy Sensitive Country List) -* `SURICATA_AUTO_ANALYZE_PCAP_FILES` – if set to `true`, all PCAP files imported into Malcolm will automatically be analyzed by Suricata, and the resulting logs will also be imported (default `false`) -* `SURICATA_AUTO_ANALYZE_PCAP_THREADS` – the number of threads available to Malcolm for analyzing Suricata logs (default `1`) -* `SURICATA_CUSTOM_RULES_ONLY` – if set to `true`, Malcolm will bypass the default [Suricata ruleset](https://github.com/OISF/suricata/tree/master/rules) and use only user-defined rules (`./suricata/rules/*.rules`). 
-* `SURICATA_UPDATE_RULES` – if set to `true`, Suricata signatures will periodically be updated (default `false`) -* `SURICATA_LIVE_CAPTURE` - if set to `true`, Suricata will monitor live traffic on the local interface(s) defined by `PCAP_FILTER` -* `SURICATA_ROTATED_PCAP` - if set to `true`, Suricata can analyze PCAP files captured by `netsniff-ng` or `tcpdump` (see `PCAP_ENABLE_NETSNIFF` and `PCAP_ENABLE_TCPDUMP`, as well as `SURICATA_AUTO_ANALYZE_PCAP_FILES`); if `SURICATA_LIVE_CAPTURE` is `true`, this should be `false`, otherwise Suricata will see duplicate traffic -* `SURICATA_…` - the [`suricata` container entrypoint script](shared/bin/suricata_config_populate.py) can use **many** more environment variables to tweak [suricata.yaml](https://github.com/OISF/suricata/blob/master/suricata.yaml.in); in that script, `DEFAULT_VARS` defines those variables (albeit without the `SURICATA_` prefix you must add to each for use) -* `TOTAL_MEGABYTES_SEVERITY_THRESHOLD` - when [severity scoring](#Severity) is enabled, this variable indicates the size threshold (in megabytes) for assigning severity to large connections or file transfers (default `1000`) -* `VTOT_API2_KEY` – used to specify a [VirusTotal Public API v2.0](https://www.virustotal.com/en/documentation/public-api/) key, which, if specified, will be used to submit hashes of [Zeek-extracted files](#ZeekFileExtraction) to VirusTotal -* `ZEEK_AUTO_ANALYZE_PCAP_FILES` – if set to `true`, all PCAP files imported into Malcolm will automatically be analyzed by Zeek, and the resulting logs will also be imported (default `false`) -* `ZEEK_AUTO_ANALYZE_PCAP_THREADS` – the number of threads available to Malcolm for analyzing Zeek logs (default `1`) -* `ZEEK_DISABLE_…` - if set to any non-blank value, each of these variables can be used to disable a certain Zeek function when it analyzes PCAP files (for example, setting `ZEEK_DISABLE_LOG_PASSWORDS` to `true` to disable logging of cleartext passwords) -* 
`ZEEK_DISABLE_BEST_GUESS_ICS` - see ["Best Guess" Fingerprinting for ICS Protocols](#ICSBestGuess) -* `ZEEK_EXTRACTOR_MODE` – determines the file extraction behavior for file transfers detected by Zeek; see [Automatic file extraction and scanning](#ZeekFileExtraction) for more details -* `ZEEK_INTEL_FEED_SINCE` - when querying a [TAXII](#ZeekIntelSTIX) or [MISP](#ZeekIntelMISP) feed, only process threat indicators that have been created or modified since the time represented by this value; it may be either a fixed date/time (`01/01/2021`) or relative interval (`30 days ago`) -* `ZEEK_INTEL_ITEM_EXPIRATION` - specifies the value for Zeek's [`Intel::item_expiration`](https://docs.zeek.org/en/current/scripts/base/frameworks/intel/main.zeek.html#id-Intel::item_expiration) timeout as used by the [Zeek Intelligence Framework](#ZeekIntel) (default `-1min`, which disables item expiration) -* `ZEEK_INTEL_REFRESH_CRON_EXPRESSION` - specifies a [cron expression](https://en.wikipedia.org/wiki/Cron#CRON_expression) indicating the refresh interval for generating the [Zeek Intelligence Framework](#ZeekIntel) files (defaults to empty, which disables automatic refresh) -* `ZEEK_LIVE_CAPTURE` - if set to `true`, Zeek will monitor live traffic on the local interface(s) defined by `PCAP_FILTER` -* `ZEEK_ROTATED_PCAP` - if set to `true`, Zeek can analyze PCAP files captured by `netsniff-ng` or `tcpdump` (see `PCAP_ENABLE_NETSNIFF` and `PCAP_ENABLE_TCPDUMP`, as well as `ZEEK_AUTO_ANALYZE_PCAP_FILES`); if `ZEEK_LIVE_CAPTURE` is `true`, this should be `false`, otherwise Zeek will see duplicate traffic - -#### Linux host system configuration - -##### Installing Docker - -Docker installation instructions vary slightly by distribution. 
Please follow the links below to docker.com to find the instructions specific to your distribution: - -* [Ubuntu](https://docs.docker.com/install/linux/docker-ce/ubuntu/) -* [Debian](https://docs.docker.com/install/linux/docker-ce/debian/) -* [Fedora](https://docs.docker.com/install/linux/docker-ce/fedora/) -* [CentOS](https://docs.docker.com/install/linux/docker-ce/centos/) -* [Binaries](https://docs.docker.com/install/linux/docker-ce/binaries/) - -After installing Docker, because Malcolm should be run as a non-root user, add your user to the `docker` group with something like: -``` -$ sudo usermod -aG docker yourusername -``` - -Following this, either reboot or log out then log back in. - -Docker starts automatically on DEB-based distributions. On RPM-based distributions, you need to start it manually or enable it using the appropriate `systemctl` or `service` command(s). - -You can test docker by running `docker info`, or (assuming you have internet access), `docker run --rm hello-world`. - -##### Installing docker-compose - -Please follow [this link](https://docs.docker.com/compose/install/) on docker.com for instructions on installing docker-compose. - -##### Operating system configuration - -The host system (i.e., the one running Docker) will need to be configured for the [best possible OpenSearch performance](https://www.elastic.co/guide/en/elasticsearch/reference/master/system-config.html). Here are a few suggestions for Linux hosts (these may vary from distribution to distribution): - -* Append the following lines to `/etc/sysctl.conf`: - -``` -# the maximum number of open file handles -fs.file-max=2097152 - -# increase maximums for inotify watches -fs.inotify.max_user_watches=131072 -fs.inotify.max_queued_events=131072 -fs.inotify.max_user_instances=512 - -# the maximum number of memory map areas a process may have -vm.max_map_count=262144 - -# decrease "swappiness" (swapping out runtime memory vs. 
dropping pages) -vm.swappiness=1 - -# the maximum number of incoming connections -net.core.somaxconn=65535 - -# the % of system memory fillable with "dirty" pages before flushing -vm.dirty_background_ratio=40 - -# maximum % of dirty system memory before committing everything -vm.dirty_ratio=80 -``` - -* Depending on your distribution, create **either** the file `/etc/security/limits.d/limits.conf` containing: - -``` -# the maximum number of open file handles -* soft nofile 65535 -* hard nofile 65535 -# do not limit the size of memory that can be locked -* soft memlock unlimited -* hard memlock unlimited -``` - -**OR** the file `/etc/systemd/system.conf.d/limits.conf` containing: - -``` -[Manager] -# the maximum number of open file handles -DefaultLimitNOFILE=65535:65535 -# do not limit the size of memory that can be locked -DefaultLimitMEMLOCK=infinity -``` - -* Change the readahead value for the disk where the OpenSearch data will be stored. There are a few ways to do this. For example, you could add this line to `/etc/rc.local` (replacing `/dev/sda` with your disk block descriptor): - -``` -# change disk read-ahead value (# of blocks) -blockdev --setra 512 /dev/sda -``` - -* Change the I/O scheduler to `deadline` or `noop`. Again, this can be done in a variety of ways. The simplest is to add `elevator=deadline` to the arguments in `GRUB_CMDLINE_LINUX` in `/etc/default/grub`, then run `sudo update-grub2` - -* If you are planning on using very large data sets, consider formatting the drive containing the `opensearch` volume as XFS. - -After making all of these changes, do a reboot for good measure! - -#### macOS host system configuration - -##### Automatic installation using `install.py` - -The `install.py` script will attempt to guide you through the installation of Docker and Docker Compose if they are not present. If that works for you, you can skip ahead to **Configure docker daemon option** in this section. 
- -##### Install Homebrew - -The easiest way to install and maintain docker on Mac is using the [Homebrew cask](https://brew.sh). Execute the following in a terminal. - -``` -$ /bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/master/install.sh)" -$ brew install cask -$ brew tap homebrew/cask-versions -``` - -##### Install docker-edge - -``` -$ brew cask install docker-edge -``` -This will install the latest version of docker and docker-compose. It can be upgraded later using `brew` as well: -``` -$ brew cask upgrade --no-quarantine docker-edge -``` -You can now run docker from the Applications folder. - -##### Configure docker daemon option - -Some changes should be made for performance ([this link](http://markshust.com/2018/01/30/performance-tuning-docker-mac) gives a good succinct overview). - -* **Resource allocation** - For a good experience, you likely need at least a quad-core MacBook Pro with 16GB RAM and an SSD. I have run Malcolm on an older 2013 MacBook Pro with 8GB of RAM, but the more the better. Go in your system tray and select **Docker** → **Preferences** → **Advanced**. Set the resources available to docker to at least 4 CPUs and 8GB of RAM (>= 16GB is preferable). - -* **Volume mount performance** - You can speed up performance of volume mounts by removing unused paths from **Docker** → **Preferences** → **File Sharing**. For example, if you're only going to be mounting volumes under your home directory, you could share `/Users` but remove other paths. - -After making these changes, right click on the Docker 🐋 icon in the system tray and select **Restart**. - -#### Windows host system configuration - -#### Installing and configuring Docker Desktop for Windows - -Installing and configuring [Docker to run under Windows](https://docs.docker.com/desktop/windows/wsl/) must be done manually, rather than through the `install.py` script as is done for Linux and macOS. - -1. Be running Windows 10, version 1903 or higher -1. 
Prepare your system and [install WSL](https://docs.microsoft.com/en-us/windows/wsl/install) and a Linux distribution by running `wsl --install -d Debian` in PowerShell as Administrator (these instructions are tested with Debian, but may work with other distributions) -1. Install Docker Desktop for Windows either by downloading the installer from the [official Docker site](https://hub.docker.com/editions/community/docker-ce-desktop-windows) or installing it through [chocolatey](https://chocolatey.org/packages/docker-desktop). -1. Follow the [Docker Desktop WSL 2 backend](https://docs.docker.com/desktop/windows/wsl/) instructions to finish configuration and review best practices -1. Reboot -1. Open the WSL distribution's terminal and run `docker info` to make sure Docker is running - -#### Finish Malcolm's configuration - -Once Docker is installed, configured and running as described in the previous section, run [`./scripts/install.py --configure`](#ConfigAndTuning) to finish configuration of the local Malcolm installation. Malcolm will be controlled and run from within your WSL distribution's terminal environment. \ No newline at end of file diff --git a/docs/protocols.md b/docs/protocols.md index a6899d83e..c02ff690a 100644 --- a/docs/protocols.md +++ b/docs/protocols.md @@ -59,6 +59,6 @@ As part of its network traffic analysis, Zeek can extract and analyze files tran * [Portable executable](https://docs.zeek.org/en/stable/scripts/base/files/pe/main.zeek.html#type-PE::Info) files * [X.509](https://docs.zeek.org/en/stable/scripts/base/files/x509/main.zeek.html#type-X509::Info) certificates -See [automatic file extraction and scanning](#ZeekFileExtraction) for additional features related to file scanning. +See [automatic file extraction and scanning](file-scanning.md#ZeekFileExtraction) for additional features related to file scanning. 
-See [Zeek log integration](#ArkimeZeek) for more information on how Malcolm integrates [Arkime sessions and Zeek logs](#ZeekArkimeFlowCorrelation) for analysis. \ No newline at end of file +See [Zeek log integration](arkime.md#ArkimeZeek) for more information on how Malcolm integrates [Arkime sessions and Zeek logs](arkime.md#ZeekArkimeFlowCorrelation) for analysis. \ No newline at end of file diff --git a/docs/queries-cheat-sheet.md b/docs/queries-cheat-sheet.md index eae198894..1dcce2802 100644 --- a/docs/queries-cheat-sheet.md +++ b/docs/queries-cheat-sheet.md @@ -71,4 +71,4 @@ In addition to the fields listed above, Arkime provides several special field al | Protocol (layers >= 4) | `protocols == tls` | `protocol:tls` | | User | `user == EXISTS! && user != anonymous` | `_exists_:user AND (NOT user:anonymous)` | -For details on how to filter both Zeek logs and Arkime session records for a particular connection, see [Correlating Zeek logs and Arkime sessions](#ZeekArkimeFlowCorrelation). \ No newline at end of file +For details on how to filter both Zeek logs and Arkime session records for a particular connection, see [Correlating Zeek logs and Arkime sessions](arkime.md#ZeekArkimeFlowCorrelation). \ No newline at end of file diff --git a/docs/quickstart.md b/docs/quickstart.md index 2216b0d47..612f397f5 100644 --- a/docs/quickstart.md +++ b/docs/quickstart.md @@ -2,9 +2,9 @@ ### Getting Malcolm -For a `TL;DR` example of downloading, configuring, and running Malcolm on a Linux platform, see [Installation example using Ubuntu 22.04 LTS](#InstallationExample). +For a `TL;DR` example of downloading, configuring, and running Malcolm on a Linux platform, see [Installation example using Ubuntu 22.04 LTS](ubuntu-install-example.md#InstallationExample). -The scripts to control Malcolm require Python 3. 
The [`install.py`](#ConfigAndTuning) script requires the [requests](https://docs.python-requests.org/en/latest/) module for Python 3, and will make use of the [pythondialog](https://pythondialog.sourceforge.io/) module for user interaction (on Linux) if it is available. +The scripts to control Malcolm require Python 3. The [`install.py`](malcolm-config.md#ConfigAndTuning) script requires the [requests](https://docs.python-requests.org/en/latest/) module for Python 3, and will make use of the [pythondialog](https://pythondialog.sourceforge.io/) module for user interaction (on Linux) if it is available. #### Source code @@ -12,11 +12,11 @@ The files required to build and run Malcolm are available on its [GitHub page](h #### Building Malcolm from scratch -The `build.sh` script can build Malcolm's Docker images from scratch. See [Building from source](#Build) for more information. +The `build.sh` script can build Malcolm's Docker images from scratch. See [Building from source](development.md#Build) for more information. #### Initial configuration -You must run [`auth_setup`](#AuthSetup) prior to pulling Malcolm's Docker images. You should also ensure your system configuration and `docker-compose.yml` settings are tuned by running `./scripts/install.py` or `./scripts/install.py --configure` (see [System configuration and tuning](#ConfigAndTuning)). +You must run [`auth_setup`](authsetup.md#AuthSetup) prior to pulling Malcolm's Docker images. You should also ensure your system configuration and `docker-compose.yml` settings are tuned by running `./scripts/install.py` or `./scripts/install.py --configure` (see [System configuration and tuning](malcolm-config.md#ConfigAndTuning)). #### Pull Malcolm's Docker images @@ -73,7 +73,7 @@ malcolmnetsec/zeek 6.4.0 x #### Import from pre-packaged tarballs -Once built, the `malcolm_appliance_packager.sh` script can be used to create pre-packaged Malcolm tarballs for import on another machine. 
See [Pre-Packaged Installation Files](#Packager) for more information. +Once built, the `malcolm_appliance_packager.sh` script can be used to create pre-packaged Malcolm tarballs for import on another machine. See [Pre-Packaged Installation Files](development.md#Packager) for more information. ### Starting and stopping Malcolm @@ -86,8 +86,8 @@ A few minutes after starting Malcolm (probably 5 to 10 minutes for Logstash to b * [Arkime](https://arkime.com/): [https://localhost:443](https://localhost:443) * [OpenSearch Dashboards](https://opensearch.org/docs/latest/dashboards/index/): [https://localhost/dashboards/](https://localhost/dashboards/) or [https://localhost:5601](https://localhost:5601) -* [Capture File and Log Archive Upload (Web)](#Upload): [https://localhost/upload/](https://localhost/upload/) -* [Capture File and Log Archive Upload (SFTP)](#Upload): `sftp://@127.0.0.1:8022/files` -* [Host and Subnet Name Mapping](#HostAndSubnetNaming) Editor: [https://localhost/name-map-ui/](https://localhost/name-map-ui/) -* [NetBox](#NetBox): [https://localhost/netbox/](https://localhost/netbox/) -* [Account Management](#AuthBasicAccountManagement): [https://localhost:488](https://localhost:488) \ No newline at end of file +* [Capture File and Log Archive Upload (Web)](upload.md#Upload): [https://localhost/upload/](https://localhost/upload/) +* [Capture File and Log Archive Upload (SFTP)](upload.md#Upload): `sftp://@127.0.0.1:8022/files` +* [Host and Subnet Name Mapping](host-and-subnet-mapping.md#HostAndSubnetNaming) Editor: [https://localhost/name-map-ui/](https://localhost/name-map-ui/) +* [NetBox](netbox.md#NetBox): [https://localhost/netbox/](https://localhost/netbox/) +* [Account Management](authsetup.md#AuthBasicAccountManagement): [https://localhost:488](https://localhost:488) \ No newline at end of file diff --git a/docs/running.md b/docs/running.md index e70aaebc7..7c66f422c 100644 --- a/docs/running.md +++ b/docs/running.md @@ -16,11 +16,11 @@ You can also 
use `docker stats` to monitor the resource utilization of running c You can run `./scripts/stop` to stop the docker containers and remove their virtual network. Alternatively, `./scripts/restart` will restart an instance of Malcolm. Because the data on disk is stored on the host in docker volumes, doing these operations will not result in loss of data. -Malcolm can be configured to be automatically restarted when the Docker system daemon restart (for example, on system reboot). This behavior depends on the [value](https://docs.docker.com/config/containers/start-containers-automatically/) of the [`restart:`](https://docs.docker.com/compose/compose-file/#restart) setting for each service in the `docker-compose.yml` file. This value can be set by running [`./scripts/install.py --configure`](#ConfigAndTuning) and answering "yes" to "`Restart Malcolm upon system or Docker daemon restart?`." +Malcolm can be configured to be automatically restarted when the Docker system daemon restarts (for example, on system reboot). This behavior depends on the [value](https://docs.docker.com/config/containers/start-containers-automatically/) of the [`restart:`](https://docs.docker.com/compose/compose-file/#restart) setting for each service in the `docker-compose.yml` file. This value can be set by running [`./scripts/install.py --configure`](malcolm-config.md#ConfigAndTuning) and answering "yes" to "`Restart Malcolm upon system or Docker daemon restart?`." ### Clearing Malcolm's data -Run `./scripts/wipe` to stop the Malcolm instance and wipe its OpenSearch database (**including** [index snapshots and management policies](#IndexManagement) and [alerting configuration](#Alerting)). +Run `./scripts/wipe` to stop the Malcolm instance and wipe its OpenSearch database (**including** [index snapshots and management policies](index-management.md#IndexManagement) and [alerting configuration](alerting.md#Alerting)). 
### Temporary read-only interface @@ -40,4 +40,4 @@ docker-compose exec dashboards-helper /data/opensearch_read_only.py -i _cluster These commands must be re-run every time you restart Malcolm. -Note that after you run these commands you may see an increase of error messages in the Malcolm containers' output as various background processes will fail due to the read-only nature of the indices. Additionally, some features such as Arkime's [Hunt](#ArkimeHunt) and [building your own visualizations and dashboards](#BuildDashboard) in OpenSearch Dashboards will not function correctly in read-only mode. \ No newline at end of file +Note that after you run these commands you may see an increase of error messages in the Malcolm containers' output as various background processes will fail due to the read-only nature of the indices. Additionally, some features such as Arkime's [Hunt](arkime.md#ArkimeHunt) and [building your own visualizations and dashboards](dashboards.md#BuildDashboard) in OpenSearch Dashboards will not function correctly in read-only mode. 
\ No newline at end of file diff --git a/docs/severity.md b/docs/severity.md index acc215872..95e7f048c 100644 --- a/docs/severity.md +++ b/docs/severity.md @@ -2,24 +2,24 @@ As Zeek logs are parsed and enriched prior to indexing, a severity score up to `100` (a higher score indicating a more severe event) can be assigned when one or more of the following conditions are met: -* cross-segment network traffic (if [network subnets were defined](#HostAndSubnetNaming)) +* cross-segment network traffic (if [network subnets were defined](host-and-subnet-mapping.md#HostAndSubnetNaming)) * connection origination and destination (e.g., inbound, outbound, external, internal) * traffic to or from sensitive countries - - The comma-separated list of countries (by [ISO 3166-1 alpha-2 code](https://en.wikipedia.org/wiki/ISO_3166-1_alpha-2#Current_codes)) can be customized by setting the `SENSITIVE_COUNTRY_CODES` environment variable in [`docker-compose.yml`](#DockerComposeYml). + - The comma-separated list of countries (by [ISO 3166-1 alpha-2 code](https://en.wikipedia.org/wiki/ISO_3166-1_alpha-2#Current_codes)) can be customized by setting the `SENSITIVE_COUNTRY_CODES` environment variable in [`docker-compose.yml`](malcolm-config.md#DockerComposeYml). * domain names (from DNS queries and SSL server names) with high entropy as calculated by [freq](https://github.com/MarkBaggett/freq) - - The entropy threshold for this condition to trigger can be adjusted by setting the `FREQ_SEVERITY_THRESHOLD` environment variable in [`docker-compose.yml`](#DockerComposeYml). A lower value will only assign severity scores to fewer domain names with higher entropy (e.g., `2.0` for `NQZHTFHRMYMTVBQJE.COM`), while a higher value will assign severity scores to more domain names with lower entropy (e.g., `7.5` for `naturallanguagedomain.example.org`). 
+ - The entropy threshold for this condition to trigger can be adjusted by setting the `FREQ_SEVERITY_THRESHOLD` environment variable in [`docker-compose.yml`](malcolm-config.md#DockerComposeYml). A lower value will assign severity scores only to the few domain names with the highest entropy (e.g., `2.0` for `NQZHTFHRMYMTVBQJE.COM`), while a higher value will assign severity scores to more domain names, including those with lower entropy (e.g., `7.5` for `naturallanguagedomain.example.org`).
* file transfers (categorized by mime type)
-* `notice.log`, [`intel.log`](#ZeekIntel) and `weird.log` entries, including those generated by Zeek plugins detecting vulnerabilities (see the list of Zeek plugins under [Components](#Components))
+* `notice.log`, [`intel.log`](zeek-intel.md#ZeekIntel) and `weird.log` entries, including those generated by Zeek plugins detecting vulnerabilities (see the list of Zeek plugins under [Components](components.md#Components))
* detection of cleartext passwords
* use of insecure or outdated protocols
* tunneled traffic or use of VPN protocols
* rejected or aborted connections
* common network services communicating over non-standard ports
-* file scanning engine hits on [extracted files](#ZeekFileExtraction)
+* file scanning engine hits on [extracted files](file-scanning.md#ZeekFileExtraction)
* large connection or file transfer
- - The size (in megabytes) threshold for this condition to trigger can be adjusted by setting the `TOTAL_MEGABYTES_SEVERITY_THRESHOLD` environment variable in [`docker-compose.yml`](#DockerComposeYml).
+ - The size (in megabytes) threshold for this condition to trigger can be adjusted by setting the `TOTAL_MEGABYTES_SEVERITY_THRESHOLD` environment variable in [`docker-compose.yml`](malcolm-config.md#DockerComposeYml).
* long connection duration - - The duration (in seconds) threshold for this condition to trigger can be adjusted by setting the `CONNECTION_SECONDS_SEVERITY_THRESHOLD` environment variable in [`docker-compose.yml`](#DockerComposeYml). + - The duration (in seconds) threshold for this condition to trigger can be adjusted by setting the `CONNECTION_SECONDS_SEVERITY_THRESHOLD` environment variable in [`docker-compose.yml`](malcolm-config.md#DockerComposeYml). As this [feature](https://github.com/idaholab/Malcolm/issues/19) is improved it's expected that additional categories will be identified and implemented for severity scoring. @@ -33,12 +33,12 @@ These categories' severity scores can be customized by editing `logstash/maps/ma * Each category can be assigned a number between `1` and `100` for severity scoring. * Any category may be disabled by assigning it a score of `0`. -* A severity score can be assigned for any [supported protocol](#Protocols) by adding an entry with the key formatted like `"PROTOCOL_XYZ"`, where `XYZ` is the uppercased value of the protocol as stored in the `network.protocol` field. For example, to assign a score of `40` to Zeek logs generated for SSH traffic, you could add the following line to `malcolm_severity.yaml`: +* A severity score can be assigned for any [supported protocol](protocols.md#Protocols) by adding an entry with the key formatted like `"PROTOCOL_XYZ"`, where `XYZ` is the uppercased value of the protocol as stored in the `network.protocol` field. For example, to assign a score of `40` to Zeek logs generated for SSH traffic, you could add the following line to `malcolm_severity.yaml`: ``` "PROTOCOL_SSH": 40 ``` -Restart Logstash after modifying `malcolm_severity.yaml` for the changes to take effect. The [hostname and CIDR subnet names interface](#NameMapUI) provides a convenient button for restarting Logstash. +Restart Logstash after modifying `malcolm_severity.yaml` for the changes to take effect. 
The [hostname and CIDR subnet names interface](host-and-subnet-mapping.md#NameMapUI) provides a convenient button for restarting Logstash.

-Severity scoring can be disabled globally by setting the `LOGSTASH_SEVERITY_SCORING` environment variable to `false` in the [`docker-compose.yml`](#DockerComposeYml) file and [restarting Malcolm](#StopAndRestart).
\ No newline at end of file
+Severity scoring can be disabled globally by setting the `LOGSTASH_SEVERITY_SCORING` environment variable to `false` in the [`docker-compose.yml`](malcolm-config.md#DockerComposeYml) file and [restarting Malcolm](running.md#StopAndRestart).
\ No newline at end of file
diff --git a/docs/system-requirements.md b/docs/system-requirements.md
new file mode 100644
index 000000000..b25d80c47
--- /dev/null
+++ b/docs/system-requirements.md
@@ -0,0 +1,7 @@
+### Recommended system requirements
+
+Malcolm runs on top of [Docker](https://www.docker.com/), which runs on recent releases of Linux, Apple macOS, and Microsoft Windows 10.
+
+To quote the [Elasticsearch documentation](https://www.elastic.co/guide/en/elasticsearch/guide/current/hardware.html), "If there is one resource that you will run out of first, it will likely be memory." The same is true for Malcolm: you will want at least 16 gigabytes of RAM to run Malcolm comfortably. For processing large volumes of traffic, we recommend, at a bare minimum, a dedicated server with 16 cores and 16 gigabytes of RAM. Malcolm can run on less, but more is better. You're going to want as much hard drive space as possible, of course, as the amount of PCAP data you're able to analyze and store will be limited by your hard drive capacity.
+
+Arkime's wiki has several documents ([here](https://github.com/arkime/arkime#hardware-requirements) and [here](https://github.com/arkime/arkime/wiki/FAQ#what-kind-of-capture-machines-should-we-buy) and [here](https://github.com/arkime/arkime/wiki/FAQ#how-many-elasticsearch-nodes-or-machines-do-i-need) and a [calculator here](https://molo.ch/#estimators)) which may be helpful, although not everything in those documents will apply to a Docker-based setup like Malcolm.
\ No newline at end of file
diff --git a/docs/ubuntu-install-example.md b/docs/ubuntu-install-example.md
index 3bdf5a3c7..fb8bedbbd 100644
--- a/docs/ubuntu-install-example.md
+++ b/docs/ubuntu-install-example.md
@@ -202,7 +202,7 @@ Scripts for starting and stopping Malcolm and changing authentication-related se

At this point you should **reboot your computer** so that the new system settings can be applied. After rebooting, log back in and return to the directory to which Malcolm was installed (or to which the git working copy was cloned).

-Now we need to [set up authentication](#AuthSetup) and generate some unique self-signed TLS certificates. You can replace `analyst` in this example with whatever username you wish to use to log in to the Malcolm web interface.
+Now we need to [set up authentication](authsetup.md#AuthSetup) and generate some unique self-signed TLS certificates. You can replace `analyst` in this example with whatever username you wish to use to log in to the Malcolm web interface.

```
user@host:~/Malcolm$ ./scripts/auth_setup
@@ -225,7 +225,7 @@ Store username/password for email alert sender account?
(y/N): n
(Re)generate internal passwords for NetBox (Y/n): y
```

-For now, rather than [build Malcolm from scratch](#Build), we'll pull images from [Docker Hub](https://hub.docker.com/u/malcolmnetsec):
+For now, rather than [build Malcolm from scratch](development.md#Build), we'll pull images from [Docker Hub](https://hub.docker.com/u/malcolmnetsec):
```
user@host:~/Malcolm$ docker-compose pull
Pulling api ... done
@@ -320,4 +320,4 @@ malcolm-logstash-1 | [2022-07-27T20:27:52,056][INFO ][logstash.agent
…
```

-You can now open a web browser and navigate to one of the [Malcolm user interfaces](#UserInterfaceURLs).
\ No newline at end of file
+You can now open a web browser and navigate to one of the [Malcolm user interfaces](quickstart.md#UserInterfaceURLs).
\ No newline at end of file
diff --git a/docs/upload.md b/docs/upload.md
index b747d1a02..e142deb72 100644
--- a/docs/upload.md
+++ b/docs/upload.md
@@ -17,10 +17,10 @@ Files uploaded via these methods are monitored and moved automatically to other

### Tagging

-In addition to be processed for uploading, Malcolm events will be tagged according to the components of the filenames of the PCAP files or Zeek log archives files from which the events were parsed. For example, records created from a PCAP file named `ACME_Scada_VLAN10.pcap` would be tagged with `ACME`, `Scada`, and `VLAN10`. Tags are extracted from filenames by splitting on the characters `,` (comma), `-` (dash), and `_` (underscore). These tags are viewable and searchable (via the `tags` field) in Arkime and OpenSearch Dashboards. This behavior can be changed by modifying the `AUTO_TAG` [environment variable in `docker-compose.yml`](#DockerComposeYml).
+In addition to being processed for uploading, Malcolm events will be tagged according to the components of the filenames of the PCAP files or Zeek log archive files from which the events were parsed.
For example, records created from a PCAP file named `ACME_Scada_VLAN10.pcap` would be tagged with `ACME`, `Scada`, and `VLAN10`. Tags are extracted from filenames by splitting on the characters `,` (comma), `-` (dash), and `_` (underscore). These tags are viewable and searchable (via the `tags` field) in Arkime and OpenSearch Dashboards. This behavior can be changed by modifying the `AUTO_TAG` [environment variable in `docker-compose.yml`](malcolm-config.md#DockerComposeYml). -Tags may also be specified manually with the [browser-based upload form](#Upload). +Tags may also be specified manually with the [browser-based upload form](upload.md#Upload). ### Processing uploaded PCAPs with Zeek and Suricata -The **Analyze with Zeek** and **Analyze with Suricata** checkboxes may be used when uploading PCAP files to cause them to be analyzed by Zeek and Suricata, respectively. This is functionally equivalent to the `ZEEK_AUTO_ANALYZE_PCAP_FILES` and `SURICATA_AUTO_ANALYZE_PCAP_FILES` environment variables [described above](#DockerComposeYml), only on a per-upload basis. Zeek can also automatically carve out files from file transfers; see [Automatic file extraction and scanning](#ZeekFileExtraction) for more details. +The **Analyze with Zeek** and **Analyze with Suricata** checkboxes may be used when uploading PCAP files to cause them to be analyzed by Zeek and Suricata, respectively. This is functionally equivalent to the `ZEEK_AUTO_ANALYZE_PCAP_FILES` and `SURICATA_AUTO_ANALYZE_PCAP_FILES` environment variables [described above](malcolm-config.md#DockerComposeYml), only on a per-upload basis. Zeek can also automatically carve out files from file transfers; see [Automatic file extraction and scanning](file-scanning.md#ZeekFileExtraction) for more details. 
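The filename-to-tags rule described under Tagging above can be sketched in shell. This is illustrative only — Malcolm performs the split in its own ingest pipeline, not via a script like this — but it shows how `ACME_Scada_VLAN10.pcap` yields three tags:

```
# Mimic the documented tag-extraction rule: strip the extension,
# then split the remaining filename on ',' '-' and '_'.
filename="ACME_Scada_VLAN10.pcap"
base="${filename%.*}"               # -> ACME_Scada_VLAN10
# the trailing '-' in the tr set is literal, not a range;
# each of the three delimiters becomes a newline, one tag per line
printf '%s\n' "$base" | tr ',_-' '\n'
```

Running this prints `ACME`, `Scada`, and `VLAN10` on separate lines, matching the example in the paragraph above.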
diff --git a/docs/zeek-intel.md b/docs/zeek-intel.md index 4f90da1b5..6d0d11651 100644 --- a/docs/zeek-intel.md +++ b/docs/zeek-intel.md @@ -4,9 +4,9 @@ To quote Zeek's [Intelligence Framework](https://docs.zeek.org/en/master/framewo Malcolm doesn't come bundled with intelligence files from any particular feed, but they can be easily included into your local instance. On [startup](shared/bin/zeek_intel_setup.sh), Malcolm's `malcolmnetsec/zeek` docker container enumerates the subdirectories under `./zeek/intel` (which is [bind mounted](https://docs.docker.com/storage/bind-mounts/) into the container's runtime) and configures Zeek so that those intelligence files will be automatically included in its local policy. Subdirectories under `./zeek/intel` which contain their own `__load__.zeek` file will be `@load`-ed as-is, while subdirectories containing "loose" intelligence files will be [loaded](https://docs.zeek.org/en/master/frameworks/intel.html#loading-intelligence) automatically with a `redef Intel::read_files` directive. -Note that Malcolm does not manage updates for these intelligence files. You should use the update mechanism suggested by your feeds' maintainers to keep them up to date, or use a [TAXII](#ZeekIntelSTIX) or [MISP](#ZeekIntelMISP) feed as described below. +Note that Malcolm does not manage updates for these intelligence files. You should use the update mechanism suggested by your feeds' maintainers to keep them up to date, or use a [TAXII](zeek-intel.md#ZeekIntelSTIX) or [MISP](zeek-intel.md#ZeekIntelMISP) feed as described below. -Adding and deleting intelligence files under this directory will take effect upon [restarting Malcolm](#StopAndRestart). Alternately, you can use the `ZEEK_INTEL_REFRESH_CRON_EXPRESSION` environment variable containing a [cron expression](https://en.wikipedia.org/wiki/Cron#CRON_expression) to specify the interval at which the intel files should be refreshed. 
It can also be done manually without restarting Malcolm by running the following command from the Malcolm installation directory:
+Adding and deleting intelligence files under this directory will take effect upon [restarting Malcolm](running.md#StopAndRestart). Alternatively, you can use the `ZEEK_INTEL_REFRESH_CRON_EXPRESSION` environment variable containing a [cron expression](https://en.wikipedia.org/wiki/Cron#CRON_expression) to specify the interval at which the intel files should be refreshed. It can also be done manually without restarting Malcolm by running the following command from the Malcolm installation directory:
```
docker-compose exec --user $(id -u) zeek /usr/local/bin/entrypoint.sh true