ModShark is an automated moderation tool for servers running the Sharkey Fediverse server. It runs as a background tool with direct integration to Sharkey's database and API, offering extended and flexible moderation features. With customizable rules and multiple reporting options, ModShark provides a smooth extension to Sharkey's native tooling.
ModShark is based around an asynchronous, backend processing model. New objects accumulate into queues (driven by database triggers) until the next scheduled run. This allows flexible scheduling and burst-mode performance, at the cost of delayed reporting.
When ModShark runs, it processes each queue sequentially. Objects are read from the queues in batches which are then scanned to detect any flags. Wherever possible, filters are checked in-database to reduce extraneous data transfers.
Rules are responsible for the actual scanning logic. Each rule has a number of filters and common settings, along with a collection of "flags" that can each identify some target characteristic. At present, all flags are in the form of regular expression lists. These are compiled at runtime into an optimized scanner, which runs once per object to categorize it. Any match from any flag will cause the entire rule to match.
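The matching behaviour described above can be pictured with a short sketch. This is illustrative Python, not ModShark's own code (ModShark runs on .NET, and its real scanner may be built differently). The field names and patterns are made up for the example.

```python
import re


def compile_flag(patterns):
    """Combine a list of regex strings into one scanner, or None if the list is empty."""
    if not patterns:
        return None
    # Wrap each pattern in a non-capturing group so the alternation binds correctly.
    return re.compile("|".join(f"(?:{p})" for p in patterns))


def rule_matches(obj, flags):
    """Return the names of all flags that matched; a non-empty list means the rule matched."""
    hits = []
    for field, scanner in flags.items():
        value = obj.get(field)
        if scanner is not None and value and scanner.search(value):
            hits.append(field)
    return hits


# Hypothetical flags for a user-like object.
flags = {
    "username": compile_flag([r"^spam", r"bot\d+$"]),
    "bio": compile_flag([r"buy\s+followers"]),
}
```

Each object is scanned once per flag, and a single hit from any flag is enough to put the object on the report.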
All matched objects are collected into a single "report" as the run progresses. Once all queues have been cleared, the final report is serialized and distributed by the enabled reporters. Each reporter may format the report in a different way, so take care to enable the best one(s) for your environment and staff.
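A single run, as described above, can be pictured as the loop below. This is a simplified Python sketch rather than ModShark's implementation: the queues, rule callables, and reporter callables are hypothetical stand-ins, and skipping the reporters when nothing matched is an assumption.

```python
from collections import deque


def run_once(queues, rules, reporters, batch_limit=5000):
    """Drain every queue in order, collect matches into one report, then send it."""
    report = []
    for name, queue in queues.items():  # queues are processed sequentially
        rule = rules[name]
        while queue:
            # Take up to batch_limit objects from the front of the queue.
            batch = [queue.popleft() for _ in range(min(batch_limit, len(queue)))]
            report.extend((name, obj) for obj in batch if rule(obj))
    if report:  # assumption: an empty report is not distributed
        for send in reporters:  # every enabled reporter receives the same report
            send(report)
    return report
```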
ModShark "rules" detect and flag objects according to configurable parameters. Current rules can flag instances, notes (posts), and user profiles. See the sections below for specific documentation.
The Flagged Instance rule checks all newly-discovered instances against a set of flags. Any flagged instance is included on the report. Filters are available to exclude instances that have already been actioned by the moderation team, including checks for suspended (delivery stopped), blocked (defederated), and silenced (limited).
This is a queued rule, meaning that it runs on a scheduled interval and scans all instances that have been discovered since the last scan. New instances are enqueued via database trigger to minimize overhead.
Flag | Description |
---|---|
Hostname | Scans the domain / hostname. |
Display Name | Scans the human-readable display name, if present. |
Description | Scans the human-readable description line, if present. Note: this field may contain sanitized HTML. |
Contact | Scans the administrator name / email fields, if present. |
Software | Scans the reported software name / version. Note 1: these fields can be spoofed and are frequently outdated. Note 2: due to a Sharkey bug, these fields are only present for Misskey-based instances. |
The Flagged User rule checks all newly-discovered users against a set of flags. Any flagged user is included on the report. Filters are available to exclude users who have already been actioned by the moderation team, including checks for suspended (blocked) and silenced (limited). Additional filters can exclude local or remote users and users from previously-actioned instances.
This is a queued rule, meaning that it runs on a scheduled interval and scans all users that have been discovered since the last scan. New users are enqueued via database trigger to minimize overhead.
Flag | Description |
---|---|
Username | Scans the unqualified username, excluding the @ prefix. |
Display Name | Scans the human-readable display name, if present. Note: this field may contain MFM or Markdown. |
Bio Text | Scans the human-readable bio text, if present. Note: this field may contain MFM or Markdown. |
Age Range | Compares the user's listed birthday against a list of flagged ranges. The age range format is described below. |
Flagged age ranges are defined by a pair of age definitions separated by a dash.
The first age defines the inclusive lower bound, while the second defines the exclusive upper bound.
Each age follows the same format: #y#m#d.
"y", "m", and "d", of course, refer to year, month, and day respectively.
All are optional, but each age must include at least one component.
For simplicity, the lower bound can be excluded to match any age below the end point. The starting age will be inferred as zero years whenever there is only a single age, without any dash.
Examples:
- Flag ages that are likely fake: 0y - 4y, 80y - 9999y
- Flag users under 18: 18y
- Flag users under 18, with a 1-day margin-of-error: 18y1d
- Flag users under 18, excluding fake ages: 4y - 18y
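To make the format concrete, here is a small Python sketch of how such ranges could be parsed and checked. It is not ModShark's parser. It assumes the user's age is already available as a normalized (years, months, days) tuple, and it compares tuples lexicographically, which is a simplification.

```python
import re

# Each component is optional, but at least one must be present (checked below).
AGE = re.compile(r"^(?:(\d+)y)?(?:(\d+)m)?(?:(\d+)d)?$")


def parse_age(text):
    """Parse '#y#m#d' into a (years, months, days) tuple; missing parts become 0."""
    text = text.strip()
    match = AGE.match(text)
    if not text or match is None:
        raise ValueError(f"invalid age: {text!r}")
    return tuple(int(part or 0) for part in match.groups())


def parse_range(text):
    """Parse 'start - end', or a bare 'end', into (lower, upper) age tuples."""
    parts = [p.strip() for p in text.split("-")]
    if len(parts) == 1:
        return (0, 0, 0), parse_age(parts[0])  # single age: lower bound is zero
    if len(parts) != 2:
        raise ValueError(f"invalid age range: {text!r}")
    return parse_age(parts[0]), parse_age(parts[1])


def in_range(age, age_range):
    """Lower bound is inclusive, upper bound is exclusive."""
    lower, upper = age_range
    return lower <= age < upper
```

With these helpers, `in_range((17, 11, 30), parse_range("18y"))` is true, while an age of exactly `(18, 0, 0)` only matches once the range is widened to `18y1d`.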
The Flagged Note rule checks all newly-discovered notes against a set of flags. Any flagged note is included on the report. The note's subject (content warning) can also be scanned, which may be desired on instances with stricter moderation standards. Filters are available to exclude notes by visibility, including unlisted (home timeline), followers-only, and/or private (direct message). Additional filters can exclude notes by actioned users or from actioned instances. Finally, a pair of scoping filters can exclude local or remote notes as desired.
This is a queued rule, meaning that it runs on a scheduled interval and scans all notes that have been discovered since the last scan. New notes are enqueued via database trigger to minimize overhead.
Flag | Description |
---|---|
Text / CW | Scans the text and CW field, if configured. |
Emojis | Scans the emojis by longcode, if present. Note: emojis are in @name@host format. |
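The visibility and scoping filters can be thought of as a simple gate that runs before any flags are checked. The Python sketch below is illustrative only: the field names are shortened stand-ins for the rule's Include* options, the note is a plain dictionary with hypothetical keys, and the defaults follow the values documented in the configuration table.

```python
from dataclasses import dataclass


@dataclass
class NoteFilters:
    # Stand-ins for the Include* options; defaults follow the documented ones.
    include_unlisted: bool = True
    include_followers: bool = False
    include_private: bool = False
    include_local: bool = True
    include_remote: bool = True


def should_scan(note, filters):
    """Return True if a note passes the visibility and local/remote filters."""
    allowed = {
        "public": True,  # public notes have no visibility filter
        "unlisted": filters.include_unlisted,
        "followers": filters.include_followers,
        "private": filters.include_private,
    }
    if not allowed.get(note["visibility"], False):
        return False
    return filters.include_local if note["is_local"] else filters.include_remote
```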
ModShark offers a variety of "reporters" to communicate reports in any desired format. All reporters are optional, and multiple can be enabled simultaneously. See the sections below for specific documentation.
The Console reporter is the simplest one: it writes the reported objects to ModShark's console output. If installed under Systemd or a similar service manager, this output will be captured by the system's native logging facility. The reporter's output is human-readable, but follows a predictable format that can be parsed with regular expressions.
The console reporter is on by default, and should remain active under most environments. Disabling this reporter can complicate diagnostic and audit tasks.
The SendGrid reporter provides for email notification via SendGrid. Emails can be sent from any source address / name, and to any number of recipients. This reporter's output is formatted for human consumption and produces screen-reader-friendly HTML emails.
As SendGrid is a commercial service, this reporter requires a valid API key. The SendGrid reporter is disabled by default.
The Native reporter creates reports directly within Sharkey itself. This reporter provides a closer integration with Sharkey and is ideal for teams accustomed to working with Sharkey's moderation controls.
Two modes are available: API mode and database mode. API mode submits reports using a service account, triggering the report workflow including notification emails. If this is not desired, then database mode can be used to "quietly" insert a report that will not trigger notifications. Both modes will result in a valid report entry and trigger the "unresolved reports" dashboard message.
The native reporter is on by default and uses database mode if not otherwise configured.
The Post reporter creates an announcement post using a service account. The post template, visibility, audience, and subject (content warning) can all be configured. This feature is designed for internal staff notifications, but can be used for public alerts with adjustments to the template.
The template can be any valid post in MFM format, with special "variables" available to insert report contents.
- $audience - insert the configured audience as a string of @mentions.
- $report_body - contents of the report in a human-readable format.
The post reporter is disabled by default.
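As a rough illustration of how the two template variables expand, the Python sketch below uses the standard library's `string.Template`, which happens to share the `$name` syntax. This is not how ModShark renders posts internally, and the account names are placeholders.

```python
from string import Template


def render_post(template, audience, report_body):
    """Fill $audience and $report_body into a post template."""
    mentions = " ".join(audience)  # audience becomes a string of @mentions
    # safe_substitute leaves any other "$" sequences (such as MFM "$[...]") untouched.
    return Template(template).safe_substitute(
        audience=mentions, report_body=report_body
    )


if __name__ == "__main__":
    print(render_post(
        "$audience\n\n$report_body",
        ["@alice@example.com", "@bob@example.com"],
        "1 flagged user",
    ))
```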
The Webhook reporter publishes an announcement to Discord using a webhook. Multiple webhooks can be used simultaneously, and reports will be sent to all of them. This reporter is designed to support mass and cross-server notification use cases.
The webhook reporter is disabled by default.
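Each webhook has a MaxLength setting (2000 characters by default), and announcements that exceed it are chunked. How ModShark chooses the split points is not documented here. The Python sketch below shows one plausible approach that prefers line breaks and hard-splits only when a single line is too long.

```python
def chunk_message(text, max_length=2000):
    """Split text into pieces of at most max_length characters, preferring line breaks."""
    if max_length < 1:
        raise ValueError("max_length must be positive")
    chunks, current = [], ""
    for line in text.splitlines(keepends=True):
        # A single oversized line is hard-split so that no chunk exceeds the limit.
        while len(line) > max_length:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(line[:max_length])
            line = line[max_length:]
        # Start a new chunk when this line would overflow the current one.
        if len(current) + len(line) > max_length:
            chunks.append(current)
            current = ""
        current += line
    if current:
        chunks.append(current)
    return chunks
```

Joining the returned chunks always reproduces the original text, so nothing is dropped from the report.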
- .NET 8 (or later) Runtime
- A supported version of Windows, Linux, or macOS (Linux is recommended)
- At least 128 MB available RAM (256 MB recommended)
- A functional installation of Sharkey
- Network or localhost connection to Sharkey's backend API
- Network or localhost connection to Sharkey's PostgreSQL database, and a user with read/write permissions
These instructions are intended for Linux environments using Systemd; other platforms may require adjustments to the commands. Make sure to replace all variables with their correct values.
- Create a service account for ModShark:
sudo useradd -s /bin/bash -d /home/modshark -m modshark
- Log into the service account:
sudo su - modshark
- Download the latest release package:
wget -O ModShark-latest.zip https://github.com/warriordog/ModShark/releases/latest/download/ModShark-latest.zip
- Extract the release package into a directory:
mkdir ModShark && unzip -o ModShark-latest.zip -d ModShark
- Create the production config file (see the Configuration section for details):
nano ModShark/appsettings.Production.json
- Run the latest database migrations:
psql -U $postgres_user -W -d $sharkey_database -a -f ModShark/update-ModShark-migrations.sql
- Return to an admin account:
exit
- Install the Systemd service:
sudo cp ModShark/modshark.service /etc/systemd/system/modshark.service
- Register the service and start it:
sudo systemctl daemon-reload && sudo systemctl enable modshark --now
These instructions are intended for Linux environments using Systemd, but should be generally applicable to other platforms. Make sure to replace all variables with their correct values.
- Stop the ModShark service, if it's running:
sudo systemctl stop modshark
- Log into the ModShark service account:
sudo su - modshark
- Download the latest release package:
wget -O ModShark-latest.zip https://github.com/warriordog/ModShark/releases/latest/download/ModShark-latest.zip
- Extract the release package into your installation directory, overwriting any files:
unzip -o ModShark-latest.zip -d ModShark
- Run the latest database migrations:
psql -U $postgres_user -W -d $sharkey_database -a -f ModShark/update-ModShark-migrations.sql
- Return to an admin account:
exit
- Start the ModShark service:
sudo systemctl start modshark
These instructions are intended for Linux environments using Systemd; other platforms may require adjustments to the commands. Make sure to replace all variables with their correct values.
- Stop the ModShark service, if it's running:
sudo systemctl stop modshark
- Disable the service:
sudo systemctl disable modshark
- Remove the service file:
sudo rm /etc/systemd/system/modshark.service && sudo systemctl daemon-reload
- Revert ModShark's database changes:
psql -U $postgres_user -W -d $sharkey_database -a -f uninstall-ModShark-migrations.sql
- Remove ModShark files:
rm -r $modshark_directory
ModShark uses a layered configuration approach that allows for automatic updates without clobbering changes.
The root configuration file is appsettings.json, which contains the default value for all options.
You may use this as a reference, but please do not modify it directly.
Any changes will be overwritten by updates.
To customize the default configuration, create a new file called appsettings.Production.json.
Populate this file with the same structure as appsettings.json, but include only the properties that you wish to modify.
Objects will be merged; arrays and all other values are replaced.
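The merge rule can be expressed in a few lines. The Python sketch below only illustrates the semantics stated above (objects merge recursively, everything else is replaced). It is not the loader ModShark uses, and the sample values are invented.

```python
def merge_config(base, override):
    """Layer override on top of base: objects merge, arrays and scalars are replaced."""
    merged = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_config(merged[key], value)  # merge nested objects
        else:
            merged[key] = value  # arrays and all other values replace the default
    return merged


# Hypothetical defaults and overrides, using property names from this document.
defaults = {"ModShark": {"Rules": {"FlaggedUser": {
    "Enabled": False, "BatchLimit": 5000, "UsernamePatterns": ["^default"]}}}}
overrides = {"ModShark": {"Rules": {"FlaggedUser": {
    "Enabled": True, "UsernamePatterns": ["^spam"]}}}}
effective = merge_config(defaults, overrides)
```

In this example the untouched BatchLimit default survives, while the UsernamePatterns array is replaced as a whole rather than appended to.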
Tip: You can replace "Production" with any other value to create environment-specific configurations.
Some common values are appsettings.Development.json, appsettings.Testing.json, and appsettings.Staging.json.
There is also a special appsettings.Local.json, which will be loaded as an additional layer on top of appsettings.Development.json.
This file exists to store local secrets that should not be committed to source control.
Property | Type | Description |
---|---|---|
Logging.LogLevel | Hash | Sets the minimum log severity. See this Microsoft article for details. |
ModShark.Postgres.Connection | String | Required. Connection string for the database. |
ModShark.Postgres.Timeout | Integer | Maximum time that a query can run before automatically terminating. Default: 30 |
ModShark.Reporters.Console.Enabled | Boolean | Whether the Console reporter should be used. Default: true |
ModShark.Reporters.Native.Enabled | Boolean | Whether the Native reporter should be used. Default: true |
ModShark.Reporters.Native.UseApi | Boolean | Whether reports should be sent by API instead of database insert. Default: false |
ModShark.Reporters.Post.Audience | String[] | Array of usernames to be granted access to the post. Must be in @username@host format. |
ModShark.Reporters.Post.Enabled | Boolean | Whether the Post reporter should be used. Default: false |
ModShark.Reporters.Post.FlagInclusion | Enum | Determines how flagged content should be included in the post. Note: MFM cannot be escaped to safely render untrusted text. Must be one of "none", "minimal", or "full". Default: none |
ModShark.Reporters.Post.LocalOnly | Boolean | Whether the post should be sent to local users only (defederated). Required if ModShark.Reporters.Post.Enabled is true. Default: true |
ModShark.Reporters.Post.Subject | String | Subject line / content warning for the post. Default: "ModShark Report" |
ModShark.Reporters.Post.Template | String | Template for the post (use variables $audience and $report_body). Default: "$report_body" |
ModShark.Reporters.Post.Visibility | Enum | Visibility of the report post. Must be one of "public", "unlisted", "followers", or "private". Default: "followers" |
ModShark.Reporters.SendGrid.ApiKey | String | SendGrid API key (must have send mail permissions). Required if ModShark.Reporters.SendGrid.Enabled is true. |
ModShark.Reporters.SendGrid.Enabled | Boolean | Whether the SendGrid reporter should be used. Default: false |
ModShark.Reporters.SendGrid.FromAddress | String | Email address to send reports from. Required if ModShark.Reporters.SendGrid.Enabled is true. |
ModShark.Reporters.SendGrid.FromName | String | Name to associate with the from address. Required if ModShark.Reporters.SendGrid.Enabled is true. Default: "ModShark" |
ModShark.Reporters.SendGrid.FlagInclusion | Enum | Determines how flagged content should be included in the email. Must be one of "none", "minimal", or "full". Default: full |
ModShark.Reporters.SendGrid.ToAddresses | String[] | Array of email addresses to send reports to. Required if ModShark.Reporters.SendGrid.Enabled is true. |
ModShark.Reporters.WebHook.Enabled | Boolean | Whether the Webhook reporter should be used. Default: false |
ModShark.Reporters.WebHook.Hooks | Hash[] | Array of Webhook configuration objects. See the following options for details. |
ModShark.Reporters.WebHook.Hooks.FlagInclusion | Enum | Determines how flagged content should be included in the webhook message. Must be one of "none", "minimal", or "full". Default: full |
ModShark.Reporters.WebHook.Hooks.MaxLength | Number | Maximum number of characters for each webhook message. Announcements that exceed this length will be chunked. Default: 2000 |
ModShark.Reporters.WebHook.Hooks.Type | Enum | Indicates the type of webhook. Must be equal to Discord. Default: Discord |
ModShark.Reporters.WebHook.Hooks.Url | String | URL of the webhook. |
ModShark.Rules.FlaggedInstance.BatchLimit | Integer | Maximum number of instances to check at once. Default: 5000 |
ModShark.Rules.FlaggedInstance.ContactPatterns | String[] | Array of regular expressions to check against each instance's admin name / email. |
ModShark.Rules.FlaggedInstance.DescriptionPatterns | String[] | Array of regular expressions to check against each instance's description. |
ModShark.Rules.FlaggedInstance.Enabled | Boolean | Whether the Flagged Instance rule should be executed. Default: false |
ModShark.Rules.FlaggedInstance.HostnamePatterns | String[] | Array of regular expressions to check against each instance's hostname. |
ModShark.Rules.FlaggedInstance.IncludeBlocked | Boolean | Whether blocked (defederated) instances should be scanned. Default: false |
ModShark.Rules.FlaggedInstance.IncludeSilenced | Boolean | Whether silenced (limited) instances should be scanned. Default: false |
ModShark.Rules.FlaggedInstance.IncludeSuspended | Boolean | Whether suspended (delivery stopped) instances should be scanned. Default: false |
ModShark.Rules.FlaggedInstance.NamePatterns | String[] | Array of regular expressions to check against each instance's name. |
ModShark.Rules.FlaggedInstance.SoftwarePatterns | String[] | Array of regular expressions to check against each instance's software name / version. |
ModShark.Rules.FlaggedInstance.Timeout | Integer | Maximum time in milliseconds to spend scanning each instance. Default: 1000 |
ModShark.Rules.FlaggedNote.BatchLimit | Integer | Maximum number of notes to check at once. Default: 5000 |
ModShark.Rules.FlaggedNote.EmojiPatterns | String[] | Array of regular expressions to check against each note's emoji longcodes. |
ModShark.Rules.FlaggedNote.Enabled | Boolean | Whether the Flagged Note rule should be executed. Default: false |
ModShark.Rules.FlaggedNote.IncludeBlockedInstance | Boolean | Whether notes by users from blocked (defederated) instances should be scanned. Default: false |
ModShark.Rules.FlaggedNote.IncludeCW | Boolean | Whether the subject line / content warning should be scanned. Default: true |
ModShark.Rules.FlaggedNote.IncludeDeletedUser | Boolean | Whether notes by users marked as deleted should be scanned. Default: false |
ModShark.Rules.FlaggedNote.IncludeFollowersVis | Boolean | Whether followers-only notes should be scanned. Default: false |
ModShark.Rules.FlaggedNote.IncludeLocal | Boolean | Whether local notes should be scanned. Default: true |
ModShark.Rules.FlaggedNote.InlcudePrivateVis | Boolean | Whether private (direct message) notes should be scanned. Default: false |
ModShark.Rules.FlaggedNote.IncludeRemote | Boolean | Whether remote notes should be scanned. Default: true |
ModShark.Rules.FlaggedNote.IncludeSilencedUser | Boolean | Whether notes by silenced users should be scanned. Default: true |
ModShark.Rules.FlaggedNote.IncludeSuspendedUser | Boolean | Whether notes by suspended users should be scanned. Default: false |
ModShark.Rules.FlaggedNote.IncludeSilencedInstance | Boolean | Whether notes by users from silenced (limited) instances should be scanned. Default: true |
ModShark.Rules.FlaggedNote.IncludeUnlistedVis | Boolean | Whether unlisted (home only) notes should be scanned. Default: true |
ModShark.Rules.FlaggedNote.TextPatterns | String[] | Array of regular expressions to check against each note's body / CW. |
ModShark.Rules.FlaggedNote.Timeout | Integer | Maximum time in milliseconds to spend scanning each note. Default: 1000 |
ModShark.Rules.FlaggedUser.AgeRanges | String[] | Array of age ranges to flag. |
ModShark.Rules.FlaggedUser.BatchLimit | Integer | Maximum number of users to check at once. Default: 5000 |
ModShark.Rules.FlaggedUser.BioPatterns | String[] | Array of regular expressions to check against each user's bio text. |
ModShark.Rules.FlaggedUser.DisplayNamePatterns | String[] | Array of regular expressions to check against each user's display name. |
ModShark.Rules.FlaggedUser.Enabled | Boolean | Whether the Flagged User rule should be executed. Default: false |
ModShark.Rules.FlaggedUser.IncludeBlockedInstance | Boolean | Whether users from blocked (defederated) instances should be scanned. Default: false |
ModShark.Rules.FlaggedUser.IncludeDeleted | Boolean | Whether users who are marked as deleted (but still exist) should be scanned. Default: false |
ModShark.Rules.FlaggedUser.IncludeLocal | Boolean | Whether local users should be scanned. Default: true |
ModShark.Rules.FlaggedUser.IncludeRemote | Boolean | Whether remote users should be scanned. Default: true |
ModShark.Rules.FlaggedUser.IncludeSilenced | Boolean | Whether silenced users should be scanned. Default: false |
ModShark.Rules.FlaggedUser.IncludeSilencedInstance | Boolean | Whether users from silenced (limited) instances should be scanned. Default: true |
ModShark.Rules.FlaggedUser.Timeout | Integer | Maximum time in milliseconds to spend scanning each user. Default: 1000 |
ModShark.Rules.FlaggedUser.UsernamePatterns | String[] | Array of regular expressions to check against each user's username. |
ModShark.Sharkey.ApiEndpoint | String | Required. URL of the instance's backend API. Default: "https://127.0.0.1:3000" |
ModShark.Sharkey.IdFormat | Enum | Required. ID format used by this instance. Must be one of "aid", "aidx", "meid", "meidg", "ulid", or "objectid". Default: "aidx" |
ModShark.Sharkey.PublicHost | String | Required. Public hostname / domain of the instance. |
ModShark.Sharkey.ServiceAccount | String | Username of ModShark's service account. Default: "instance.actor" |
ModShark.Worker.PollInterval | Integer | Time in milliseconds to wait between each run. Default: 1800000 |
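As a starting point, the Python script below prints a minimal override file that could be saved as appsettings.Production.json. It assumes that the dotted property names in the table map to nested JSON objects. The connection string shape, hostname, and pattern are placeholders that must be replaced with values valid for your installation.

```python
import json


def build_overrides(connection, public_host, username_patterns):
    """Build a minimal override document using property names from the table above."""
    return {
        "ModShark": {
            "Postgres": {"Connection": connection},
            "Sharkey": {"PublicHost": public_host},
            "Rules": {
                "FlaggedUser": {
                    "Enabled": True,
                    "UsernamePatterns": username_patterns,
                }
            },
        }
    }


if __name__ == "__main__":
    overrides = build_overrides(
        # Placeholder connection string; the exact format is an assumption.
        "Host=localhost;Database=sharkey;Username=modshark;Password=CHANGE_ME",
        "example.com",
        [r"^spam"],
    )
    print(json.dumps(overrides, indent=2))
```

Only the properties that differ from the defaults are listed, which keeps the file small and leaves all other defaults in effect.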