Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feature: improve settings and environment logic and phase out redundant environment keys #3384

Merged
merged 37 commits into from
Sep 5, 2024
Merged
Show file tree
Hide file tree
Changes from 32 commits
Commits
Show all changes
37 commits
Select commit Hold shift + click to select a range
0e706a1
First setup
Donnype Aug 8, 2024
f500543
Make schema matching for enabling optional from the API perspective
Donnype Aug 8, 2024
e2db672
Add migration and set the schema for local plugins
Donnype Aug 8, 2024
d0877dd
Fix migration tests
Donnype Aug 10, 2024
622db58
Test the migration
Donnype Aug 12, 2024
a2664ff
Add tests for API logic and fix some edge cases
Donnype Aug 12, 2024
7e0ef3a
Merge branch 'main' into feature/add-json-schema-for-copied-boefjes
Donnype Aug 12, 2024
5699cb8
Merge branch 'main' into feature/add-json-schema-for-copied-boefjes
underdarknl Aug 13, 2024
4111fd4
Merge branch 'main' into feature/add-json-schema-for-copied-boefjes
Donnype Aug 13, 2024
1518444
Merge branch 'feature/add-json-schema-for-copied-boefjes' into featur…
Donnype Aug 13, 2024
6317424
Merge branch 'main' into feature/improve-settings-env-logic
Donnype Aug 13, 2024
6254c5d
Remove organisation argument and update the unit tests
Donnype Aug 19, 2024
da5cdd7
Phase out the environment keys field
Donnype Aug 20, 2024
28bd17d
Merge branch 'main' into feature/add-json-schema-for-copied-boefjes
Donnype Aug 20, 2024
cb0a1b8
Add missing migration file
Donnype Aug 20, 2024
7008d8c
Merge branch 'feature/add-json-schema-for-copied-boefjes' into featur…
Donnype Aug 20, 2024
1dc7127
Merge branch 'main' into feature/add-json-schema-for-copied-boefjes
stephanie0x00 Aug 20, 2024
a224b13
Validate the final environment against the schema before running the …
Donnype Aug 20, 2024
cafdfa4
Merge branch 'main' into feature/add-json-schema-for-copied-boefjes
Donnype Aug 20, 2024
0ce499a
Merge branch 'feature/add-json-schema-for-copied-boefjes' into featur…
Donnype Aug 20, 2024
2f2c551
Add more logging to migration script
Donnype Aug 20, 2024
6bbbfdd
Merge branch 'feature/add-json-schema-for-copied-boefjes' into featur…
Donnype Aug 20, 2024
896b638
Merge branch 'main' into feature/add-json-schema-for-copied-boefjes
Donnype Aug 22, 2024
fd17f82
Merge branch 'feature/add-json-schema-for-copied-boefjes' into featur…
Donnype Aug 22, 2024
2dd6d84
Merge branch 'main' into feature/improve-settings-env-logic
underdarknl Aug 22, 2024
f97d48e
Merge branch 'main' into feature/improve-settings-env-logic
Donnype Aug 27, 2024
7875963
Add comment about importance of the schema enforcement
Donnype Aug 27, 2024
f8d1388
Use dict comprehension
Donnype Aug 29, 2024
fff81f3
Add tests for getting the environment from the KATalogus.
Donnype Aug 29, 2024
9d36117
Merge branch 'main' into feature/improve-settings-env-logic
Donnype Aug 29, 2024
716bbb9
skip integration test when unit testing
Donnype Aug 29, 2024
e5479ec
Merge branch 'main' into feature/improve-settings-env-logic
ammar92 Sep 4, 2024
0c92934
Fix tests and style
Donnype Sep 5, 2024
762687d
Documentation update for settings and environment variables for boefjes.
Donnype Sep 5, 2024
9449433
Merge branch 'main' into feature/improve-settings-env-logic
Donnype Sep 5, 2024
b290049
Merge branch 'main' into feature/improve-settings-env-logic
underdarknl Sep 5, 2024
339d00d
Merge branch 'main' into feature/improve-settings-env-logic
Donnype Sep 5, 2024
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
14 changes: 11 additions & 3 deletions boefjes/.ci/docker-compose.yml
Original file line number Diff line number Diff line change
Expand Up @@ -8,6 +8,7 @@ services:
command: sh -c 'python -m pytest -v tests/integration'
depends_on:
- ci_katalogus-db
- ci_katalogus
env_file:
- .ci/.env.test
volumes:
Expand Down Expand Up @@ -103,8 +104,15 @@ services:
hard: 262144

ci_katalogus:
image: "docker.io/wiremock/wiremock:2.34.0"
volumes:
- .ci/wiremock:/home/wiremock
build:
context: ..
dockerfile: boefjes/Dockerfile
args:
- ENVIRONMENT=dev
command: uvicorn boefjes.katalogus.root:app --host 0.0.0.0 --port 8080
depends_on:
- ci_katalogus-db
env_file:
- .ci/.env.test
volumes:
- .:/app/boefjes
15 changes: 0 additions & 15 deletions boefjes/.ci/wiremock/mappings/organisations.json

This file was deleted.

3 changes: 1 addition & 2 deletions boefjes/boefjes/api.py
Original file line number Diff line number Diff line change
Expand Up @@ -151,8 +151,7 @@ def get_task(task_id, scheduler_client):
def create_boefje_meta(task, local_repository):
boefje = task.data.boefje
boefje_resource = local_repository.by_id(boefje.id)
env_keys = boefje_resource.environment_keys
environment = get_environment_settings(task.data, env_keys) if env_keys else {}
environment = get_environment_settings(task.data, boefje_resource.schema)

organization = task.data.organization
input_ooi = task.data.input_ooi
Expand Down
23 changes: 8 additions & 15 deletions boefjes/boefjes/dependencies/plugins.py
Original file line number Diff line number Diff line change
Expand Up @@ -98,7 +98,7 @@ def clone_settings_to_organisation(self, from_organisation: str, to_organisation
self.set_enabled_by_id(plugin_id, to_organisation, enabled=True)

def upsert_settings(self, settings: dict, organisation_id: str, plugin_id: str):
self._assert_settings_match_schema(settings, organisation_id, plugin_id)
self._assert_settings_match_schema(settings, plugin_id)
self._put_boefje(plugin_id)

return self.config_storage.upsert(organisation_id, plugin_id, settings=settings)
Expand All @@ -122,14 +122,14 @@ def _put_boefje(self, boefje_id: str) -> None:

try:
self.plugin_storage.boefje_by_id(boefje_id)
except PluginNotFound:
except PluginNotFound as e:
try:
plugin = self.local_repo.by_id(boefje_id)
except KeyError:
raise
raise e

if plugin.type != "boefje":
raise
raise e
self.plugin_storage.create_boefje(plugin)

def _put_normalizer(self, normalizer_id: str) -> None:
Expand All @@ -150,12 +150,7 @@ def _put_normalizer(self, normalizer_id: str) -> None:
def delete_settings(self, organisation_id: str, plugin_id: str):
self.config_storage.delete(organisation_id, plugin_id)

try:
self._assert_settings_match_schema({}, organisation_id, plugin_id)
except SettingsNotConformingToSchema:
logger.warning("Making sure %s is disabled for %s because settings are deleted", plugin_id, organisation_id)

self.set_enabled_by_id(plugin_id, organisation_id, False)
# We don't check the schema anymore because we can provide entries through the global environment as well
Donnype marked this conversation as resolved.
Show resolved Hide resolved

def schema(self, plugin_id: str) -> dict | None:
try:
Expand Down Expand Up @@ -184,9 +179,7 @@ def description(self, plugin_id: str, organisation_id: str) -> str:
return ""

def set_enabled_by_id(self, plugin_id: str, organisation_id: str, enabled: bool):
if enabled:
all_settings = self.get_all_settings(organisation_id, plugin_id)
self._assert_settings_match_schema(all_settings, organisation_id, plugin_id)
# We don't check the schema anymore because we can provide entries through the global environment as well

try:
self._put_boefje(plugin_id)
Expand All @@ -195,14 +188,14 @@ def set_enabled_by_id(self, plugin_id: str, organisation_id: str, enabled: bool)

self.config_storage.upsert(organisation_id, plugin_id, enabled=enabled)

def _assert_settings_match_schema(self, all_settings: dict, organisation_id: str, plugin_id: str):
def _assert_settings_match_schema(self, all_settings: dict, plugin_id: str):
schema = self.schema(plugin_id)

if schema: # No schema means that there is nothing to assert
try:
validate(instance=all_settings, schema=schema)
except ValidationError as e:
raise SettingsNotConformingToSchema(organisation_id, plugin_id, e.message) from e
raise SettingsNotConformingToSchema(plugin_id, e.message) from e

def _set_plugin_enabled(self, plugin: PluginType, organisation_id: str) -> PluginType:
with contextlib.suppress(KeyError, NotFound):
Expand Down
45 changes: 29 additions & 16 deletions boefjes/boefjes/job_handler.py
Original file line number Diff line number Diff line change
Expand Up @@ -7,6 +7,8 @@
import httpx
import structlog
from httpx import HTTPError
from jsonschema.exceptions import ValidationError
from jsonschema.validators import validate

from boefjes.clients.bytes_client import BytesAPIClient
from boefjes.config import settings
Expand All @@ -15,6 +17,7 @@
from boefjes.local_repository import LocalPluginRepository
from boefjes.plugins.models import _default_mime_types
from boefjes.runtime_interfaces import BoefjeJobRunner, Handler, NormalizerJobRunner
from boefjes.storage.interfaces import SettingsNotConformingToSchema
from octopoes.api.models import Affirmation, Declaration, Observation
from octopoes.connector.octopoes import OctopoesAPIConnector
from octopoes.models import Reference, ScanLevel
Expand All @@ -35,30 +38,42 @@ def get_octopoes_api_connector(org_code: str) -> OctopoesAPIConnector:
return OctopoesAPIConnector(str(settings.octopoes_api), org_code)


def get_environment_settings(boefje_meta: BoefjeMeta, environment_keys: list[str]) -> dict[str, str]:
def get_environment_settings(boefje_meta: BoefjeMeta, schema: dict | None = None) -> dict[str, str]:
try:
katalogus_api = str(settings.katalogus_api).rstrip("/")
response = httpx.get(
f"{katalogus_api}/v1/organisations/{boefje_meta.organization}/{boefje_meta.boefje.id}/settings",
timeout=30,
)
response.raise_for_status()
environment = response.json()

# Add prefixed BOEFJE_* global environment variables
for key, value in os.environ.items():
if key.startswith("BOEFJE_"):
katalogus_key = key.split("BOEFJE_", 1)[1]
# Only pass the environment variable if it is not explicitly set through the katalogus,
# if and only if they are defined in boefje.json
if katalogus_key in environment_keys and katalogus_key not in environment:
environment[katalogus_key] = value

return {k: str(v) for k, v in environment.items() if k in environment_keys}
except HTTPError:
logger.exception("Error getting environment settings")
raise

allowed_keys = schema.get("properties", []) if schema else []
new_env = {
key.split("BOEFJE_", 1)[1]: value
for key, value in os.environ.items()
if key.startswith("BOEFJE_") and key in allowed_keys
}

settings_from_katalogus = response.json()

for key, value in settings_from_katalogus.items():
if key in allowed_keys:
new_env[key] = value

# The schema, besides dictating that a boefje cannot run if it is not matched, also provides an extra safeguard:
# it is possible to inject code if arguments are passed that "escape" the call to a tool. Hence, we should enforce
# the schema somewhere and make the schema as strict as possible.
if schema is not None:
try:
validate(instance=new_env, schema=schema)
except ValidationError as e:
raise SettingsNotConformingToSchema(boefje_meta.boefje.id, e.message) from e

return new_env


class BoefjeHandler(Handler):
def __init__(
Expand Down Expand Up @@ -97,10 +112,8 @@ def handle(self, boefje_meta: BoefjeMeta) -> None:

boefje_meta.arguments["input"] = ooi.serialize()

env_keys = boefje_resource.environment_keys

boefje_meta.runnable_hash = boefje_resource.runnable_hash
boefje_meta.environment = get_environment_settings(boefje_meta, env_keys) if env_keys else {}
boefje_meta.environment = get_environment_settings(boefje_meta, boefje_resource.schema)

mime_types = _default_mime_types(boefje_meta.boefje)

Expand Down
2 changes: 0 additions & 2 deletions boefjes/boefjes/katalogus/plugins.py
Original file line number Diff line number Diff line change
Expand Up @@ -123,7 +123,6 @@ class BoefjeIn(BaseModel):
version: str | None = None
created: datetime.datetime | None = None
description: str | None = None
environment_keys: list[str] = Field(default_factory=list)
scan_level: int = 1
consumes: set[str] = Field(default_factory=set)
produces: set[str] = Field(default_factory=set)
Expand Down Expand Up @@ -167,7 +166,6 @@ class NormalizerIn(BaseModel):
version: str | None = None
created: datetime.datetime | None = None
description: str | None = None
environment_keys: list[str] = Field(default_factory=list)
consumes: list[str] = Field(default_factory=list) # mime types (and/ or boefjes)
produces: list[str] = Field(default_factory=list) # oois

Expand Down
Original file line number Diff line number Diff line change
@@ -0,0 +1,37 @@
"""Remove environment keys field

Revision ID: 870fc302b852
Revises: 5be152459a7b
Create Date: 2024-08-20 06:08:20.943924

"""

import sqlalchemy as sa
from alembic import op
from sqlalchemy.dialects import postgresql

# revision identifiers, used by Alembic.
revision = "870fc302b852"
down_revision = "5be152459a7b"
branch_labels = None
depends_on = None


def upgrade() -> None:
# ### commands auto generated by Alembic - please adjust! ###
op.drop_column("boefje", "environment_keys")
op.drop_column("normalizer", "environment_keys")
# ### end Alembic commands ###


def downgrade() -> None:
# ### commands auto generated by Alembic - please adjust! ###
op.add_column(
"normalizer",
sa.Column("environment_keys", postgresql.ARRAY(sa.VARCHAR(length=128)), autoincrement=False, nullable=False),
)
op.add_column(
"boefje",
sa.Column("environment_keys", postgresql.ARRAY(sa.VARCHAR(length=128)), autoincrement=False, nullable=False),
)
# ### end Alembic commands ###
Original file line number Diff line number Diff line change
Expand Up @@ -95,7 +95,7 @@ def upgrade() -> None:
str(boefje.scan_level),
list(boefje.consumes),
list(boefje.produces),
boefje.environment_keys,
["TEST_KEY"],
boefje.oci_image,
boefje.oci_arguments,
boefje.version,
Expand Down Expand Up @@ -137,7 +137,7 @@ def upgrade() -> None:
str(boefje.scan_level),
list(boefje.consumes),
list(boefje.produces),
boefje.environment_keys,
["TEST_KEY"],
boefje.oci_image,
boefje.oci_arguments,
boefje.version,
Expand Down Expand Up @@ -177,7 +177,7 @@ def upgrade() -> None:
normalizer.description,
normalizer.consumes,
normalizer.produces,
normalizer.environment_keys,
["TEST_KEY"],
normalizer.version,
)
for normalizer in normalizers_to_insert
Expand Down
1 change: 0 additions & 1 deletion boefjes/boefjes/models.py
Original file line number Diff line number Diff line change
Expand Up @@ -17,7 +17,6 @@ class Plugin(BaseModel):
version: str | None = None
created: datetime.datetime | None = None
description: str | None = None
environment_keys: list[str] = Field(default_factory=list)
enabled: bool = False
static: bool = True # We need to differentiate between local and remote plugins to know which ones can be deleted

Expand Down
3 changes: 0 additions & 3 deletions boefjes/boefjes/plugins/kat_binaryedge/boefje.json
Original file line number Diff line number Diff line change
Expand Up @@ -6,8 +6,5 @@
"IPAddressV4",
"IPAddressV6"
],
"environment_keys": [
"BINARYEDGE_API"
],
"scan_level": 2
}
4 changes: 0 additions & 4 deletions boefjes/boefjes/plugins/kat_censys/boefje.json
Original file line number Diff line number Diff line change
Expand Up @@ -6,9 +6,5 @@
"IPAddressV4",
"IPAddressV6"
],
"environment_keys": [
"CENSYS_API_ID",
"CENSYS_API_SECRET"
],
"scan_level": 1
}
3 changes: 0 additions & 3 deletions boefjes/boefjes/plugins/kat_cve_finding_types/boefje.json
Original file line number Diff line number Diff line change
Expand Up @@ -5,9 +5,6 @@
"consumes": [
"CVEFindingType"
],
"environment_keys": [
"CVEAPI_URL"
],
"scan_level": 0,
"enabled": true
}
4 changes: 0 additions & 4 deletions boefjes/boefjes/plugins/kat_dns/boefje.json
Original file line number Diff line number Diff line change
Expand Up @@ -5,9 +5,5 @@
"consumes": [
"Hostname"
],
"environment_keys": [
"RECORD_TYPES",
"REMOTE_NS"
],
"scan_level": 1
}
7 changes: 0 additions & 7 deletions boefjes/boefjes/plugins/kat_external_db/boefje.json
Original file line number Diff line number Diff line change
Expand Up @@ -5,12 +5,5 @@
"consumes": [
"Network"
],
"environment_keys": [
"DB_URL",
"DB_ACCESS_TOKEN",
"DB_ORGANIZATION_IDENTIFIER",
"DB_ENDPOINT_FORMAT",
"REQUESTS_CA_BUNDLE"
],
"scan_level": 0
}
3 changes: 0 additions & 3 deletions boefjes/boefjes/plugins/kat_leakix/boefje.json
Original file line number Diff line number Diff line change
Expand Up @@ -7,8 +7,5 @@
"IPAddressV6",
"Hostname"
],
"environment_keys": [
"LEAKIX_API"
],
"scan_level": 1
}
3 changes: 0 additions & 3 deletions boefjes/boefjes/plugins/kat_log4shell/boefje.json
Original file line number Diff line number Diff line change
Expand Up @@ -5,8 +5,5 @@
"consumes": [
"Hostname"
],
"environment_keys": [
"REPLY_FQDN"
],
"scan_level": 4
}
4 changes: 0 additions & 4 deletions boefjes/boefjes/plugins/kat_masscan/boefje.json
Original file line number Diff line number Diff line change
Expand Up @@ -5,9 +5,5 @@
"consumes": [
"IPV4NetBlock"
],
"environment_keys": [
"PORTS",
"MAX_RATE"
],
"scan_level": 2
}
6 changes: 1 addition & 5 deletions boefjes/boefjes/plugins/kat_maxmind_geoip/boefje.json
Original file line number Diff line number Diff line change
Expand Up @@ -6,9 +6,5 @@
"IPAddressV4",
"IPAddressV6"
],
"scan_level": 1,
"environment_keys": [
"MAXMIND_USER_ID",
"MAXMIND_LICENCE_KEY"
]
"scan_level": 1
}
6 changes: 0 additions & 6 deletions boefjes/boefjes/plugins/kat_nmap_ip_range/boefje.json
Original file line number Diff line number Diff line change
Expand Up @@ -6,11 +6,5 @@
"IPV6NetBlock",
"IPV4NetBlock"
],
"environment_keys": [
"TOP_PORTS_TCP",
"TOP_PORTS_UDP",
"MIN_VLSM_IPV4",
"MIN_VLSM_IPV6"
],
"scan_level": 2
}
Loading
Loading