Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feature/bytes openmetrics #750

Merged
merged 32 commits into from
Apr 26, 2023
Merged
Show file tree
Hide file tree
Changes from 26 commits
Commits
Show all changes
32 commits
Select commit Hold shift + click to select a range
c60ec42
First version of bytes metrics endpoint
Donnype Apr 12, 2023
e4e7251
Added first version of Bytes openmetrics endpoint.
Donnype Apr 13, 2023
5a33a1f
Some optimization
Donnype Apr 13, 2023
d2de642
Comment update
Donnype Apr 13, 2023
f6fe8c0
Cache counting files
Donnype Apr 14, 2023
2a31516
Merge branch 'main' into feature/bytes-openmetrics
Donnype Apr 17, 2023
eaf729a
Make mountpoints configurable
Donnype Apr 17, 2023
9eefbae
Update env key names to be more consistent
Donnype Apr 17, 2023
ee7cd65
Remove caching and add raw file counts from the database
Donnype Apr 18, 2023
a9b7114
Revert changes from cleanup PR: move api model
Donnype Apr 18, 2023
086c017
Merge branch 'main' into feature/bytes-openmetrics
Donnype Apr 18, 2023
2bdd60d
Cleanup
Donnype Apr 19, 2023
dd55b6b
Merge branch 'main' into feature/bytes-openmetrics
Donnype Apr 19, 2023
34a6aaf
Add cachetools, remove redundant metrics.
Donnype Apr 20, 2023
4492df9
Update documentation.
Donnype Apr 20, 2023
0d52ba4
Fix Mypy not understanding lambda's and add types-cachetool as additi…
Donnype Apr 20, 2023
ff4d238
Merge branch 'main' into feature/bytes-openmetrics
Donnype Apr 20, 2023
d164799
Merge branch 'main' into feature/bytes-openmetrics
Donnype Apr 20, 2023
6334ef4
Remove ci env for metrics mountpoints
Donnype Apr 20, 2023
9a70f6f
Remove unused methods after update
Donnype Apr 20, 2023
bc49dc4
Merge branch 'main' into feature/bytes-openmetrics
Donnype Apr 24, 2023
3f27ed4
Merge branch 'main' into feature/bytes-openmetrics
Donnype Apr 24, 2023
c401f67
Add some performance tuning documentation.
Donnype Apr 24, 2023
b412185
Merge branch 'feature/bytes-openmetrics' of github.com:minvws/nl-kat-…
Donnype Apr 24, 2023
4aa695c
Merge branch 'main' into feature/bytes-openmetrics
Donnype Apr 25, 2023
c167366
Merge branch 'main' into feature/bytes-openmetrics
Darwinkel Apr 25, 2023
3466734
Update default to 300 seconds for bytes metric cache
Donnype Apr 25, 2023
2506521
Merge branch 'feature/bytes-openmetrics' of github.com:minvws/nl-kat-…
Donnype Apr 25, 2023
dde50ba
Merge branch 'main' into feature/bytes-openmetrics
Donnype Apr 25, 2023
b99e0ac
Join on RawFileInDB to enable count(*), update test as well
Donnype Apr 25, 2023
ec53499
Merge branch 'main' into feature/bytes-openmetrics
Donnype Apr 26, 2023
b30eac7
Merge branch 'main' into feature/bytes-openmetrics
noamblitz Apr 26, 2023
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 1 addition & 1 deletion .pre-commit-config.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -50,7 +50,7 @@ repos:
rev: v1.2.0
hooks:
- id: mypy
additional_dependencies: ['types-PyYAML', 'types-requests', 'pydantic', 'pynacl']
additional_dependencies: ['types-PyYAML', 'types-requests', 'types-cachetools', 'pydantic', 'pynacl']
exclude: |
(?x)(
^boefjes/ |
Expand Down
1 change: 1 addition & 0 deletions bytes/.env-dist
Original file line number Diff line number Diff line change
Expand Up @@ -29,3 +29,4 @@ QUEUE_URI=
# Optional environment variables
BYTES_LOG_FILE= # Optional file with Bytes logs.
BYTES_DATA_DIR= # Root for all the data. A change means that you no longer have access to old data unless you move it!
BYTES_METRICS_TTL_SECONDS=0 # The time to cache slow queries performed in the metrics endpoint.
20 changes: 19 additions & 1 deletion bytes/README.md
Original file line number Diff line number Diff line change
Expand Up @@ -85,6 +85,7 @@ QUEUE_URI=
# Optional environment variables
BYTES_LOG_FILE= # Optional file with Bytes logs.
BYTES_DATA_DIR= # Root for all the data. A change means that you no longer have access to old data unless you move it!
BYTES_METRICS_TTL_SECONDS=0 # The time to cache slow queries performed in the metrics endpoint.
```

Most of these are self-explanatory, but a few sets of variables require more explanation.
Expand Down Expand Up @@ -130,6 +131,14 @@ KAT_PRIVATE_KEY_B64=""
VWS_PUBLIC_KEY_B64=""
```

### Observability

Bytes exposes a `/metrics` endpoint for basic application level observability,
such as the amount of organizations and the amount of raw files per organization.
Another important component to monitor is the disk usage of Bytes.
It is recommended to install [node exporter](https://prometheus.io/docs/guides/node-exporter/) to keep track of this.


## Design

We now include two levels of design, according to the [C4 model](https://c4model.com/).
Expand All @@ -147,7 +156,6 @@ graph
Scheduler["Scheduler<br/><i>Software System"]
Boefjes["Boefjes<br/><i>Python App"]


Boefjes -- GET/POST Raw/Meta --> Bytes
User -- Interacts with --> Rocky
Rocky -- GET/POST Raw/Meta --> Bytes
Expand Down Expand Up @@ -238,3 +246,13 @@ To export raw SQL from the SQLAlchemy migration files, run the following target
```shell
$ make sql rev1=0003 rev2=0004 > sql_migrations/0004_change_x_to_y_add_column_z.sql
```


## Production


### Performance tuning

Bytes caches some metrics for performance, but the default is not to cache these queries.
It is recommended to tune the `BYTES_METRICS_TTL_SECONDS` variable to on the amount of calls to the `/metrics` endpoint.
As a guideline, add at least 10 seconds to the cache for every million of raw files in the database.
46 changes: 46 additions & 0 deletions bytes/bytes/api/metrics.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,46 @@
import logging
from typing import Dict

from cachetools import TTLCache, cached
from prometheus_client import CollectorRegistry, Gauge

from bytes.config import get_settings
from bytes.repositories.meta_repository import MetaDataRepository

collector_registry = CollectorRegistry()


bytes_database_organizations_total = Gauge(
name="bytes_database_organizations_total",
documentation="Total amount of organizations in the bytes database.",
registry=collector_registry,
)
bytes_database_raw_files_total = Gauge(
name="bytes_database_raw_files_total",
documentation="Total amount of raw files in the bytes database.",
registry=collector_registry,
labelnames=["organization_id"],
)

logger = logging.getLogger(__name__)


def ignore_arguments_key(meta_repository: MetaDataRepository):
return ""


@cached(cache=TTLCache(maxsize=1, ttl=get_settings().bytes_metrics_ttl_seconds), key=ignore_arguments_key)
def cached_counts_per_organization(meta_repository: MetaDataRepository) -> Dict[str, int]:
logger.debug("Metrics cache miss, ttl set to %s seconds", get_settings().bytes_metrics_ttl_seconds)

return meta_repository.get_raw_file_count_per_organization()


def get_registry(meta_repository: MetaDataRepository) -> CollectorRegistry:
counts_per_organization = cached_counts_per_organization(meta_repository)
bytes_database_organizations_total.set(len(counts_per_organization))

for organization_id, count in counts_per_organization.items():
bytes_database_raw_files_total.labels(organization_id).set(count)

return collector_registry
16 changes: 14 additions & 2 deletions bytes/bytes/api/root.py
Original file line number Diff line number Diff line change
@@ -1,16 +1,20 @@
import logging
from typing import Any, List, Optional, Union

import prometheus_client
from fastapi import APIRouter, Depends
from fastapi.exceptions import RequestValidationError
from fastapi.responses import RedirectResponse
from fastapi.responses import RedirectResponse, Response
from fastapi.security import OAuth2PasswordRequestForm
from pydantic import BaseModel, ValidationError
from starlette.requests import Request
from starlette.responses import JSONResponse
from starlette.status import HTTP_422_UNPROCESSABLE_ENTITY

from bytes.auth import TokenResponse, get_access_token
from bytes.api.metrics import get_registry
from bytes.auth import TokenResponse, authenticate_token, get_access_token
from bytes.database.sql_meta_repository import create_meta_data_repository
from bytes.repositories.meta_repository import MetaDataRepository
from bytes.version import __version__

router = APIRouter()
Expand Down Expand Up @@ -46,6 +50,14 @@ def root() -> ServiceHealth:
return bytes_health


@router.get("/metrics", dependencies=[Depends(authenticate_token)])
def metrics(meta_repository: MetaDataRepository = Depends(create_meta_data_repository)):
collector_registry = get_registry(meta_repository)
data = prometheus_client.generate_latest(collector_registry)

return Response(media_type="text/plain", content=data)


@router.post("/token", response_model=TokenResponse)
def login(form_data: OAuth2PasswordRequestForm = Depends()) -> TokenResponse:
access_token, expire_time = get_access_token(form_data)
Expand Down
2 changes: 2 additions & 0 deletions bytes/bytes/config.py
Original file line number Diff line number Diff line change
Expand Up @@ -35,6 +35,8 @@ class Settings(BaseSettings):
kat_private_key_b64: str = ""
vws_public_key_b64: str = ""

bytes_metrics_ttl_seconds: int = 0


@lru_cache()
def get_settings() -> Settings:
Expand Down
12 changes: 11 additions & 1 deletion bytes/bytes/database/sql_meta_repository.py
Original file line number Diff line number Diff line change
@@ -1,7 +1,8 @@
import logging
import uuid
from typing import Iterator, List, Optional, Type
from typing import Dict, Iterator, List, Optional, Type

from sqlalchemy import func
from sqlalchemy.exc import IntegrityError
from sqlalchemy.orm import Session, sessionmaker

Expand Down Expand Up @@ -157,6 +158,15 @@ def has_raw(self, boefje_meta: BoefjeMeta, mime_types: List[MimeType]) -> bool:

return count > 0

def get_raw_file_count_per_organization(self) -> Dict[str, int]:
query = (
self.session.query(BoefjeMetaInDB.organization, func.count(RawFileInDB.id))
Donnype marked this conversation as resolved.
Show resolved Hide resolved
.join(BoefjeMetaInDB)
.group_by(BoefjeMetaInDB.organization)
)

return {organization_id: count for organization_id, count in query}

def _to_raw(self, raw_file_in_db: RawFileInDB) -> RawData:
boefje_meta = to_boefje_meta(raw_file_in_db.boefje_meta)
data = self.raw_repository.get_raw(raw_file_in_db.id, boefje_meta)
Expand Down
8 changes: 4 additions & 4 deletions bytes/bytes/repositories/meta_repository.py
Original file line number Diff line number Diff line change
Expand Up @@ -45,10 +45,7 @@ def save_boefje_meta(self, boefje_meta: BoefjeMeta) -> None:
def get_boefje_meta_by_id(self, boefje_meta_id: str) -> BoefjeMeta:
raise NotImplementedError()

def get_boefje_meta(
self,
query_filter: BoefjeMetaFilter,
) -> List[BoefjeMeta]:
def get_boefje_meta(self, query_filter: BoefjeMetaFilter) -> List[BoefjeMeta]:
raise NotImplementedError()

def save_normalizer_meta(self, normalizer_meta: NormalizerMeta) -> None:
Expand All @@ -68,3 +65,6 @@ def has_raw(self, boefje_meta: BoefjeMeta, mime_types: List[MimeType]) -> bool:

def get_raw(self, raw_id: str) -> RawData:
raise NotImplementedError()

def get_raw_file_count_per_organization(self) -> Dict[str, int]:
raise NotImplementedError()
2 changes: 2 additions & 0 deletions bytes/requirements-dev.txt
Original file line number Diff line number Diff line change
Expand Up @@ -5,6 +5,7 @@ astroid==2.12.12
attrs==23.1.0
bcrypt==4.0.1
black==23.1.0
cachetools==5.3.0
certifi==2022.12.7
cffi==1.15.1
charset-normalizer==3.1.0
Expand Down Expand Up @@ -32,6 +33,7 @@ pika==1.3.1
platformdirs==3.0.0
pluggy==1.0.0
pre-commit==3.2.2
prometheus-client==0.16.0
psycopg2==2.9.6
py==1.11.0
pyasn1==0.4.8
Expand Down
2 changes: 2 additions & 0 deletions bytes/requirements.txt
Original file line number Diff line number Diff line change
Expand Up @@ -2,6 +2,7 @@ alembic==1.8.1
anyio==3.6.2
asgiref==3.5.2
bcrypt==4.0.1
cachetools==5.3.0
certifi==2022.12.7
cffi==1.15.1
charset-normalizer==3.1.0
Expand All @@ -18,6 +19,7 @@ mako==1.2.4
markupsafe==2.1.2
passlib[bcrypt]==1.7.4
pika==1.3.1
prometheus-client==0.16.0
psycopg2==2.9.6
pyasn1==0.4.8
pycparser==2.21
Expand Down
10 changes: 9 additions & 1 deletion bytes/tests/client.py
Original file line number Diff line number Diff line change
Expand Up @@ -3,7 +3,7 @@
from typing import Any, Callable, Dict, List, Optional, Union

import requests
from requests.models import HTTPError
from requests.exceptions import HTTPError

from bytes.models import BoefjeMeta, NormalizerMeta
from bytes.repositories.meta_repository import BoefjeMetaFilter, RawDataFilter
Expand Down Expand Up @@ -70,6 +70,14 @@ def _get_token(self) -> str:

return str(response.json()["access_token"])

@retry_with_login
def get_metrics(self) -> bytes:
response = self._session.get("/metrics", headers=self.headers)

self._verify_response(response)

return response.content

@retry_with_login
def save_boefje_meta(self, boefje_meta: BoefjeMeta) -> None:
response = self._session.post("/bytes/boefje_meta", data=boefje_meta.json(), headers=self.headers)
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -2,6 +2,7 @@

import pytest
import requests
from prometheus_client.parser import text_string_to_metric_families
from requests import HTTPError
from tests.client import BytesAPIClient
from tests.loading import get_boefje_meta, get_normalizer_meta
Expand All @@ -17,6 +18,57 @@ def test_login(bytes_api_client: BytesAPIClient) -> None:
assert "bearer eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9" in bytes_api_client.headers["Authorization"]


def test_metrics(bytes_api_client: BytesAPIClient) -> None:
metrics = bytes_api_client.get_metrics()

metrics = list(text_string_to_metric_families(metrics.decode()))
assert len(metrics) == 2
assert metrics[0].name == "bytes_database_organizations_total"
assert len(metrics[0].samples) == 1
assert metrics[1].name == "bytes_database_raw_files_total"
assert len(metrics[1].samples) == 0

database_organizations, database_files = metrics

assert database_organizations.samples[0].value == 0.0

boefje_meta = get_boefje_meta()
bytes_api_client.save_boefje_meta(boefje_meta)
bytes_api_client.save_raw(boefje_meta.id, b"test 123")
bytes_api_client.save_raw(boefje_meta.id, b"test 12334", ["text/boefje"])

metrics = bytes_api_client.get_metrics()
metrics = list(text_string_to_metric_families(metrics.decode()))
assert len(metrics[0].samples) == 1
assert len(metrics[1].samples) == 1
database_organizations, database_files = metrics

assert database_organizations.samples[0].value == 1.0

assert database_files.samples[0].labels["organization_id"] == "test"
assert database_files.samples[0].value == 2.0

boefje_meta = get_boefje_meta()
boefje_meta.id = str(uuid.uuid4())
boefje_meta.organization = "test2"
bytes_api_client.save_boefje_meta(boefje_meta)
bytes_api_client.save_raw(boefje_meta.id, b"test 123")

metrics = bytes_api_client.get_metrics()
metrics = list(text_string_to_metric_families(metrics.decode()))
assert len(metrics[0].samples) == 1
assert len(metrics[1].samples) == 2
database_organizations, database_files = metrics

assert database_organizations.samples[0].value == 2.0

assert len(database_files.samples) == 2
assert database_files.samples[0].labels["organization_id"] == "test"
assert database_files.samples[0].value == 2.0
assert database_files.samples[1].labels["organization_id"] == "test2"
assert database_files.samples[1].value == 1.0


def test_boefje_meta(bytes_api_client: BytesAPIClient) -> None:
boefje_meta = get_boefje_meta()
bytes_api_client.save_boefje_meta(boefje_meta)
Expand Down
2 changes: 2 additions & 0 deletions bytes/tests/integration/test_meta_repository.py
Original file line number Diff line number Diff line change
Expand Up @@ -146,6 +146,8 @@ def test_save_boefje_meta_hash(meta_repository: SQLMetaDataRepository) -> None:
non_empty_raws = meta_repository.get_raws(query_filter)
assert len(non_empty_raws) == 1

assert meta_repository.get_raw_file_count_per_organization() == {"test": 1}


def test_save_normalizer_meta(meta_repository: SQLMetaDataRepository) -> None:
with meta_repository:
Expand Down
29 changes: 29 additions & 0 deletions bytes/tests/seed_database.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,29 @@
import uuid

from bytes.database.sql_meta_repository import create_meta_data_repository
from bytes.models import RawData
from tests.loading import get_boefje_meta


def seed():
repository = next(create_meta_data_repository())
number_of_raw_files, chunk_size = int(2e6), 1000

for i in range(int(number_of_raw_files / chunk_size)):
with repository:
for j in range(chunk_size):
raw = b"asdf --- \n\n testdata" + str(i).encode()

boefje_meta = get_boefje_meta(meta_id=str(uuid.uuid4()))
boefje_meta.organization = ["a", "b", "c", "d", "e", "f", "g"][i % 7]

repository.save_boefje_meta(boefje_meta)
repository.save_raw(RawData(value=raw, boefje_meta=boefje_meta))

print(f"Committing chunk {i}")


if __name__ == "__main__":
"""This script is just a helper to generate a large set of objects to test performance with."""

seed()