refactor(server): telemetry env variables #13705

Merged
merged 1 commit into from
Oct 24, 2024

4 changes: 2 additions & 2 deletions docs/docs/features/monitoring.md
@@ -25,10 +25,10 @@ The metrics in immich are grouped into API (endpoint calls and response times),

### Configuration

Immich will not expose an endpoint for metrics by default. To enable this endpoint, you can add the `IMMICH_METRICS=true` environmental variable to your `.env` file. Note that only the server and microservices containers currently use this variable.
Immich will not expose an endpoint for metrics by default. To enable this endpoint, you can add the `IMMICH_TELEMETRY_INCLUDE=all` environment variable to your `.env` file. Note that only the server container currently uses this variable.

:::tip
`IMMICH_METRICS` enables all metrics, but there are also [environmental variables](/docs/install/environment-variables.md#prometheus) to toggle specific metric groups. If you'd like to only expose certain kinds of metrics, you can set only those environmental variables to `true`. Explicitly setting the environmental variable for a metric group overrides `IMMICH_METRICS` for that group. For example, setting `IMMICH_METRICS=true` and `IMMICH_API_METRICS=false` will enable all metrics except API metrics.
`IMMICH_TELEMETRY_INCLUDE=all` enables all metrics. For more granular configuration, you can list the metric groups to include as a comma-separated list (e.g. `IMMICH_TELEMETRY_INCLUDE=repo,api`). You can also remove specific groups from the included set with `IMMICH_TELEMETRY_EXCLUDE` (e.g. `IMMICH_TELEMETRY_INCLUDE=all` together with `IMMICH_TELEMETRY_EXCLUDE=job`). For more information, refer to the [environment variables section](/docs/install/environment-variables.md#prometheus).
:::

The next step is to configure a new or existing Prometheus instance to scrape this endpoint. The following steps assume that you do not have an existing Prometheus instance, but the steps will be similar either way.
13 changes: 4 additions & 9 deletions docs/docs/install/environment-variables.md
@@ -183,15 +183,10 @@ Other machine learning parameters can be tuned from the admin UI.

## Prometheus

| Variable | Description | Default | Containers | Workers |
| :----------------------------- | :-------------------------------------------------------------------------------------------- | :-----: | :--------- | :----------------- |
| `IMMICH_METRICS`<sup>\*1</sup> | Toggle all metrics (one of [`true`, `false`]) | | server | api, microservices |
| `IMMICH_API_METRICS` | Toggle metrics for endpoints and response times (one of [`true`, `false`]) | | server | api, microservices |
| `IMMICH_HOST_METRICS` | Toggle metrics for CPU and memory utilization for host and process (one of [`true`, `false`]) | | server | api, microservices |
| `IMMICH_IO_METRICS` | Toggle metrics for database queries, image processing, etc. (one of [`true`, `false`]) | | server | api, microservices |
| `IMMICH_JOB_METRICS` | Toggle metrics for jobs and queues (one of [`true`, `false`]) | | server | api, microservices |

\*1: Overridden for a metric group when its corresponding environmental variable is set.
| Variable | Description | Default | Containers | Workers |
| :------------------------- | :-------------------------------------------------------------------------------------------------------------------- | :-----: | :--------- | :----------------- |
| `IMMICH_TELEMETRY_INCLUDE` | Collect these telemetries. List of `host`, `api`, `io`, `repo`, `job`. Note: You can also specify `all` to enable all | | server | api, microservices |
| `IMMICH_TELEMETRY_EXCLUDE` | Do not collect these telemetries. List of `host`, `api`, `io`, `repo`, `job` | | server | api, microservices |

## Docker Secrets

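For anyone migrating an existing `.env`, the Prometheus table change in `environment-variables.md` above can be read as a mapping from the removed boolean flags to the new list-valued variable. The following is a minimal sketch and not part of this PR: `migrateLegacyMetricsEnv` is a hypothetical helper, and mapping `IMMICH_IO_METRICS` to `repo` is an assumption based on the removed code in `config.repository.ts`, which fed that variable into `repoMetrics`.

```typescript
// Hypothetical migration helper (not part of this PR).
type TelemetryGroup = 'host' | 'api' | 'repo' | 'job';

// Legacy per-group variables and the group each one is assumed to map to.
// Assumption: IMMICH_IO_METRICS maps to `repo`, since the old code used it for repoMetrics.
const LEGACY_GROUPS: Array<[string, TelemetryGroup]> = [
  ['IMMICH_HOST_METRICS', 'host'],
  ['IMMICH_API_METRICS', 'api'],
  ['IMMICH_IO_METRICS', 'repo'],
  ['IMMICH_JOB_METRICS', 'job'],
];

// Mirrors the removed parseBoolean helper: unset or empty falls back to the default.
const parseLegacyBoolean = (value: string | undefined, defaultValue: boolean): boolean =>
  value ? value === 'true' : defaultValue;

// Returns a value for IMMICH_TELEMETRY_INCLUDE, or '' when no group was enabled.
function migrateLegacyMetricsEnv(env: Record<string, string | undefined>): string {
  const globalEnabled = parseLegacyBoolean(env['IMMICH_METRICS'], false);
  const enabled = LEGACY_GROUPS
    .filter(([name]) => parseLegacyBoolean(env[name], globalEnabled))
    .map(([, group]) => group);
  return enabled.join(',');
}

const migrated = migrateLegacyMetricsEnv({ IMMICH_METRICS: 'true', IMMICH_API_METRICS: 'false' });
console.log(`IMMICH_TELEMETRY_INCLUDE=${migrated}`);
```

If only `IMMICH_METRICS=true` was set before, `IMMICH_TELEMETRY_INCLUDE=all` is the simpler replacement, as the updated monitoring docs suggest.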
8 changes: 8 additions & 0 deletions server/src/enum.ts
@@ -363,3 +363,11 @@ export enum ImmichWorker {
API = 'api',
MICROSERVICES = 'microservices',
}

export enum ImmichTelemetry {
HOST = 'host',
API = 'api',
IO = 'io',
REPO = 'repo',
JOB = 'job',
}
8 changes: 2 additions & 6 deletions server/src/interfaces/config.interface.ts
@@ -2,7 +2,7 @@ import { RegisterQueueOptions } from '@nestjs/bullmq';
import { QueueOptions } from 'bullmq';
import { RedisOptions } from 'ioredis';
import { OpenTelemetryModuleOptions } from 'nestjs-otel/lib/interfaces';
import { ImmichEnvironment, ImmichWorker, LogLevel } from 'src/enum';
import { ImmichEnvironment, ImmichTelemetry, ImmichWorker, LogLevel } from 'src/enum';
import { VectorExtension } from 'src/interfaces/database.interface';

export const IConfigRepository = 'IConfigRepository';
@@ -77,11 +77,7 @@ export interface EnvData {
telemetry: {
apiPort: number;
microservicesPort: number;
enabled: boolean;
apiMetrics: boolean;
hostMetrics: boolean;
repoMetrics: boolean;
jobMetrics: boolean;
metrics: Set<ImmichTelemetry>;
};

storage: {
45 changes: 18 additions & 27 deletions server/src/repositories/config.repository.spec.ts
@@ -1,3 +1,4 @@
import { ImmichTelemetry } from 'src/enum';
import { clearEnvCache, ConfigRepository } from 'src/repositories/config.repository';

const getEnv = () => {
@@ -12,11 +13,8 @@ const resetEnv = () => {
'IMMICH_TRUSTED_PROXIES',
'IMMICH_API_METRICS_PORT',
'IMMICH_MICROSERVICES_METRICS_PORT',
'IMMICH_METRICS',
'IMMICH_API_METRICS',
'IMMICH_HOST_METRICS',
'IMMICH_IO_METRICS',
'IMMICH_JOB_METRICS',
'IMMICH_TELEMETRY_INCLUDE',
'IMMICH_TELEMETRY_EXCLUDE',

'DB_URL',
'DB_HOSTNAME',
@@ -210,11 +208,7 @@ describe('getEnv', () => {
expect(telemetry).toEqual({
apiPort: 8081,
microservicesPort: 8082,
enabled: false,
apiMetrics: false,
hostMetrics: false,
jobMetrics: false,
repoMetrics: false,
metrics: new Set([]),
});
});

@@ -225,32 +219,29 @@ describe('getEnv', () => {
expect(telemetry).toMatchObject({
apiPort: 2001,
microservicesPort: 2002,
metrics: expect.any(Set),
});
});

it('should run with telemetry enabled', () => {
process.env.IMMICH_METRICS = 'true';
process.env.IMMICH_TELEMETRY_INCLUDE = 'all';
const { telemetry } = getEnv();
expect(telemetry).toMatchObject({
enabled: true,
apiMetrics: true,
hostMetrics: true,
jobMetrics: true,
repoMetrics: true,
});
expect(telemetry.metrics).toEqual(new Set(Object.values(ImmichTelemetry)));
});

it('should run with telemetry enabled and jobs disabled', () => {
process.env.IMMICH_METRICS = 'true';
process.env.IMMICH_JOB_METRICS = 'false';
process.env.IMMICH_TELEMETRY_INCLUDE = 'all';
process.env.IMMICH_TELEMETRY_EXCLUDE = 'job';
const { telemetry } = getEnv();
expect(telemetry).toMatchObject({
enabled: true,
apiMetrics: true,
hostMetrics: true,
jobMetrics: false,
repoMetrics: true,
});
expect(telemetry.metrics).toEqual(
new Set([ImmichTelemetry.API, ImmichTelemetry.HOST, ImmichTelemetry.IO, ImmichTelemetry.REPO]),
);
});

it('should run with specific telemetry metrics', () => {
process.env.IMMICH_TELEMETRY_INCLUDE = 'io, host, api';
const { telemetry } = getEnv();
expect(telemetry.metrics).toEqual(new Set([ImmichTelemetry.API, ImmichTelemetry.HOST, ImmichTelemetry.IO]));
});
});
});
43 changes: 22 additions & 21 deletions server/src/repositories/config.repository.ts
@@ -2,7 +2,7 @@ import { Injectable } from '@nestjs/common';
import { join } from 'node:path';
import { citiesFile, excludePaths } from 'src/constants';
import { Telemetry } from 'src/decorators';
import { ImmichEnvironment, ImmichWorker, LogLevel } from 'src/enum';
import { ImmichEnvironment, ImmichTelemetry, ImmichWorker, LogLevel } from 'src/enum';
import { EnvData, IConfigRepository } from 'src/interfaces/config.interface';
import { DatabaseExtension } from 'src/interfaces/database.interface';
import { QueueName } from 'src/interfaces/job.interface';
@@ -25,18 +25,17 @@ const stagingKeys = {
};

const WORKER_TYPES = new Set(Object.values(ImmichWorker));
const TELEMETRY_TYPES = new Set(Object.values(ImmichTelemetry));

const asSet = (value: string | undefined, defaults: ImmichWorker[]) => {
const asSet = <T>(value: string | undefined, defaults: T[]) => {
const values = (value || '').replaceAll(/\s/g, '').split(',').filter(Boolean);
return new Set(values.length === 0 ? defaults : (values as ImmichWorker[]));
return new Set(values.length === 0 ? defaults : (values as T[]));
};

const parseBoolean = (value: string | undefined, defaultValue: boolean) => (value ? value === 'true' : defaultValue);

const getEnv = (): EnvData => {
const included = asSet(process.env.IMMICH_WORKERS_INCLUDE, [ImmichWorker.API, ImmichWorker.MICROSERVICES]);
const excluded = asSet(process.env.IMMICH_WORKERS_EXCLUDE, []);
const workers = [...setDifference(included, excluded)];
const includedWorkers = asSet(process.env.IMMICH_WORKERS_INCLUDE, [ImmichWorker.API, ImmichWorker.MICROSERVICES]);
const excludedWorkers = asSet(process.env.IMMICH_WORKERS_EXCLUDE, []);
const workers = [...setDifference(includedWorkers, excludedWorkers)];
for (const worker of workers) {
if (!WORKER_TYPES.has(worker)) {
throw new Error(`Invalid worker(s) found: ${workers.join(',')}`);
@@ -69,12 +68,18 @@ const getEnv = (): EnvData => {
}
}

const globalEnabled = parseBoolean(process.env.IMMICH_METRICS, false);
const hostMetrics = parseBoolean(process.env.IMMICH_HOST_METRICS, globalEnabled);
const apiMetrics = parseBoolean(process.env.IMMICH_API_METRICS, globalEnabled);
const repoMetrics = parseBoolean(process.env.IMMICH_IO_METRICS, globalEnabled);
const jobMetrics = parseBoolean(process.env.IMMICH_JOB_METRICS, globalEnabled);
const telemetryEnabled = globalEnabled || hostMetrics || apiMetrics || repoMetrics || jobMetrics;
const includedTelemetries =
process.env.IMMICH_TELEMETRY_INCLUDE === 'all'
? new Set(Object.values(ImmichTelemetry))
: asSet<ImmichTelemetry>(process.env.IMMICH_TELEMETRY_INCLUDE, []);

const excludedTelemetries = asSet<ImmichTelemetry>(process.env.IMMICH_TELEMETRY_EXCLUDE, []);
const telemetries = setDifference(includedTelemetries, excludedTelemetries);
for (const telemetry of telemetries) {
if (!TELEMETRY_TYPES.has(telemetry)) {
throw new Error(`Invalid telemetry found: ${telemetry}`);
}
}

return {
host: process.env.IMMICH_HOST,
@@ -136,9 +141,9 @@ const getEnv = (): EnvData => {

otel: {
metrics: {
hostMetrics,
hostMetrics: telemetries.has(ImmichTelemetry.HOST),
apiMetrics: {
enable: apiMetrics,
enable: telemetries.has(ImmichTelemetry.API),
ignoreRoutes: excludePaths,
},
},
@@ -168,11 +173,7 @@ const getEnv = (): EnvData => {
telemetry: {
apiPort: Number(process.env.IMMICH_API_METRICS_PORT || '') || 8081,
microservicesPort: Number(process.env.IMMICH_MICROSERVICES_METRICS_PORT || '') || 8082,
enabled: telemetryEnabled,
hostMetrics,
apiMetrics,
repoMetrics,
jobMetrics,
metrics: telemetries,
},

workers,
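The include/exclude resolution in `config.repository.ts` can be easier to follow in isolation. Below is a minimal standalone sketch of the same idea, assuming a plain set difference; `parseList`, `resolveTelemetry`, and `sorted` are illustrative names rather than functions from the PR, and only the `ImmichTelemetry` values are taken from the diff.

```typescript
// Standalone sketch of the include/exclude resolution (illustrative names).
enum ImmichTelemetry {
  HOST = 'host',
  API = 'api',
  IO = 'io',
  REPO = 'repo',
  JOB = 'job',
}

const ALL_TELEMETRY: ImmichTelemetry[] = [
  ImmichTelemetry.HOST,
  ImmichTelemetry.API,
  ImmichTelemetry.IO,
  ImmichTelemetry.REPO,
  ImmichTelemetry.JOB,
];

// Split a comma-separated value, ignoring whitespace and empty entries.
const parseList = (value: string | undefined): string[] =>
  (value || '').replace(/\s/g, '').split(',').filter(Boolean);

function resolveTelemetry(include: string | undefined, exclude: string | undefined): Set<ImmichTelemetry> {
  // The literal value `all` expands to every known group; otherwise parse the list.
  const included: string[] = include === 'all' ? ALL_TELEMETRY : parseList(include);
  const excluded = new Set<string>(parseList(exclude));

  const result = new Set<ImmichTelemetry>();
  for (const name of included) {
    if (excluded.has(name)) {
      continue;
    }
    // Validation runs on what remains after exclusion, as in the diff.
    if (ALL_TELEMETRY.indexOf(name as ImmichTelemetry) === -1) {
      throw new Error(`Invalid telemetry found: ${name}`);
    }
    result.add(name as ImmichTelemetry);
  }
  return result;
}

// Helper for printing a set in a stable order.
const sorted = (set: Set<ImmichTelemetry>): string => Array.from(set).sort().join(',');

console.log(sorted(resolveTelemetry('all', 'job'))); // api,host,io,repo
console.log(sorted(resolveTelemetry('io, host, api', undefined))); // api,host,io
```

One consequence of validating after the difference, which appears to hold for the PR code as well, is that unknown names in `IMMICH_TELEMETRY_EXCLUDE` are ignored rather than rejected, and an empty or unset include list yields an empty set regardless of the exclude list.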
15 changes: 8 additions & 7 deletions server/src/repositories/telemetry.repository.ts
@@ -14,7 +14,7 @@ import { snakeCase, startCase } from 'lodash';
import { MetricService } from 'nestjs-otel';
import { copyMetadataFromFunctionToFunction } from 'nestjs-otel/lib/opentelemetry.utils';
import { serverVersion } from 'src/constants';
import { MetadataKey } from 'src/enum';
import { ImmichTelemetry, MetadataKey } from 'src/enum';
import { IConfigRepository } from 'src/interfaces/config.interface';
import { ILoggerRepository } from 'src/interfaces/logger.interface';
import { IMetricGroupRepository, ITelemetryRepository, MetricGroupOptions } from 'src/interfaces/telemetry.interface';
@@ -99,17 +99,18 @@ export class TelemetryRepository implements ITelemetryRepository {
@Inject(ILoggerRepository) private logger: ILoggerRepository,
) {
const { telemetry } = this.configRepository.getEnv();
const { apiMetrics, hostMetrics, jobMetrics, repoMetrics } = telemetry;
const { metrics } = telemetry;

this.api = new MetricGroupRepository(metricService).configure({ enabled: apiMetrics });
this.host = new MetricGroupRepository(metricService).configure({ enabled: hostMetrics });
this.jobs = new MetricGroupRepository(metricService).configure({ enabled: jobMetrics });
this.repo = new MetricGroupRepository(metricService).configure({ enabled: repoMetrics });
this.api = new MetricGroupRepository(metricService).configure({ enabled: metrics.has(ImmichTelemetry.API) });
this.host = new MetricGroupRepository(metricService).configure({ enabled: metrics.has(ImmichTelemetry.HOST) });
this.jobs = new MetricGroupRepository(metricService).configure({ enabled: metrics.has(ImmichTelemetry.JOB) });
this.repo = new MetricGroupRepository(metricService).configure({ enabled: metrics.has(ImmichTelemetry.REPO) });
}

setup({ repositories }: { repositories: ClassConstructor<unknown>[] }) {
const { telemetry } = this.configRepository.getEnv();
if (!telemetry.enabled || !telemetry.repoMetrics) {
const { metrics } = telemetry;
if (!metrics.has(ImmichTelemetry.REPO)) {
return;
}

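The constructor change in `telemetry.repository.ts` turns a set-membership check into a per-group `enabled` flag. A minimal sketch of that pattern follows. `CounterGroup` is a hypothetical stand-in for `MetricGroupRepository`, whose implementation is not shown in this diff, and it only records counters in memory.

```typescript
type TelemetryName = 'host' | 'api' | 'io' | 'repo' | 'job';

// Hypothetical stand-in for a metric group: when disabled, recording is a no-op.
class CounterGroup {
  private enabled = false;
  readonly counts = new Map<string, number>();

  configure(options: { enabled: boolean }): this {
    this.enabled = options.enabled;
    return this;
  }

  addToCounter(name: string, value: number): void {
    if (!this.enabled) {
      return;
    }
    this.counts.set(name, (this.counts.get(name) || 0) + value);
  }
}

// The resolved set decides which groups actually record anything.
const enabledMetrics = new Set<TelemetryName>(['api', 'repo']);
const apiGroup = new CounterGroup().configure({ enabled: enabledMetrics.has('api') });
const jobGroup = new CounterGroup().configure({ enabled: enabledMetrics.has('job') });

apiGroup.addToCounter('requests', 1);
jobGroup.addToCounter('processed', 1); // ignored: `job` is not in the set

console.log(apiGroup.counts.get('requests'), jobGroup.counts.size); // 1 0
```

Keeping the check inside the group means callers can record metrics unconditionally, which is presumably why the repository configures each group once at construction time.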
2 changes: 1 addition & 1 deletion server/src/workers/api.ts
@@ -20,7 +20,7 @@ async function bootstrap() {
process.title = 'immich-api';

const { telemetry, network } = new ConfigRepository().getEnv();
if (telemetry.enabled) {
if (telemetry.metrics.size > 0) {
bootstrapTelemetry(telemetry.apiPort);
}

2 changes: 1 addition & 1 deletion server/src/workers/microservices.ts
@@ -11,7 +11,7 @@ import { isStartUpError } from 'src/services/storage.service';

export async function bootstrap() {
const { telemetry } = new ConfigRepository().getEnv();
if (telemetry.enabled) {
if (telemetry.metrics.size > 0) {
bootstrapTelemetry(telemetry.microservicesPort);
}

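Both workers now infer whether telemetry is on from the size of the resolved set instead of a separate `enabled` flag. The sketch below isolates that decision. `telemetryPortFor` and `TelemetryEnv` are hypothetical names used for illustration, and the ports are the defaults shown in the config diff.

```typescript
type WorkerName = 'api' | 'microservices';

// Illustrative subset of the telemetry section of EnvData.
interface TelemetryEnv {
  apiPort: number;
  microservicesPort: number;
  metrics: Set<string>;
}

// Returns the port a worker should expose metrics on,
// or undefined when no metric group is enabled.
function telemetryPortFor(worker: WorkerName, telemetry: TelemetryEnv): number | undefined {
  if (telemetry.metrics.size === 0) {
    return undefined;
  }
  return worker === 'api' ? telemetry.apiPort : telemetry.microservicesPort;
}

const disabledEnv: TelemetryEnv = { apiPort: 8081, microservicesPort: 8082, metrics: new Set<string>() };
const enabledEnv: TelemetryEnv = { apiPort: 8081, microservicesPort: 8082, metrics: new Set<string>(['job']) };

console.log(telemetryPortFor('api', disabledEnv)); // undefined
console.log(telemetryPortFor('microservices', enabledEnv)); // 8082
```

Deriving the flag from the set removes the possibility of `enabled` and the per-group values disagreeing, which the old boolean fields allowed in principle.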
6 changes: 1 addition & 5 deletions server/test/repositories/config.repository.mock.ts
@@ -73,11 +73,7 @@ const envData: EnvData = {
telemetry: {
apiPort: 8081,
microservicesPort: 8082,
enabled: false,
hostMetrics: false,
apiMetrics: false,
jobMetrics: false,
repoMetrics: false,
metrics: new Set(),
},

workers: [ImmichWorker.API, ImmichWorker.MICROSERVICES],