[Otlp] Add disk retry enablement #5527

Merged
@@ -13,6 +13,14 @@
`parent_is_remote` information.
([#5563](https://github.com/open-telemetry/opentelemetry-dotnet/pull/5563))

* Introduced experimental support for automatically retrying export to the otlp
endpoint by storing the telemetry offline during transient network errors.
Users can enable this feature by setting the
`OTEL_DOTNET_EXPERIMENTAL_OTLP_RETRY` environment variable to `disk` and
setting `OTEL_DOTNET_EXPERIMENTAL_OTLP_DISK_RETRY_DIRECTORY_PATH` to the path
on disk for storing the telemetry.
([#5527](https://github.com/open-telemetry/opentelemetry-dotnet/pull/5527))
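
As a rough illustration of the changelog entry above, the two variables could also be set in-process before the exporter is created. This is only a sketch: `DiskRetrySetup` and `Enable` are hypothetical names, not part of the OpenTelemetry API, and it assumes the exporter reads these variables from the process environment at construction time, as the `ExperimentalOptions` default constructor in this PR appears to do.

```csharp
using System;

// Hypothetical helper for illustration; not part of OpenTelemetry.
public static class DiskRetrySetup
{
    // Variable names taken from the changelog entry above.
    public const string RetryEnvVar = "OTEL_DOTNET_EXPERIMENTAL_OTLP_RETRY";
    public const string DirectoryEnvVar = "OTEL_DOTNET_EXPERIMENTAL_OTLP_DISK_RETRY_DIRECTORY_PATH";

    // Sets both variables for the current process only.
    public static void Enable(string directory)
    {
        Environment.SetEnvironmentVariable(RetryEnvVar, "disk");
        Environment.SetEnvironmentVariable(DirectoryEnvVar, directory);
    }
}
```

A call such as `DiskRetrySetup.Enable(Path.Combine(Path.GetTempPath(), "otlp-retry"))` would have to run before the OTLP exporter is configured; setting the variables in the shell or the service definition is the more usual route.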

## 1.8.1

Released 2024-Apr-17
@@ -17,6 +17,8 @@ internal sealed class ExperimentalOptions

public const string OtlpRetryEnvVar = "OTEL_DOTNET_EXPERIMENTAL_OTLP_RETRY";

public const string OtlpDiskRetryDirectoryPathEnvVar = "OTEL_DOTNET_EXPERIMENTAL_OTLP_DISK_RETRY_DIRECTORY_PATH";
Contributor

If an app is designed to run in a platform-independent environment and wants to use a temporary location, requiring the environment variable OTEL_DOTNET_EXPERIMENTAL_OTLP_DISK_RETRY_DIRECTORY_PATH to be set can make things complex. Should we write to the temporary path if this environment variable is not set? We could rely on the .NET API Path.GetTempPath to get the temp path. Customers will use OTEL_DOTNET_EXPERIMENTAL_OTLP_RETRY to enable offline storage anyway. We could explain the behavior in the documentation along with this environment variable.

Member Author

Works for me. It's simpler for users who want minimal configuration to try out this feature.

Member Author

Updated
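
The fallback proposed in this thread could look roughly like the sketch below. `DiskRetryDirectory.Resolve` is a hypothetical name, and the choice to treat whitespace-only values as unset is an assumption. The revision of the diff shown on this page still throws when the variable is missing, so this illustrates the suggestion rather than the merged code.

```csharp
#nullable enable
using System;
using System.IO;

// Hypothetical helper illustrating the suggested fallback.
public static class DiskRetryDirectory
{
    // Returns the configured directory when one is given,
    // otherwise the platform's temporary directory.
    public static string Resolve(string? configuredPath)
    {
        return string.IsNullOrWhiteSpace(configuredPath)
            ? Path.GetTempPath()
            : configuredPath;
    }
}
```

`Path.GetTempPath` picks a platform-appropriate location, which is the platform-independence argument made in the first comment.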


public ExperimentalOptions()
: this(new ConfigurationBuilder().AddEnvironmentVariables().Build())
{
@@ -29,9 +31,28 @@ public ExperimentalOptions(IConfiguration configuration)
this.EmitLogEventAttributes = emitLogEventAttributes;
}

-if (configuration.TryGetStringValue(OtlpRetryEnvVar, out var retryPolicy) && retryPolicy != null && retryPolicy.Equals("in_memory", StringComparison.OrdinalIgnoreCase))
+if (configuration.TryGetStringValue(OtlpRetryEnvVar, out var retryPolicy) && retryPolicy != null)
 {
-    this.EnableInMemoryRetry = true;
+    if (retryPolicy.Equals("in_memory", StringComparison.OrdinalIgnoreCase))
+    {
+        this.EnableInMemoryRetry = true;
+    }
+    else if (retryPolicy.Equals("disk", StringComparison.OrdinalIgnoreCase))
+    {
+        if (configuration.TryGetStringValue(OtlpDiskRetryDirectoryPathEnvVar, out var path) && path != null)
+        {
+            this.EnableDiskRetry = true;
+            this.DiskRetryDirectoryPath = path;
+        }
+        else
+        {
+            throw new ArgumentException($"{OtlpDiskRetryDirectoryPathEnvVar} is required when '{retryPolicy}' retry is used");
+        }
+    }
+    else
+    {
+        throw new NotSupportedException($"Retry Policy '{retryPolicy}' is not supported.");
+    }
 }
}
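
The branching in this constructor can be exercised in isolation with a stand-alone sketch. `RetryPolicyParser` is a hypothetical name, and a plain dictionary stands in for `IConfiguration` and its `TryGetStringValue` extension. The sketch mirrors the revision shown here, in which `disk` without a directory is an error.

```csharp
#nullable enable
using System;
using System.Collections.Generic;

// Hypothetical stand-in for the option parsing shown in the diff.
public static class RetryPolicyParser
{
    public const string RetryKey = "OTEL_DOTNET_EXPERIMENTAL_OTLP_RETRY";
    public const string PathKey = "OTEL_DOTNET_EXPERIMENTAL_OTLP_DISK_RETRY_DIRECTORY_PATH";

    public static (bool InMemory, bool Disk, string? DiskPath) Parse(
        IReadOnlyDictionary<string, string> config)
    {
        // Variable not set: neither retry mode is enabled.
        if (!config.TryGetValue(RetryKey, out var policy))
        {
            return (false, false, null);
        }

        if (policy.Equals("in_memory", StringComparison.OrdinalIgnoreCase))
        {
            return (true, false, null);
        }

        if (policy.Equals("disk", StringComparison.OrdinalIgnoreCase))
        {
            if (config.TryGetValue(PathKey, out var path))
            {
                return (false, true, path);
            }

            throw new ArgumentException($"{PathKey} is required when '{policy}' retry is used");
        }

        // Any other value is rejected rather than silently ignored.
        throw new NotSupportedException($"Retry Policy '{policy}' is not supported.");
    }
}
```

Note that an unrecognized value throws instead of disabling retry, so a typo in the variable surfaces when the options are constructed.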

@@ -48,4 +69,14 @@ public ExperimentalOptions(IConfiguration configuration)
/// href="https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/protocol/exporter.md#retry"/>.
/// </remarks>
public bool EnableInMemoryRetry { get; }

/// <summary>
/// Gets a value indicating whether or not retry via disk should be enabled for transient errors.
/// </summary>
public bool EnableDiskRetry { get; }

/// <summary>
/// Gets the path on disk where the telemetry will be stored for retries at a later point.
/// </summary>
public string? DiskRetryDirectoryPath { get; }
}
@@ -11,6 +11,7 @@
#if NETSTANDARD2_1 || NET6_0_OR_GREATER
using Grpc.Net.Client;
#endif
using Google.Protobuf;
using OpenTelemetry.Exporter.OpenTelemetryProtocol.Implementation.Transmission;
using LogOtlpCollector = OpenTelemetry.Proto.Collector.Logs.V1;
using MetricsOtlpCollector = OpenTelemetry.Proto.Collector.Metrics.V1;
@@ -100,9 +101,27 @@ public static THeaders GetHeaders<THeaders>(this OtlpExporterOptions options, Ac
? httpTraceExportClient.HttpClient.Timeout.TotalMilliseconds
: options.TimeoutMilliseconds;

-return experimentalOptions.EnableInMemoryRetry
-    ? new OtlpExporterRetryTransmissionHandler<TraceOtlpCollector.ExportTraceServiceRequest>(exportClient, timeoutMilliseconds)
-    : new OtlpExporterTransmissionHandler<TraceOtlpCollector.ExportTraceServiceRequest>(exportClient, timeoutMilliseconds);
+if (experimentalOptions.EnableInMemoryRetry)
+{
+    return new OtlpExporterRetryTransmissionHandler<TraceOtlpCollector.ExportTraceServiceRequest>(exportClient, timeoutMilliseconds);
+}
+else if (experimentalOptions.EnableDiskRetry)
+{
+    return new OtlpExporterPersistentStorageTransmissionHandler<TraceOtlpCollector.ExportTraceServiceRequest>(
+        exportClient,
+        timeoutMilliseconds,
+        (byte[] data) =>
+        {
+            var request = new TraceOtlpCollector.ExportTraceServiceRequest();
+            request.MergeFrom(data);
+            return request;
+        },
+        Path.Combine(experimentalOptions.DiskRetryDirectoryPath, "traces"));
+}
+else
+{
+    return new OtlpExporterTransmissionHandler<TraceOtlpCollector.ExportTraceServiceRequest>(exportClient, timeoutMilliseconds);
+}
}
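
The handler above receives a `byte[]`-to-request callback so that stored payloads can be turned back into requests before being resent. The sketch below shows that general idea with plain files. It is a conceptual illustration under stated assumptions, not the exporter's `OtlpExporterPersistentStorageTransmissionHandler`: `FileBackedRetry`, its members, the `.blob` extension, and the GUID file names are all hypothetical.

```csharp
#nullable enable
using System;
using System.IO;

// Conceptual sketch only; names and file layout are hypothetical.
public sealed class FileBackedRetry<TRequest>
{
    private readonly Func<TRequest, bool> send;          // true when the export succeeded
    private readonly Func<TRequest, byte[]> serialize;   // request -> bytes stored on disk
    private readonly Func<byte[], TRequest> deserialize; // bytes read from disk -> request
    private readonly string directory;

    public FileBackedRetry(
        Func<TRequest, bool> send,
        Func<TRequest, byte[]> serialize,
        Func<byte[], TRequest> deserialize,
        string directory)
    {
        this.send = send;
        this.serialize = serialize;
        this.deserialize = deserialize;
        this.directory = directory;
        Directory.CreateDirectory(directory);
    }

    // Tries to send; on failure, stores the serialized request for a later attempt.
    public bool TrySubmit(TRequest request)
    {
        if (this.send(request))
        {
            return true;
        }

        var file = Path.Combine(this.directory, Guid.NewGuid().ToString("N") + ".blob");
        File.WriteAllBytes(file, this.serialize(request));
        return false;
    }

    // Resends stored requests and deletes each file once its request is accepted.
    public int RetryStored()
    {
        var resent = 0;
        foreach (var file in Directory.GetFiles(this.directory, "*.blob"))
        {
            var request = this.deserialize(File.ReadAllBytes(file));
            if (this.send(request))
            {
                File.Delete(file);
                resent++;
            }
        }

        return resent;
    }
}
```

In the diff, the deserializer role is played by the lambda that creates an empty request and calls `MergeFrom(data)` on it. Retention, cleanup, and concurrency, the questions raised later in the review, are deliberately left out of this sketch.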

public static OtlpExporterTransmissionHandler<MetricsOtlpCollector.ExportMetricsServiceRequest> GetMetricsExportTransmissionHandler(this OtlpExporterOptions options, ExperimentalOptions experimentalOptions)
@@ -116,9 +135,27 @@ public static THeaders GetHeaders<THeaders>(this OtlpExporterOptions options, Ac
? httpMetricsExportClient.HttpClient.Timeout.TotalMilliseconds
: options.TimeoutMilliseconds;

-return experimentalOptions.EnableInMemoryRetry
-    ? new OtlpExporterRetryTransmissionHandler<MetricsOtlpCollector.ExportMetricsServiceRequest>(exportClient, timeoutMilliseconds)
-    : new OtlpExporterTransmissionHandler<MetricsOtlpCollector.ExportMetricsServiceRequest>(exportClient, timeoutMilliseconds);
+if (experimentalOptions.EnableInMemoryRetry)
+{
+    return new OtlpExporterRetryTransmissionHandler<MetricsOtlpCollector.ExportMetricsServiceRequest>(exportClient, timeoutMilliseconds);
+}
+else if (experimentalOptions.EnableDiskRetry)
+{
+    return new OtlpExporterPersistentStorageTransmissionHandler<MetricsOtlpCollector.ExportMetricsServiceRequest>(
+        exportClient,
+        timeoutMilliseconds,
+        (byte[] data) =>
+        {
+            var request = new MetricsOtlpCollector.ExportMetricsServiceRequest();
+            request.MergeFrom(data);
+            return request;
+        },
+        Path.Combine(experimentalOptions.DiskRetryDirectoryPath, "metrics"));
+}
+else
+{
+    return new OtlpExporterTransmissionHandler<MetricsOtlpCollector.ExportMetricsServiceRequest>(exportClient, timeoutMilliseconds);
+}
}

public static OtlpExporterTransmissionHandler<LogOtlpCollector.ExportLogsServiceRequest> GetLogsExportTransmissionHandler(this OtlpExporterOptions options, ExperimentalOptions experimentalOptions)
@@ -128,9 +165,27 @@ public static THeaders GetHeaders<THeaders>(this OtlpExporterOptions options, Ac
? httpLogExportClient.HttpClient.Timeout.TotalMilliseconds
: options.TimeoutMilliseconds;

-return experimentalOptions.EnableInMemoryRetry
-    ? new OtlpExporterRetryTransmissionHandler<LogOtlpCollector.ExportLogsServiceRequest>(exportClient, timeoutMilliseconds)
-    : new OtlpExporterTransmissionHandler<LogOtlpCollector.ExportLogsServiceRequest>(exportClient, timeoutMilliseconds);
+if (experimentalOptions.EnableInMemoryRetry)
+{
+    return new OtlpExporterRetryTransmissionHandler<LogOtlpCollector.ExportLogsServiceRequest>(exportClient, timeoutMilliseconds);
+}
+else if (experimentalOptions.EnableDiskRetry)
+{
+    return new OtlpExporterPersistentStorageTransmissionHandler<LogOtlpCollector.ExportLogsServiceRequest>(
+        exportClient,
+        timeoutMilliseconds,
+        (byte[] data) =>
+        {
+            var request = new LogOtlpCollector.ExportLogsServiceRequest();
+            request.MergeFrom(data);
+            return request;
+        },
+        Path.Combine(experimentalOptions.DiskRetryDirectoryPath, "logs"));
+}
+else
+{
+    return new OtlpExporterTransmissionHandler<LogOtlpCollector.ExportLogsServiceRequest>(exportClient, timeoutMilliseconds);
+}
}
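
Taken together, the three handlers give each signal its own subdirectory under the configured root. A minimal sketch of the resulting layout follows. `SignalDirectories` is a hypothetical name, and only the `traces`, `metrics`, and `logs` segments come from the diff.

```csharp
using System.IO;

// Hypothetical helper showing the per-signal layout implied by the diff.
public static class SignalDirectories
{
    // Each signal gets its own subdirectory under the configured root.
    public static (string Traces, string Metrics, string Logs) For(string root) =>
        (Path.Combine(root, "traces"),
         Path.Combine(root, "metrics"),
         Path.Combine(root, "logs"));
}
```

Keeping the signals apart means a stored payload is always read back by the handler whose deserializer matches its request type.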

public static IExportClient<TraceOtlpCollector.ExportTraceServiceRequest> GetTraceExportClient(this OtlpExporterOptions options) =>
5 changes: 5 additions & 0 deletions src/OpenTelemetry.Exporter.OpenTelemetryProtocol/README.md
@@ -632,6 +632,11 @@ want to solicit feedback from the community.

Added in `1.8.0`.

When set to `disk` along with setting
Member

As a user, I think I would have a lot of questions about this feature. What are the file names? What is the file structure? What is the retention policy? Are there multiple files? What do these files mean? How are they managed? How are they cleaned up? Stuff like that. I think we may need more docs. Doesn't have to be on this PR.

Member

+1, in addition, who has access to these files, any security/privacy concern, etc.

Member Author (@vishweshbankwar, May 7, 2024)

Agree we need to explain these.

I think it would be better to improve the doc here to cover these details: https://github.com/open-telemetry/opentelemetry-dotnet-contrib/tree/main/src/OpenTelemetry.PersistentStorage.FileSystem. We could then link from here.

Contributor

Code for persistent storage is vendored in this exporter, and it does not use a persistent storage package reference. It’s appropriate to document this here.

Member Author

We have a process defined: https://github.com/open-telemetry/opentelemetry-dotnet/blob/main/src/OpenTelemetry.Exporter.OpenTelemetryProtocol/PersistentStorage/README.md#persistent-storage-apis-for-otlp-exporter. Currently we are not forking Readmes, but I think we should do that since we do not make any customizations to the forked files.

`OTEL_DOTNET_EXPERIMENTAL_OTLP_DISK_RETRY_DIRECTORY_PATH` to the path on
disk, it enables retries by storing telemetry on disk during transient
errors. TODO: Add package version.

* Logs

* `OTEL_DOTNET_EXPERIMENTAL_OTLP_EMIT_EVENT_LOG_ATTRIBUTES`